
T-Free: Hierarchical Autoregressive Transformers for Language Fairness and Sovereignty
In this blog post, we want to take a closer look at a tokenizer-free approach, which we proposed in a recent paper and termed Hierarchical Autoregressive Transformers (HAT). In particular, we want to showcase how such a model can be pre-trained on English data and then efficiently adapted to a new, previously unseen language.