
T-Free: Hierarchical Autoregressive Transformers for Language Fairness and Sovereignty
In this blog post, we want to take a closer look at a tokenizer-free approach, which we proposed in a recent paper and termed Hierarchical Autoregressive Transformers (HAT). In particular, we want to showcase how such a model can be pre-trained on English data and then efficiently adapted to a new, previously unseen language.