Research

In awe at the scale of these tensors – a gentle introduction to Unit-Scaled Maximal Update Parametrization

Together with Graphcore, we recently developed u-μP, a new paradigm for parametrizing neural networks with respect to width and depth. Our approach combines μP, developed by G. Yang et al., with Unit Scaling, a concept introduced by Graphcore.