
Mixture of Experts
/ˈmɪkstʃər əv ˈekspɜːrts/
a model architecture that combines multiple specialized sub-models (experts) with a learned router (gating network) that sends each token to a small subset of them
“Mixture of Experts (MoE) scales model capacity without a proportional increase in per-token inference compute.”
Origin: Machine Learning term (Jacobs et al., 1991)
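
Example: the routing step in the definition can be illustrated with a minimal sketch of top-k gating. This is one common variant, not the only way to build an MoE layer; the names `route_token` and `moe_output`, the toy experts, and the hard-coded router scores are all hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their weights.

    router_scores: one raw score per expert (assumed router output).
    Returns (expert_index, weight) pairs whose weights sum to 1.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_output(x, experts, router_scores, k=2):
    """Weighted sum of the selected experts' outputs; the others never run."""
    return sum(w * experts[i](x) for i, w in route_token(router_scores, k))

# Four toy "experts": each just scales its input (illustrative only).
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
scores = [0.1, 2.0, -1.0, 2.0]   # made-up router scores for one token
chosen = route_token(scores, k=2)
y = moe_output(1.0, experts, scores, k=2)
```

Only the k selected experts are evaluated for the token, which is why adding experts grows the parameter count faster than it grows the per-token compute.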
