Allocating large vocabulary capacity for cross-lingual language model pre-training
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Zheng, Bo and Dong, Li and Huang, Shaohan and Singhal, Saksham and Che, Wanxiang and Liu, Ting and Song, Xia and Wei, Furu