#Sparse Architecture
1 entry in total
Papers (1)
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models