#scaling
7 entries in total

Lectures (2)
  L07: Pretraining
  L19: Open Questions in NLP

Papers (5)
  Language Models are Few-Shot Learners
  Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Model Parameters
  Language Models are Unsupervised Multitask Learners
  The Llama 3 Herd of Models
  Scaling Instruction-Finetuned Language Models