Publications

TxT360: A top-quality LLM pre-training dataset requires the perfect blend