This 7B-parameter language model, trained on the SlimPajama and StarCoder datasets, eclipses the Llama 2 frontier while skillfully balancing language and coding capabilities. Its instruction-following variant, CrystalChat, stands out as a top-scoring 7B chat model, trained on a carefully selected mix of publicly available language and code datasets.
The Research Suite is a comprehensive set of large language model (LLM) artifacts from each of our models, enabling academic and industry researchers to explore LLM training dynamics.
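As a minimal sketch of how these artifacts can be used, the snippet below loads an intermediate training checkpoint of one of our models directly from Hugging Face with the transformers library. The repository name LLM360/Amber is one of our released models; the revision tag "ckpt_100" is illustrative only, and the actual checkpoint revision names should be taken from the model card.

```python
# Minimal sketch: loading an intermediate training checkpoint from the
# Research Suite via Hugging Face. The revision tag "ckpt_100" is
# illustrative; consult the model card for the real checkpoint names.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "LLM360/Amber"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, revision="ckpt_100")

# Loading snapshots from several training steps this way lets a researcher
# compare weights or evaluation scores across training to study how model
# capabilities emerge over time.
```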
The Pretraining Suite is a series of step-by-step guides to reproduce each of our models, helping tech enthusiasts, AI practitioners, and academic or industry researchers transfer knowledge of LLM pretraining techniques.
The Developer Suite is a series of fine-tuning and inference tutorials for tech enthusiasts, AI practitioners, and academic or industry researchers interested in general model usage or downstream task evaluation and research.
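For a taste of what the Developer Suite tutorials cover, here is a minimal inference sketch using the Hugging Face transformers API with CrystalChat. The generation settings are illustrative defaults rather than tuned recommendations, and the plain-text prompt is kept simple for brevity; the chat template documented on the model card should be applied for best results.

```python
# Minimal inference sketch with CrystalChat. Settings are illustrative,
# not tuned recommendations; see the Developer Suite tutorials for detail.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "LLM360/CrystalChat"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,   # half precision to fit on a single GPU
    device_map="auto",
    trust_remote_code=True,      # custom architecture requires remote code
)

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```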
In this paper, we present LLM360 K2-65B, the most powerful fully transparent open-source large language model (LLM) released to date. K2 is a 65-billion-parameter LLM that follows the LLM360 project's best practices for reproducibility. Despite numerous efforts to develop and release open-source LLMs, full transparency around the training process remains limited...
The recent surge in open-source Large Language Models (LLMs), such as LLaMA, Falcon, and Mistral, provides diverse options for AI practitioners and researchers. However, most releases include only partial artifacts, such as the final model weights or inference code, and technical reports increasingly limit their scope to high-level design choices and surface statistics...
LLM360 is excited to announce several new releases furthering our mission of enabling community-owned AGI by creating standards and tools that advance the bleeding edge of LLM capability and empower knowledge transfer, research, and development.
In recent months, the open-source large language model (LLM) community has seen tremendous model contributions. However, model weight releases and overview technical reports do not contain enough information to capture the complexity of LLM training, which hinders the openness and transparency that have underpinned trustworthy and innovative research and science for decades.
The LLM360 team is here to solve the most challenging AI problems. Reach out if you'd like to discuss.