Granite 4.1 LLMs: How They’re Built
https://huggingface.co/blog/ibm-granite/granite-4-1

IBM's Granite 4.1 is a family of dense, decoder-only LLMs available in 3B, 8B, and 30B parameter sizes. The models undergo a five-phase pre-training process on approximately 15 trillion tokens, progressively shifting the data mixture from general web content toward high-quality math, code, and instruction data. Pre-training is followed by supervised fine-tuning and a multi-stage reinforcement learning pipeline to enhance performance. The training process also includes a long-context extension phase, enabling the models to handle context windows of up to 512K tokens.
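The phased curriculum described above can be pictured as a token-budget schedule whose mixture shifts phase by phase. The sketch below is purely illustrative: the phase names, budget fractions, and mixture weights are assumptions for demonstration, not numbers from the blog post; only the ~15T total and the five-phase structure come from the summary.

```python
# Hypothetical sketch of a five-phase pre-training schedule.
# Phase names, token splits, and mixture weights are illustrative
# assumptions, NOT values from the Granite 4.1 blog post.

TOTAL_TOKENS = 15e12  # ~15 trillion tokens across all phases (from the summary)

# Each phase gets a fraction of the budget and a data mixture that
# shifts from general web text toward math/code/instruction data.
PHASES = [
    {"name": "phase1_general_web", "fraction": 0.40,
     "mixture": {"web": 0.90, "math_code": 0.05, "instruction": 0.05}},
    {"name": "phase2_broad_mix",   "fraction": 0.25,
     "mixture": {"web": 0.70, "math_code": 0.20, "instruction": 0.10}},
    {"name": "phase3_quality_up",  "fraction": 0.15,
     "mixture": {"web": 0.50, "math_code": 0.35, "instruction": 0.15}},
    {"name": "phase4_math_code",   "fraction": 0.12,
     "mixture": {"web": 0.30, "math_code": 0.50, "instruction": 0.20}},
    {"name": "phase5_instruction", "fraction": 0.08,
     "mixture": {"web": 0.15, "math_code": 0.45, "instruction": 0.40}},
]

def tokens_per_phase(total, phases):
    """Return {phase_name: token_budget} for the schedule."""
    # Sanity-check that the fractions exhaust the total budget.
    assert abs(sum(p["fraction"] for p in phases) - 1.0) < 1e-9
    return {p["name"]: total * p["fraction"] for p in phases}

budgets = tokens_per_phase(TOTAL_TOKENS, PHASES)
for name, toks in budgets.items():
    print(f"{name}: {toks / 1e12:.1f}T tokens")
```

The point of the structure is that "progressively refining the mixture" is just a per-phase reweighting of the same data sources, with the high-quality math/code/instruction share growing as the web share shrinks.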
0 points•by ogg•1 hour ago