Announcing Pixtral 12B

https://mistral.ai/news/pixtral-12b (mistral.ai)

Mistral AI announced Pixtral 12B, its first natively multimodal model, trained on interleaved image and text data. The model pairs a 12B-parameter decoder with a new 400M-parameter vision encoder and is released under the Apache 2.0 license. It performs strongly on multimodal reasoning benchmarks, surpassing many larger models, while maintaining high performance on text-only tasks. The architecture accepts images of variable size and multiple images within a 128k-token context window, and the model excels at chart analysis, document Q&A, and instruction following.
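
To make the "variable image sizes within a 128k token context window" point concrete, here is a rough budgeting sketch. It is not Mistral's tokenizer: the 16-pixel patch size, the one-separator-token-per-patch-row rule, and the function names are assumptions made for illustration, and "128k" is taken as 128,000 tokens.

```python
import math

CONTEXT_TOKENS = 128_000  # "128k" from the summary, read here as 128,000 (assumption)
PATCH_SIZE = 16           # assumed patch edge in pixels; not stated in the summary


def image_tokens(width: int, height: int, patch: int = PATCH_SIZE) -> int:
    """Estimate the tokens one image occupies at its native resolution.

    Assumes one token per patch plus one separator token per patch row.
    """
    cols = math.ceil(width / patch)
    rows = math.ceil(height / patch)
    return rows * cols + rows


def fits_in_context(images, text_tokens=0, context=CONTEXT_TOKENS):
    """Return (estimated total tokens, whether that total fits in the context)."""
    total = text_tokens + sum(image_tokens(w, h) for w, h in images)
    return total, total <= context


if __name__ == "__main__":
    # Two images of different sizes plus a 1,000-token text prompt.
    print(fits_in_context([(1024, 1024), (512, 256)], text_tokens=1000))
```

Under these assumptions the cost of an image scales with its pixel area, so a prompt can trade a few large images for many small ones within the same budget.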

0 points • by ogg • 4 months ago
