Introducing Jamba: AI21’s Groundbreaking SSM-Transformer Model

https://www.ai21.com/blog/announcing-jamba/
AI21 Labs has introduced Jamba, a production-grade model built on a novel hybrid architecture that combines the Mamba structured state space model (SSM) with elements of the traditional Transformer. The design aims to overcome the limitations of pure SSM and pure Transformer models, yielding significant gains in throughput and efficiency. Jamba offers a 256K-token context window, delivers three times the throughput of comparable models such as Mixtral 8x7B on long contexts, and fits up to 140K tokens of context on a single GPU. The weights have been released under the Apache 2.0 license to encourage community development and optimization.
0 points by chrisf 2 months ago
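Because the weights are open, the model can be tried locally. Below is a minimal sketch of loading Jamba through the Hugging Face transformers library; the repo id ai21labs/Jamba-v0.1, the trust_remote_code flag, and the bfloat16/device_map settings are assumptions for illustration, not details confirmed by the announcement.

```python
# Minimal sketch: loading an open-weights Jamba checkpoint with Hugging Face
# transformers. The repo id, dtype, and flags below are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision, helps fit long contexts on one GPU
    device_map="auto",           # spread layers across available devices
    trust_remote_code=True,      # hybrid SSM-Transformer blocks may ship custom code
)

prompt = "Jamba is a hybrid SSM-Transformer model that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Running in bfloat16 roughly halves memory relative to full precision, which is the kind of saving that matters for the single-GPU long-context figure the announcement cites.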
