0

Falcon Perception

https://huggingface.co/blog/tiiuae/falcon-perception(huggingface.co)
Falcon Perception is a 0.6B-parameter early-fusion Transformer model for open-vocabulary grounding and segmentation from text prompts. It processes image patches and text in a unified sequence using a hybrid attention mask, avoiding traditional pipeline architectures. The model uses a "Chain-of-Perception" technique to generate instance outputs by predicting coordinates, size, and finally a segmentation mask. The release also includes Falcon OCR, a specialized model for document understanding, and PBench, a new benchmark for evaluating complex perception capabilities.
0 pointsby ogg1 day ago

Comments (0)

No comments yet. Be the first to comment!

Want to join the discussion?