Qwen-Image: Crafting with Native Text Rendering
https://simonwillison.net/2025/Aug/4/qwen-image/#atom-everything (simonwillison.net)
Qwen has released its first image generation model, Qwen-Image, a 20-billion-parameter Multimodal Diffusion Transformer under an Apache 2.0 license. Training heavily emphasized native text rendering, using synthesized data and programmatic editing of templates. To build the training data, the team used their Qwen2.5-VL vision LLM to generate comprehensive image descriptions and structured metadata. A text-to-image version is available now, with an image editing model planned for a future release.
0 points•by ogg•2 months ago