OpenAI rolls out ChatGPT Images 2.0 with improved text rendering, multilingual support, and thinking capabilities

OpenAI has introduced ChatGPT Images 2.0, a next-generation image generation model designed to produce precise, structured, and usable visual outputs. The model is built to handle complex visual tasks, improve instruction accuracy, and generate images that better reflect real-world design needs.

ChatGPT Images 2.0

ChatGPT Images 2.0 is designed to improve how AI interprets and executes image prompts, focusing on accuracy, structure, and visual consistency. It can place objects more precisely, follow detailed instructions, and maintain coherent layouts across complex scenes.

The model is available across ChatGPT, Codex, and the API, and introduces major upgrades in composition control, text rendering, multilingual support, and visual consistency.

A key addition is a thinking capability: the model can reason through a prompt before generating an image. In supported modes, it can also draw on web information, generate multiple images from a single request, and refine results for better consistency.

Together, these additions shift the system from basic image generation toward a more structured visual reasoning tool.

Key Features

Instruction following improvements: Handles complex prompts with higher accuracy, preserving structured requirements and fine details.
Enhanced composition control: Improves placement of objects, UI elements, and design components across layouts with better spatial consistency.

Improved text rendering: Generates clearer and more accurate text inside images, including small fonts, labels, and dense typography.
Stronger multilingual support: Improves rendering of non-English languages, including Japanese, Korean, Chinese, Hindi, and Bengali, with better readability and structure.
Improved style realism: Produces more consistent outputs across photorealism, cinematic visuals, manga, pixel art, and illustration styles.
Flexible aspect ratio support: Supports wide and tall formats suitable for banners, posters, mobile layouts, and social content.

Thinking capabilities: Allows the model to reason before generating images, use web search for context, and refine outputs for better consistency.
Multi-image generation: Generates multiple related images from a single prompt while maintaining continuity across outputs.
Design-oriented outputs: Optimized for real-world use cases such as UI mockups, marketing visuals, educational diagrams, and product concepts.

Safety Overview

ChatGPT Images 2.0 uses a multi-layer safety system to manage image generation responsibly.

Prompt-level filters block unsafe requests before generation
Output-level checks review images before display
A safety reasoning model monitors both inputs and outputs
Continuous evaluation improves detection and enforcement

These safeguards are designed to reduce harmful or policy-violating outputs across all stages of generation.
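
The layering described above can be pictured as a simple pipeline: check the prompt, generate, then check the result. The sketch below is purely illustrative and is not OpenAI's implementation. Every name in it (BLOCKED_TERMS, prompt_filter, output_check, generate_safely) is hypothetical, and the keyword check and empty-image check merely stand in for real classifiers.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical blocklist; a production system would rely on trained classifiers.
BLOCKED_TERMS = {"forbidden-topic"}


@dataclass
class Result:
    image: Optional[bytes]      # generated image bytes, or None if blocked
    blocked_at: Optional[str]   # "prompt", "output", or None if nothing was blocked


def prompt_filter(prompt: str) -> bool:
    """Return True when the prompt passes the (placeholder) prompt-level check."""
    lowered = prompt.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def output_check(image: bytes) -> bool:
    """Return True when the image passes the (placeholder) output-level check."""
    return len(image) > 0


def generate_safely(prompt: str, generate: Callable[[str], bytes]) -> Result:
    """Run the prompt check, then generation, then the output check, in that order."""
    if not prompt_filter(prompt):
        return Result(image=None, blocked_at="prompt")
    image = generate(prompt)
    if not output_check(image):
        return Result(image=None, blocked_at="output")
    return Result(image=image, blocked_at=None)
```

The point of the ordering is that a rejected prompt never reaches the generator, while the output check still catches problems that only appear in the rendered image.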

Limitations

Despite improvements, the model still has limitations:

Reduced accuracy in complex physical or step-based logic scenes
Inconsistencies in highly dense or repetitive visual patterns
Occasional errors in technical labels, arrows, or structured diagrams
Challenges with hidden, angled, or reversed visual details
Multilingual or diagram-heavy outputs that may still need manual review

Pricing and Availability

Available in ChatGPT, Codex, and the API, with basic image generation open to all users

Advanced thinking capabilities available for Plus, Pro, and Business users
API access provided through gpt-image-2, with pricing based on resolution and quality
Supports outputs up to 2K resolution through the API
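
For readers who want to try the API, here is a minimal sketch of what a request might look like. Only the model name gpt-image-2 and the fact that pricing depends on resolution and quality come from the announcement. The size and quality parameters, their example values, and the base64 response field are assumptions carried over from the OpenAI Python SDK's existing images.generate method, and may not match what gpt-image-2 accepts. The helper names build_image_request and generate_image are hypothetical.

```python
import base64


def build_image_request(prompt: str, size: str = "1024x1024", quality: str = "high") -> dict:
    """Assemble keyword arguments for an image generation call.

    The size and quality values are assumptions; check the API reference
    for the options gpt-image-2 actually supports.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {"model": "gpt-image-2", "prompt": prompt, "size": size, "quality": quality}


def generate_image(prompt: str, **options: str) -> bytes:
    """Send the request and return the decoded image bytes."""
    # Imported lazily so build_image_request works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads the OPENAI_API_KEY environment variable
    response = client.images.generate(**build_image_request(prompt, **options))
    # Assumption: the image is returned base64-encoded, as with earlier image models.
    return base64.b64decode(response.data[0].b64_json)
```

Calling generate_image("A bilingual event poster", size="1024x1536") and writing the returned bytes to a file would save the result. Since resolution and quality drive the price, keeping them as explicit arguments makes the cost of each call easy to see.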