Google rolls out Gemini Embedding 2 for multimodal AI applications

Google has released Gemini Embedding 2, a multimodal embedding model built on the Gemini architecture.

The model expands beyond earlier text-only embedding systems by mapping text, images, videos, audio, and documents into a single unified embedding space. It captures semantic meaning across more than 100 languages and supports AI tasks such as Retrieval-Augmented Generation (RAG), semantic search, sentiment analysis, and data clustering.

Gemini Embedding 2

Gemini Embedding 2 uses the multimodal capabilities of the Gemini architecture to generate embeddings from different types of data.

The model supports interleaved multimodal inputs, allowing developers to combine inputs such as text and images in a single request. This enables the system to capture relationships between different media types and process datasets that contain multiple formats.

Key features

Multimodal input support

  • Text: Supports up to 8,192 input tokens
  • Images: Processes up to six images per request, supporting PNG and JPEG formats
  • Videos: Supports video input of up to 120 seconds in MP4 and MOV formats
  • Audio: Directly processes audio without requiring transcription
  • Documents: Supports embedding PDF files up to six pages
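The per-modality limits above can be expressed as a small pre-flight check before sending a request. This is an illustrative sketch, not part of any official SDK; the limit values come from the list above, while the function name and argument shape are assumptions.

```python
# Illustrative validator for the per-modality limits listed above.
# The numeric limits come from the article; the helper itself is a
# sketch, not part of any official Gemini SDK.

LIMITS = {
    "text_tokens": 8192,    # max input tokens per text part
    "images": 6,            # max images per request (PNG/JPEG)
    "video_seconds": 120,   # max video length (MP4/MOV)
    "pdf_pages": 6,         # max pages per embedded PDF
}

def check_request(text_tokens=0, images=0, video_seconds=0, pdf_pages=0):
    """Return a list of limit violations for a planned embedding request."""
    errors = []
    if text_tokens > LIMITS["text_tokens"]:
        errors.append(f"text exceeds {LIMITS['text_tokens']} tokens")
    if images > LIMITS["images"]:
        errors.append(f"more than {LIMITS['images']} images")
    if video_seconds > LIMITS["video_seconds"]:
        errors.append(f"video longer than {LIMITS['video_seconds']} seconds")
    if pdf_pages > LIMITS["pdf_pages"]:
        errors.append(f"PDF longer than {LIMITS['pdf_pages']} pages")
    return errors

print(check_request(text_tokens=4000, images=2))   # []
print(check_request(images=8, video_seconds=300))  # two violations
```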

Interleaved multimodal inputs

The model can process multiple media types within a single request, enabling contextual understanding between inputs such as image and text.
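As a sketch of what an interleaved request might look like, the helper below packs text and image parts into one payload. The payload shape and the `gemini-embedding-2` model ID are assumptions for illustration only, not a confirmed API schema.

```python
import base64

# Sketch of a single interleaved multimodal request. The payload shape
# and model ID are assumptions for illustration, not a confirmed API.

def build_interleaved_request(model, parts):
    """Pack interleaved text and binary parts into one request payload."""
    contents = []
    for part in parts:
        if isinstance(part, str):
            contents.append({"text": part})
        else:
            mime_type, raw_bytes = part
            contents.append({
                "inline_data": {
                    "mime_type": mime_type,
                    "data": base64.b64encode(raw_bytes).decode("ascii"),
                }
            })
    return {"model": model, "contents": contents}

request = build_interleaved_request(
    "gemini-embedding-2",  # assumed model ID
    [
        "Caption describing the photo:",
        ("image/png", b"\x89PNG..."),  # placeholder image bytes
    ],
)
print(len(request["contents"]))  # 2
```

Keeping the text and image in one request, rather than embedding them separately, is what lets the model capture the relationship between the caption and the picture it describes.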

Matryoshka Representation Learning (MRL)

Gemini Embedding 2 incorporates Matryoshka Representation Learning, which allows embedding vectors to scale across different dimensions. The default dimension is 3,072, and developers can reduce the size to manage storage and performance requirements.

Recommended output dimensions:

  • 3,072
  • 1,536
  • 768
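Because MRL-trained embeddings concentrate information in the leading coordinates, a smaller vector can be obtained by simply truncating the full 3,072-dimension output and re-normalizing. A minimal sketch (the embedding values here are synthetic stand-ins):

```python
import math

def truncate_embedding(vec, dim):
    """Truncate an MRL embedding to its first `dim` coordinates and
    re-normalize to unit length so cosine similarity still works."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full = [0.01] * 3072            # stand-in for a real 3,072-dim embedding
small = truncate_embedding(full, 768)
print(len(small))               # 768
print(round(sum(x * x for x in small), 6))  # 1.0 (unit length)
```

Dropping from 3,072 to 768 dimensions cuts vector storage by 4x, which is the storage/performance trade-off the recommended sizes are meant to expose.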

Model capabilities

According to Google, the model extends embedding support beyond text to image, video, and speech tasks, and processes audio natively rather than relying on transcription.

Supported use cases

  • Retrieval-Augmented Generation (RAG)
  • Semantic search
  • Sentiment analysis
  • Data clustering
  • Large-scale data management
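Use cases like semantic search and RAG retrieval share one mechanism: documents and queries are embedded into the same space and ranked by cosine similarity. A toy sketch with hand-made 3-dimensional vectors standing in for real model-generated embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy "index": in practice these would be model-generated embeddings.
index = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "privacy notice": [0.0, 0.2, 0.9],
}

query = [0.8, 0.2, 0.1]  # stand-in for an embedded user question
ranked = sorted(index, key=lambda doc: cosine(query, index[doc]), reverse=True)
print(ranked[0])  # refund policy
```

In a RAG pipeline the top-ranked documents would then be passed to a generative model as grounding context; clustering uses the same distances to group nearby vectors instead of ranking them.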

Availability

Gemini Embedding 2 is available in Public Preview through the Gemini API and Vertex AI. Developers can access the model through integrations with frameworks and vector database tools including:

  • LangChain
  • LlamaIndex
  • Haystack
  • Weaviate
  • Qdrant
  • ChromaDB

The model can also be used with vector search systems for multimodal data processing.

