Model comparison
- We can read about the available models and their capabilities on the model overview page
- The table below was extracted from a Kaggle notebook that is part of the Google 5-Day Gen AI Intensive Course. As of writing this note (23-May-2025), the listing contained 56 models; a short script to regenerate the table is shown after it.
- Some of the models support an `input_token_limit` of 2,000,000 and an `output_token_limit` of 65,536.
- The different supported actions are `createTunedTextModel`, `generateContent`, `generateMessage`, `predict`, `createTunedModel`, `countTokens`, `createCachedContent`, `bidiGenerateContent`, `embedText`, `embedContent`, `countTextTokens`, `countMessageTokens`, and `generateAnswer` (a lookup sketch follows this list).
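A minimal sketch of looking up a single model's token limits and supported actions, assuming the `google-genai` Python SDK and an API key in a `GOOGLE_API_KEY` environment variable (the variable name is an assumption):

```python
import os
from google import genai

# Assumes the google-genai SDK; the API key env var name is an assumption.
client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])

# Fetch one model and print the same fields shown in the table below.
model = client.models.get(model="models/gemini-1.5-pro")
print(model.display_name)
print(model.input_token_limit)    # e.g. 2000000
print(model.output_token_limit)   # e.g. 8192
print(model.supported_actions)    # e.g. ['generateContent', 'countTokens']
```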
name | display_name | description | version | input_token_limit | output_token_limit | supported_actions |
---|---|---|---|---|---|---|
models/chat-bison-001 | PaLM 2 Chat (Legacy) | A legacy text-only model optimized for chat conversations | 001 | 4096 | 1024 | [‘generateMessage’, ‘countMessageTokens’] |
models/text-bison-001 | PaLM 2 (Legacy) | A legacy model that understands text and generates text as an output | 001 | 8196 | 1024 | [‘generateText’, ‘countTextTokens’, ‘createTunedTextModel’] |
models/embedding-gecko-001 | Embedding Gecko | Obtain a distributed representation of a text. | 001 | 1024 | 1 | [‘embedText’, ‘countTextTokens’] |
models/gemini-1.0-pro-vision-latest | Gemini 1.0 Pro Vision | The original Gemini 1.0 Pro Vision model version which was optimized for image understanding. Gemini 1.0 Pro Vision was deprecated on July 12, 2024. Move to a newer Gemini version. | 001 | 12288 | 4096 | [‘generateContent’, ‘countTokens’] |
models/gemini-pro-vision | Gemini 1.0 Pro Vision | The original Gemini 1.0 Pro Vision model version which was optimized for image understanding. Gemini 1.0 Pro Vision was deprecated on July 12, 2024. Move to a newer Gemini version. | 001 | 12288 | 4096 | [‘generateContent’, ‘countTokens’] |
models/gemini-1.5-pro-latest | Gemini 1.5 Pro Latest | Alias that points to the most recent production (non-experimental) release of Gemini 1.5 Pro, our mid-size multimodal model that supports up to 2 million tokens. | 001 | 2000000 | 8192 | [‘generateContent’, ‘countTokens’] |
models/gemini-1.5-pro-001 | Gemini 1.5 Pro 001 | Stable version of Gemini 1.5 Pro, our mid-size multimodal model that supports up to 2 million tokens, released in May of 2024. | 001 | 2000000 | 8192 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-1.5-pro-002 | Gemini 1.5 Pro 002 | Stable version of Gemini 1.5 Pro, our mid-size multimodal model that supports up to 2 million tokens, released in September of 2024. | 002 | 2000000 | 8192 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-1.5-pro | Gemini 1.5 Pro | Stable version of Gemini 1.5 Pro, our mid-size multimodal model that supports up to 2 million tokens, released in May of 2024. | 001 | 2000000 | 8192 | [‘generateContent’, ‘countTokens’] |
models/gemini-1.5-flash-latest | Gemini 1.5 Flash Latest | Alias that points to the most recent production (non-experimental) release of Gemini 1.5 Flash, our fast and versatile multimodal model for scaling across diverse tasks. | 001 | 1000000 | 8192 | [‘generateContent’, ‘countTokens’] |
models/gemini-1.5-flash-001 | Gemini 1.5 Flash 001 | Stable version of Gemini 1.5 Flash, our fast and versatile multimodal model for scaling across diverse tasks, released in May of 2024. | 001 | 1000000 | 8192 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-1.5-flash-001-tuning | Gemini 1.5 Flash 001 Tuning | Version of Gemini 1.5 Flash that supports tuning, our fast and versatile multimodal model for scaling across diverse tasks, released in May of 2024. | 001 | 16384 | 8192 | [‘generateContent’, ‘countTokens’, ‘createTunedModel’] |
models/gemini-1.5-flash | Gemini 1.5 Flash | Alias that points to the most recent stable version of Gemini 1.5 Flash, our fast and versatile multimodal model for scaling across diverse tasks. | 001 | 1000000 | 8192 | [‘generateContent’, ‘countTokens’] |
models/gemini-1.5-flash-002 | Gemini 1.5 Flash 002 | Stable version of Gemini 1.5 Flash, our fast and versatile multimodal model for scaling across diverse tasks, released in September of 2024. | 002 | 1000000 | 8192 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-1.5-flash-8b | Gemini 1.5 Flash-8B | Stable version of Gemini 1.5 Flash-8B, our smallest and most cost effective Flash model, released in October of 2024. | 001 | 1000000 | 8192 | [‘createCachedContent’, ‘generateContent’, ‘countTokens’] |
models/gemini-1.5-flash-8b-001 | Gemini 1.5 Flash-8B 001 | Stable version of Gemini 1.5 Flash-8B, our smallest and most cost effective Flash model, released in October of 2024. | 001 | 1000000 | 8192 | [‘createCachedContent’, ‘generateContent’, ‘countTokens’] |
models/gemini-1.5-flash-8b-latest | Gemini 1.5 Flash-8B Latest | Alias that points to the most recent production (non-experimental) release of Gemini 1.5 Flash-8B, our smallest and most cost effective Flash model, released in October of 2024. | 001 | 1000000 | 8192 | [‘createCachedContent’, ‘generateContent’, ‘countTokens’] |
models/gemini-1.5-flash-8b-exp-0827 | Gemini 1.5 Flash 8B Experimental 0827 | Experimental release (August 27th, 2024) of Gemini 1.5 Flash-8B, our smallest and most cost effective Flash model. Replaced by Gemini-1.5-flash-8b-001 (stable). | 001 | 1000000 | 8192 | [‘generateContent’, ‘countTokens’] |
models/gemini-1.5-flash-8b-exp-0924 | Gemini 1.5 Flash 8B Experimental 0924 | Experimental release (September 24th, 2024) of Gemini 1.5 Flash-8B, our smallest and most cost effective Flash model. Replaced by Gemini-1.5-flash-8b-001 (stable). | 001 | 1000000 | 8192 | [‘generateContent’, ‘countTokens’] |
models/gemini-2.5-pro-exp-03-25 | Gemini 2.5 Pro Experimental 03-25 | Experimental release (March 25th, 2025) of Gemini 2.5 Pro | 2.5-exp-03-25 | 1048576 | 65536 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.5-pro-preview-03-25 | Gemini 2.5 Pro Preview 03-25 | Gemini 2.5 Pro Preview 03-25 | 2.5-preview-03-25 | 1048576 | 65536 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.5-flash-preview-04-17 | Gemini 2.5 Flash Preview 04-17 | Preview release (April 17th, 2025) of Gemini 2.5 Flash | 2.5-preview-04-17 | 1048576 | 65536 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.0-flash-exp | Gemini 2.0 Flash Experimental | Gemini 2.0 Flash Experimental | 2.0 | 1048576 | 8192 | [‘generateContent’, ‘countTokens’, ‘bidiGenerateContent’] |
models/gemini-2.0-flash | Gemini 2.0 Flash | Gemini 2.0 Flash | 2.0 | 1048576 | 8192 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.0-flash-001 | Gemini 2.0 Flash 001 | Stable version of Gemini 2.0 Flash, our fast and versatile multimodal model for scaling across diverse tasks, released in January of 2025. | 2.0 | 1048576 | 8192 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.0-flash-lite-001 | Gemini 2.0 Flash-Lite 001 | Stable version of Gemini 2.0 Flash Lite | 2.0 | 1048576 | 8192 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.0-flash-lite | Gemini 2.0 Flash-Lite | Gemini 2.0 Flash-Lite | 2.0 | 1048576 | 8192 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.0-flash-lite-preview-02-05 | Gemini 2.0 Flash-Lite Preview 02-05 | Preview release (February 5th, 2025) of Gemini 2.0 Flash Lite | preview-02-05 | 1048576 | 8192 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.0-flash-lite-preview | Gemini 2.0 Flash-Lite Preview | Preview release (February 5th, 2025) of Gemini 2.0 Flash Lite | preview-02-05 | 1048576 | 8192 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.0-pro-exp | Gemini 2.0 Pro Experimental | Experimental release (March 25th, 2025) of Gemini 2.5 Pro | 2.5-exp-03-25 | 1048576 | 65536 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.0-pro-exp-02-05 | Gemini 2.0 Pro Experimental 02-05 | Experimental release (March 25th, 2025) of Gemini 2.5 Pro | 2.5-exp-03-25 | 1048576 | 65536 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-exp-1206 | Gemini Experimental 1206 | Experimental release (March 25th, 2025) of Gemini 2.5 Pro | 2.5-exp-03-25 | 1048576 | 65536 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.0-flash-thinking-exp-01-21 | Gemini 2.5 Flash Preview 04-17 | Preview release (April 17th, 2025) of Gemini 2.5 Flash | 2.5-preview-04-17 | 1048576 | 65536 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.0-flash-thinking-exp | Gemini 2.5 Flash Preview 04-17 | Preview release (April 17th, 2025) of Gemini 2.5 Flash | 2.5-preview-04-17 | 1048576 | 65536 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/gemini-2.0-flash-thinking-exp-1219 | Gemini 2.5 Flash Preview 04-17 | Preview release (April 17th, 2025) of Gemini 2.5 Flash | 2.5-preview-04-17 | 1048576 | 65536 | [‘generateContent’, ‘countTokens’, ‘createCachedContent’] |
models/learnlm-1.5-pro-experimental | LearnLM 1.5 Pro Experimental | Alias that points to the most recent stable version of Gemini 1.5 Pro, our mid-size multimodal model that supports up to 2 million tokens. | 001 | 32767 | 8192 | [‘generateContent’, ‘countTokens’] |
models/learnlm-2.0-flash-experimental | LearnLM 2.0 Flash Experimental | LearnLM 2.0 Flash Experimental | 2.0 | 1048576 | 32768 | [‘generateContent’, ‘countTokens’] |
models/gemma-3-1b-it | Gemma 3 1B | nan | 001 | 32768 | 8192 | [‘generateContent’, ‘countTokens’] |
models/gemma-3-4b-it | Gemma 3 4B | nan | 001 | 32768 | 8192 | [‘generateContent’, ‘countTokens’] |
models/gemma-3-12b-it | Gemma 3 12B | nan | 001 | 32768 | 8192 | [‘generateContent’, ‘countTokens’] |
models/gemma-3-27b-it | Gemma 3 27B | nan | 001 | 131072 | 8192 | [‘generateContent’, ‘countTokens’] |
models/embedding-001 | Embedding 001 | Obtain a distributed representation of a text. | 001 | 2048 | 1 | [‘embedContent’] |
models/text-embedding-004 | Text Embedding 004 | Obtain a distributed representation of a text. | 004 | 2048 | 1 | [‘embedContent’] |
models/gemini-embedding-exp-03-07 | Gemini Embedding Experimental 03-07 | Obtain a distributed representation of a text. | exp-03-07 | 8192 | 1 | [‘embedContent’, ‘countTextTokens’] |
models/gemini-embedding-exp | Gemini Embedding Experimental | Obtain a distributed representation of a text. | exp-03-07 | 8192 | 1 | [‘embedContent’, ‘countTextTokens’] |
models/aqa | Model that performs Attributed Question Answering. | Model trained to return answers to questions that are grounded in provided sources, along with estimating answerable probability. | 001 | 7168 | 1024 | [‘generateAnswer’] |
models/imagen-3.0-generate-002 | Imagen 3.0 002 model | Vertex served Imagen 3.0 002 model | 002 | 480 | 8192 | [‘predict’] |
models/gemini-2.0-flash-live-001 | Gemini 2.0 Flash 001 | Gemini 2.0 Flash 001 | 001 | 131072 | 8192 | [‘bidiGenerateContent’, ‘countTokens’] |
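The table above can be regenerated by iterating over `client.models.list()`; a rough sketch under the same assumptions (the `google-genai` SDK and a `GOOGLE_API_KEY` environment variable):

```python
import os
from google import genai

client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])

# Collect the same columns as the table above for every model the API exposes.
rows = []
for m in client.models.list():
    rows.append((
        m.name,
        m.display_name,
        m.version,
        m.input_token_limit,
        m.output_token_limit,
        m.supported_actions,
    ))

print(f"{len(rows)} models listed")
for name, display_name, version, in_lim, out_lim, actions in rows:
    print(f"{name} | {display_name} | {version} | {in_lim} | {out_lim} | {actions}")
```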
Rate limits
- Source: https://ai.google.dev/gemini-api/docs/rate-limits#current-rate-limits
- As of writing, these are the free-tier limits (RPM = requests per minute, TPM = tokens per minute, RPD = requests per day); see the table below and the retry sketch after it.
Model | RPM | TPM | RPD |
---|---|---|---|
Gemini 2.5 Flash Preview 04-17 | 10 | 250,000 | 500 |
Gemini 2.5 Pro Experimental | 5 | 250,000 | 25 |
Gemini 2.5 Pro Preview | — | — | — |
Gemini 2.0 Flash | 15 | 1,000,000 | 1,500 |
Gemini 2.0 Flash Experimental (including image generation) | 10 | 1,000,000 | 1,500 |
Gemini 2.0 Flash-Lite | 30 | 1,000,000 | 1,500 |
Gemini 1.5 Flash | 15 | 1,000,000 | 1,500 |
Gemini 1.5 Flash-8B | 15 | 1,000,000 | 1,500 |
… | … | … | … |
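On the free tier these limits are easy to hit, so calls should be retried with exponential backoff. A rough sketch follows; the string check for `429` / `RESOURCE_EXHAUSTED` is an assumption about how the quota error surfaces, so adapt it to the SDK's actual exception type.

```python
import time

def call_with_backoff(fn, max_retries=5, base_delay=2.0):
    """Retry fn() with exponential backoff when a rate-limit error is raised."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as exc:
            # Assumption: quota errors surface with a 429 status or a
            # RESOURCE_EXHAUSTED message; adjust to the real exception type.
            message = str(exc)
            if ("429" not in message and "RESOURCE_EXHAUSTED" not in message) \
                    or attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Example: retry a Gemini 2.0 Flash call when the 15 RPM quota is exceeded.
# result = call_with_backoff(lambda: client.models.generate_content(
#     model="gemini-2.0-flash", contents="Hello"))
```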
Gemini 2.5 Flash Preview
Google’s first hybrid reasoning model: it supports a 1M-token context window and configurable thinking budgets (a configuration sketch follows the pricing table below).
| Free Tier | Paid Tier, per 1M tokens in USD |
---|---|---|
Input price | Free of charge | 1.00 (audio) |
Output price | Free of charge | Non-thinking: 3.50 |
Context caching price | Not available | 0.25 (audio); $1.00 / 1,000,000 tokens per hour |
Grounding with Google Search | Free of charge, up to 500 RPD | 1,500 RPD (free), then $35 / 1,000 requests |
Text-to-speech (`gemini-2.5-flash-preview-tts`) | Free of charge | 10.00 (output) |
Used to improve our products | Yes | No |
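The thinking budget is set per request. A minimal sketch, again assuming the `google-genai` SDK and a `GOOGLE_API_KEY` environment variable (the budget value of 1024 is arbitrary):

```python
import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])

# Cap the model's internal "thinking" tokens; a budget of 0 disables thinking
# and keeps the request on the cheaper non-thinking output pricing.
response = client.models.generate_content(
    model="gemini-2.5-flash-preview-04-17",
    contents="Explain why the sky is blue in two sentences.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)
print(response.text)
```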