
Running Gemini 2.0 Models on macOS
Google has just released its next generation of AI models, the Gemini 2.0 family. This major update brings significant improvements in performance, capabilities, and cost-effectiveness. Let's explore what these new models offer and how you can use them on your Mac with Kerlig.
The Gemini 2.0 Family
Gemini 2.0 Flash
The workhorse of the family, Flash is designed for high-volume, production-ready tasks. It offers:
- 1 million token context window
- Multimodal capabilities (text, image, audio, video input)
- Native tool use
- Excellent performance/cost ratio
- Best suited for: Chat applications, content generation, and document analysis
Gemini 2.0 Flash-Lite
A new cost-optimized model that maintains quality while being incredibly efficient:
- Same 1 million token context window as Flash
- Multimodal input support
- 25% lower cost than Flash
- Best suited for: High-scale applications where cost optimization is crucial
Gemini 2.0 Pro
The most capable model in the family:
- 2 million token context window (largest available)
- Superior coding performance
- Enhanced reasoning capabilities
- Best suited for: Complex coding tasks and advanced reasoning
Gemini 2.0 Flash Thinking (Experimental)
A specialized model that reveals its reasoning process:
- Enhanced performance on math and science tasks
- Same 1M token context window as Flash
- Shows its step-by-step thinking process
- Best suited for: Complex problem-solving where explainability is crucial
The Flash Thinking model demonstrates remarkable improvements in specific areas:
- 73.3% accuracy on AIME 2024 math problems (compared to 35.5% for standard Flash)
- 74.2% on GPQA Diamond science questions (compared to 58.6% for standard Flash)
- 75.4% on multimodal reasoning tasks (MMMU benchmark)
What makes this model unique is its ability to show its reasoning process, making it especially valuable for:
- Complex mathematical calculations
- Scientific problem-solving
- Educational applications where understanding the thought process is important
- Tasks requiring transparent decision-making
Model Comparison
| Model | Context Window | Input Price* | Output Price* | Best For |
|---|---|---|---|---|
| Gemini 2.0 Flash | 1M tokens | $0.10 | $0.40 | General purpose, production |
| Gemini 2.0 Flash-Lite | 1M tokens | $0.075 | $0.30 | Cost-efficient scaling |
| Gemini 2.0 Pro | 2M tokens | - | - | Complex reasoning & coding |
| Gemini 2.0 Flash Thinking | 1M tokens | - | - | Explainable problem-solving |
*Price per 1M tokens
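Using the per-million-token prices from the table above, you can sketch a quick back-of-the-envelope cost estimate. The prices are copied straight from the table; the model IDs and example workload are illustrative assumptions:

```python
# Per-1M-token prices (USD) taken from the comparison table above.
PRICES = {
    "gemini-2.0-flash":      {"input": 0.10,  "output": 0.40},
    "gemini-2.0-flash-lite": {"input": 0.075, "output": 0.30},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a single request."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: summarizing a 200k-token document into a 2k-token summary with Flash.
cost = estimate_cost("gemini-2.0-flash", 200_000, 2_000)
print(f"${cost:.4f}")  # $0.0208
```

Even a 200k-token input costs only a couple of cents with Flash, which is why these models scale well for document-heavy workloads.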
Benchmark Results
The Gemini 2.0 family shows impressive performance across various benchmarks:
General Performance
- Pro leads with 79.1% on general MMLU-Pro tests
- Flash achieves 77.6% on diverse subject matter tasks
- Even the cost-effective Flash-Lite maintains strong 71.6% performance
Specialized Capabilities
- Mathematics: Pro excels at complex math with 91.8% on MATH benchmark
- Coding: Up to 36% accuracy on LiveCodeBench for Python generation
- Reasoning: Strong performance on GPQA Diamond (up to 64.7% for Pro)
- Multilingual: Excellent language capabilities with Pro reaching 86.5% on Global MMLU
Multimodal Skills
- Strong video analysis (71.9% on EgoSchema)
- Capable image understanding (72.7% on MMMU)
- Competitive audio processing (40.6% on CoVoST2)
These benchmarks demonstrate that while Pro offers the highest performance, both Flash and Flash-Lite maintain strong capabilities across most tasks, making them excellent choices for production use cases.
Community Evaluation
According to LMArena.ai↗, a community-driven AI model evaluation platform with over 2.6 million user votes, Gemini 2.0 models are currently leading the leaderboard:

- Gemini 2.0 Flash Thinking ranks #1 with an Arena Score of 1383
- Gemini 2.0 Pro follows closely at #2 with a score of 1378
- Both models outperform competitors like ChatGPT-4o (1365) and DeepSeek R1 (1362)
This real-world, community-driven evaluation demonstrates that Gemini 2.0 models not only excel in controlled benchmarks but also deliver superior performance in practical, day-to-day use cases.
Running Gemini 2.0 on Your Mac
Kerlig makes it easy to access these powerful models right from your macOS desktop. Here's how to get started:
1. Get Your API Key
First, you'll need to get your Google AI API key:
- Visit Google AI Studio↗
- Log in with your Google account
- Click "Get API Key" and copy it
- Enter the key in Kerlig's settings
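If you'd like to sanity-check your new key outside Kerlig, you can call the Gemini REST API directly. The sketch below builds a `generateContent` request against the public v1beta endpoint; the endpoint path and payload shape follow Google's documented REST API, but verify the exact fields against the official docs before relying on them:

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # paste the key you copied from Google AI Studio
MODEL = "gemini-2.0-flash"

def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a generateContent request for the Gemini REST API (v1beta)."""
    url = (f"https://generativelanguage.googleapis.com/v1beta/"
           f"models/{model}:generateContent?key={api_key}")
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request(MODEL, "Say hello in one word.", API_KEY)
# Uncomment to actually call the API with a valid key:
# resp = urllib.request.urlopen(req)
# print(json.load(resp)["candidates"][0]["content"]["parts"][0]["text"])
```

A valid key returns a JSON response with a `candidates` list; an invalid key returns an HTTP 400 error, which makes this a quick way to confirm your key works.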
2. Choose Your Model

In Kerlig, you can select different Gemini models based on your needs:
- Use Flash for everyday tasks and general content creation
- Choose Flash-Lite when working with high volumes of content
- Select Pro for complex coding tasks or when you need advanced reasoning
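The selection guidance above boils down to a simple rule of thumb, sketched below. The task categories, the `high_volume` flag, and the model IDs are assumptions made for this illustration, not part of any official API:

```python
# Illustrative rule-of-thumb router based on the guidance above.
# Task names and model IDs are assumptions for this sketch.
def pick_model(task: str, high_volume: bool = False) -> str:
    """Pick a Gemini 2.0 model for a task, per the rules of thumb above."""
    if task in ("coding", "reasoning"):
        return "gemini-2.0-pro"          # complex code & advanced reasoning
    if high_volume:
        return "gemini-2.0-flash-lite"   # cost-optimized at scale
    return "gemini-2.0-flash"            # everyday default

print(pick_model("coding"))                       # gemini-2.0-pro
print(pick_model("summarize", high_volume=True))  # gemini-2.0-flash-lite
```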
3. Start Creating
With Kerlig's native macOS interface, you can:
- Use ⌥ Option + Space to quickly access AI anywhere
- Process multiple file types, including PDFs, images, and code
- Get AI assistance in any application
- Create custom presets for repeated tasks
Why Choose Gemini 2.0?
The new Gemini models offer several compelling advantages:
- Unprecedented Cost-Effectiveness: Gemini 2.0 models are dramatically more affordable than competitors, offering up to 90% cost savings:
  - Generating 1M output tokens with GPT-4 costs around $10
  - The same workload with Gemini 2.0 Flash costs just $0.40
  - Flash-Lite reduces output costs even further, to $0.30 per 1M tokens
This cost advantage becomes especially significant for large-scale processing. For example, when analyzing large documents like technical documentation or research papers, Gemini can process 6,000 pages of PDFs with better accuracy than competitors at a fraction of the cost.
- Large Context Window: With up to a 2M-token context window in Pro, you can process massive documents or codebases in one go. That's equivalent to about 100,000 lines of code or 16 novels, and significantly larger than competitors like GPT-4 (128K tokens) and Claude (200K tokens).
- Multimodal Capabilities: All models support multiple input types, including text, images, audio, and video, making them versatile for various use cases.
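The "16 novels" figure above checks out with common rules of thumb: roughly 4 characters per token and roughly 500,000 characters per 80,000-word novel. Both conversion factors are approximations, not exact values:

```python
# Rough sanity check on the "16 novels" estimate above.
CONTEXT_TOKENS = 2_000_000   # Gemini 2.0 Pro context window
CHARS_PER_TOKEN = 4          # common rule of thumb (assumption)
CHARS_PER_NOVEL = 500_000    # ~80,000-word novel (assumption)

total_chars = CONTEXT_TOKENS * CHARS_PER_TOKEN
print(total_chars // CHARS_PER_NOVEL)  # 16
```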
Getting Started
Ready to try Gemini 2.0 models on your Mac? Download Kerlig↗ and experience the power of these new models with our native macOS interface. Our app makes it easy to leverage these advanced AI capabilities in your daily workflow, whether you're coding, writing, or analyzing content.
For more detailed information about using Gemini models with Kerlig, check out our help documentation↗.
References
- Official Gemini 2.0 Announcement↗ - Google's announcement of the Gemini 2.0 family
- Gemini API Documentation↗ - Detailed API documentation and pricing
- Flash Thinking Model↗ - Learn more about the Flash Thinking model
- Kerlig API Key Guide↗ - Step-by-step guide for getting your Gemini API key
- Google AI Studio↗ - Create your API key and experiment with Gemini models
- Developers Blog↗ - Technical details about the Gemini 2.0 family expansion
- LMArena.ai↗ - Community-driven AI model evaluation platform