Run DeepSeek R1 locally on macOS
Updated on March 19, 2025   •   Published on January 27, 2025   •   12 min read


Jarek Ceborski

What is DeepSeek R1?

DeepSeek R1 is a first-generation reasoning model developed by DeepSeek AI that excels at complex reasoning tasks, with performance comparable to OpenAI's o1 model. Released in January 2025, the model was trained using innovative reinforcement learning techniques to enhance its reasoning capabilities. The research paper introducing DeepSeek R1 is available at arXiv:2501.12948.

What makes DeepSeek R1 unique is its training approach: it was developed through large-scale reinforcement learning (RL) with minimal reliance on supervised fine-tuning, which led the model to develop powerful reasoning behaviors on its own. While the initial DeepSeek-R1-Zero model (trained purely with RL) showed remarkable reasoning capabilities, it struggled with repetition and readability. The final DeepSeek R1 model addressed these issues by incorporating a small amount of supervised "cold-start" data before RL training.

Benefits of Running Locally

Running DeepSeek R1 locally on your Mac offers several advantages:

  1. Privacy: Your data stays on your device and doesn't go through external servers
  2. Offline Usage: No internet connection required once models are downloaded
  3. Cost-effective: No API costs or usage limitations
  4. Low Latency: Direct access without network delays
  5. Customization: Full control over model parameters and settings

For macOS users, options for running LLMs like DeepSeek R1 include specialized platforms such as Ollama or LM Studio, which make it easy to download and run models without complex setup.

Understanding Model Variants

Distilled Models

Model distillation is a technique where a smaller model (student) is trained to mimic the behavior of a larger model (teacher). In DeepSeek R1's case, the researchers demonstrated that the reasoning patterns of the large 671B parameter model could be effectively transferred to smaller models, making them more accessible while maintaining strong performance. This process allows smaller models to achieve better results compared to models trained directly through reinforcement learning at those sizes.
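The general idea can be illustrated with a toy distillation objective. This is not DeepSeek's actual training code; it is a minimal sketch of the standard knowledge-distillation loss, where the student is trained to match the teacher's softened output distribution rather than a single "correct" token:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities, optionally softened by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's softened distribution and the student's.

    Minimizing this pushes the student to mimic the teacher's full output
    distribution, not just its top-1 prediction."""
    p = softmax(teacher_logits, temperature)   # teacher (target)
    q = softmax(student_logits, temperature)   # student (being trained)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student that matches the teacher has zero loss; a mismatched one does not.
identical = distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])
mismatched = distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0])
```

A higher temperature softens both distributions, exposing the teacher's relative preferences among wrong answers, which is much of what makes distillation work.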

Llama vs. Qwen Base Models

DeepSeek R1 distilled models are based on two different foundation model architectures, each with its own characteristics:

Llama-based models (8B and 70B variants):

  • Built on Meta's Llama 3 architecture, a transformer optimized for compute efficiency
  • Key features:
    • Rotary positional embeddings (RoPE) for encoding token positions
    • Grouped Query Attention (GQA) for faster inference with lower memory use
    • Strong performance on English-language tasks and coding
    • Extensively tested and widely adopted in the open-source community

Qwen-based models (1.5B, 7B, 14B, and 32B variants):

  • Based on Alibaba's Qwen 2.5 architecture, which introduces several refinements of its own
  • Key features:
    • Attention mechanism tuned for both English and Chinese
    • Long context window (up to 32K tokens)
    • Native support for Chinese text segmentation
    • Better handling of mixed Chinese-English content
    • Improved performance on mathematical reasoning tasks
    • Suited to both academic and commercial applications

The choice between Llama and Qwen variants depends on your specific needs:

  • Choose Llama variants for:
    • Primarily English language applications
    • Code generation and analysis
    • Projects requiring extensive community support
    • Applications needing proven stability
  • Choose Qwen variants for:
    • Multilingual applications, especially involving Chinese
    • Mathematical and scientific tasks
    • Projects requiring longer context windows
    • Applications needing balanced performance across different domains
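The decision guidance above can be condensed into a small helper. This is an illustrative sketch, not an official recommendation; the tag names follow Ollama's naming, where the 7B variant is Qwen-based and the 8B variant is Llama-based (scale up to 14b/32b/70b if your hardware allows):

```python
def suggest_r1_variant(primary_language="en", needs_long_context=False,
                       focus="general"):
    """Map the guidance above onto an Ollama model tag.

    Qwen-based distills: 1.5b, 7b, 14b, 32b. Llama-based distills: 8b, 70b.
    The sizes returned here are illustrative defaults."""
    wants_qwen = (primary_language == "zh"
                  or needs_long_context
                  or focus in ("math", "science", "multilingual"))
    # Qwen-based 7B for multilingual/math/long-context needs,
    # Llama-based 8B for English-first and coding work.
    return "deepseek-r1:7b" if wants_qwen else "deepseek-r1:8b"
```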

Hardware Requirements

Here's a breakdown of popular DeepSeek R1 models available on Ollama, along with their approximate sizes and hardware recommendations:

| Model | Parameters | Size | VRAM (approx.) | Recommended Mac |
|---|---|---|---|---|
| deepseek-r1:1.5b | 1.5B | 1.1 GB | ~2 GB | M2/M3 MacBook Air (8GB RAM+) |
| deepseek-r1:7b | 7B | 4.7 GB | ~5 GB | M2/M3/M4 MacBook Pro (16GB RAM+) |
| deepseek-r1:8b | 8B | 4.9 GB | ~6 GB | M2/M3/M4 MacBook Pro (16GB RAM+) |
| deepseek-r1:14b | 14B | 9.0 GB | ~10 GB | M2/M3/M4 Pro MacBook Pro (32GB RAM+) |
| deepseek-r1:32b | 32B | 20 GB | ~22 GB | M2 Max/Ultra Mac Studio |
| deepseek-r1:70b | 70B | 43 GB | ~45 GB | M2 Ultra Mac Studio |
| deepseek-r1:1.5b-qwen-distill-q4_K_M | 1.5B | 1.1 GB | ~2 GB | M2/M3 MacBook Air (8GB RAM+) |
| deepseek-r1:7b-qwen-distill-q4_K_M | 7B | 4.7 GB | ~5 GB | M2/M3/M4 MacBook Pro (16GB RAM+) |
| deepseek-r1:8b-llama-distill-q4_K_M | 8B | 4.9 GB | ~6 GB | M2/M3/M4 MacBook Pro (16GB RAM+) |
| deepseek-r1:14b-qwen-distill-q4_K_M | 14B | 9.0 GB | ~10 GB | M2/M3/M4 Pro MacBook Pro (32GB RAM+) |
| deepseek-r1:32b-qwen-distill-q4_K_M | 32B | 20 GB | ~22 GB | M2 Max/Ultra Mac Studio |
| deepseek-r1:70b-llama-distill-q4_K_M | 70B | 43 GB | ~45 GB | M2 Ultra Mac Studio |

Note: VRAM (Video RAM) usage can vary based on the model, task, and quantization. The above is an approximation. Models ending in q4_K_M are quantized for lower resource usage.
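The sizes in the table follow from simple arithmetic: file size is roughly parameter count times bits per weight. A back-of-the-envelope estimator (a rough rule of thumb, not an exact formula; real files differ because embeddings and some layers keep higher precision, and the `~4.5` bits/weight figure for q4_K_M is an approximation):

```python
def approx_model_size_gb(params_billions, bits_per_weight=4.5):
    """Estimate the on-disk size of a quantized model.

    q4_K_M stores most weights in roughly 4-5 bits; use bits_per_weight=16
    for an unquantized fp16 model."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

def fits_in_memory(params_billions, ram_gb, overhead_gb=2.0,
                   bits_per_weight=4.5):
    """Rough check: weights plus KV-cache/OS overhead must fit in unified memory."""
    needed = approx_model_size_gb(params_billions, bits_per_weight) + overhead_gb
    return needed <= ram_gb
```

For example, the estimator gives about 4 GB for the 7B model, close to the 4.7 GB listed above, and correctly flags that the 70B model will not fit on a 16 GB machine.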

Step-by-Step Guide to Running DeepSeek R1 Locally on macOS Using Ollama and Kerlig

Demo of setting up DeepSeek R1 via Ollama and Kerlig and running it locally on Mac
  1. Install and run Ollama

    • Go to ollama.com and download the macOS installer
    • Install Ollama on your Mac
    • Open Ollama after installation
  2. Add DeepSeek R1 model to Kerlig

    • Download Kerlig and open it
    • Go to Settings → Integrations → Ollama
    • In Add Custom Model section:
      • Enter a display name (e.g., "DeepSeek R1 7B")
      • Enter the model name (e.g., deepseek-r1:7b)
    • Click Add
    • Toggle the switch to enable the model and start the download; you can close Settings while the download continues in the background
  3. Run DeepSeek R1

    • Open Kerlig
    • Enter your prompt - any question you want to ask
    • Select newly added DeepSeek R1 7B as model
    • Click Run or press Enter
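
Kerlig is one convenient front end, but once Ollama is running, the same model can also be queried programmatically through Ollama's local HTTP API (`POST /api/generate` on port 11434). A minimal stdlib-only sketch; running `ask()` requires Ollama to be started and the model already pulled:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(model, prompt):
    """JSON body for a single, non-streaming completion."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model, prompt):
    """Send the prompt to the locally running Ollama server and return the reply."""
    body = json.dumps(build_request(model, prompt)).encode()
    req = request.Request(OLLAMA_URL, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires Ollama running and the model pulled):
# print(ask("deepseek-r1:7b", "Why is the sky blue?"))

payload = build_request("deepseek-r1:7b", "Why is the sky blue?")
```

Because everything runs on localhost, no data leaves your machine even when using the API route.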

Usage Recommendations

For optimal performance with DeepSeek R1 models:

  1. Choose a model size appropriate for your Mac's specifications
  2. Start with smaller models first to test performance
  3. Monitor system resources during initial usage
  4. Ensure adequate free storage space for model downloads
  5. Keep Ollama running in the background while using Kerlig
  6. Avoid adding system prompts - include all instructions within the user prompt
  7. For mathematical problems, include a directive like: "Please reason step by step, and put your final answer within \boxed{}."
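
Recommendations 6 and 7 can be wired up in a few lines. A sketch of building the prompt (all instructions inside the user prompt, no system prompt) and pulling the final `\boxed{...}` value out of the model's reply; the helper names are illustrative:

```python
import re

MATH_DIRECTIVE = ("Please reason step by step, and put your final answer "
                  "within \\boxed{}.")

def build_math_prompt(problem):
    """Fold the directive into the user prompt itself, per the guidance above."""
    return f"{problem}\n\n{MATH_DIRECTIVE}"

def extract_boxed_answer(response):
    """Return the last \\boxed{...} value in the reply, or None if absent."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", response)
    return matches[-1] if matches else None
```

Taking the last match matters because the model may produce intermediate boxed expressions while reasoning before committing to a final answer.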

Licensing

DeepSeek R1 models are licensed under the MIT License and support commercial use, including modifications and derivative works. Note that:

  • Qwen-based distilled models (1.5B, 7B, 14B, 32B) are derived from the Qwen 2.5 series (Apache 2.0 License)
  • The Llama-based 8B model is derived from Llama-3.1-8B-Base
  • The Llama-based 70B model is derived from Llama-3.3-70B-Instruct

Troubleshooting

  • For large models, consider closing other resource-intensive applications
  • If a model fails to load, check your available system memory
  • Ensure Ollama is running before attempting to use models in Kerlig
  • If experiencing issues, try restarting Ollama and Kerlig

Conclusion

Remember that model downloads are persistent - once downloaded, you can use them offline in future sessions.

By following these steps, you can successfully run DeepSeek R1 locally on your Mac.

The Revolutionary Reasoning Capabilities of DeepSeek R1

What truly sets DeepSeek R1 apart from previous generations of language models is its groundbreaking approach to reasoning. Unlike traditional language models that primarily focus on pattern recognition and next-token prediction, DeepSeek R1 was specifically designed to develop genuine reasoning capabilities.

How DeepSeek R1's Reasoning Works

DeepSeek R1's reasoning capabilities stem from its unique training methodology:

  1. Pure Reinforcement Learning Approach: Instead of relying heavily on supervised fine-tuning with human demonstrations, DeepSeek R1 was trained primarily through reinforcement learning from the ground up. This approach allows the model to develop reasoning strategies organically rather than mimicking human-provided solutions.

  2. Chain-of-Thought Emergence: One of the most remarkable aspects of DeepSeek R1 is that it naturally develops chain-of-thought reasoning without explicit instruction. The model breaks down complex problems into logical steps, working through each component systematically before arriving at a conclusion.

  3. Self-Consistency Verification: DeepSeek R1 frequently verifies its own work, checking intermediate results and correcting errors before providing final answers. This self-monitoring process significantly improves accuracy on complex tasks.

  4. Recursive Problem Decomposition: When facing difficult problems, the model automatically decomposes them into simpler sub-problems, solves each individually, and then integrates the results—a strategy closely resembling human expert problem-solving.

  5. Abstract Reasoning Transfer: DeepSeek R1 can transfer reasoning patterns across domains, applying logical frameworks from one field to solve seemingly unrelated problems in another—demonstrating true reasoning rather than mere memorization.
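
The chain-of-thought described above is visible in the raw output: DeepSeek R1, as served by Ollama, typically wraps its reasoning in `<think>...</think>` tags before emitting the final answer. A best-effort splitter (the exact formatting can vary, so treat this as a heuristic):

```python
import re

def split_reasoning(raw_output):
    """Separate the chain-of-thought from the final answer.

    Returns (reasoning, answer); reasoning is None when no <think> block
    is found."""
    match = re.search(r"<think>(.*?)</think>", raw_output, flags=re.DOTALL)
    if not match:
        return None, raw_output.strip()
    reasoning = match.group(1).strip()
    answer = raw_output[match.end():].strip()
    return reasoning, answer
```

This is useful in practice: you can log or display the reasoning separately, or show users only the final answer while keeping the full trace for debugging.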

Game-Changing Advantages Over Traditional Models

DeepSeek R1's reasoning-first approach delivers several revolutionary advantages:

  1. Superior Performance on Novel Problems: While traditional models often struggle with problems they haven't explicitly seen during training, DeepSeek R1 excels at novel challenges by applying fundamental reasoning principles rather than pattern matching.

  2. Mathematical Reasoning Breakthrough: DeepSeek R1 demonstrates exceptional mathematical reasoning; the distilled 14B model achieves 69.7% accuracy on AIME (American Invitational Mathematics Examination) problems, a level previously out of reach for models of this size.

  3. Reduced Hallucination: The step-by-step reasoning process naturally reduces hallucinations, as each logical step constrains the next, preventing the model from drifting into unfounded conclusions.

  4. Explainable Outputs: By showing its reasoning process, DeepSeek R1 provides transparent explanations for its answers, making it more trustworthy and useful for critical applications in fields like medicine, science, and engineering.

  5. Generalization to Unseen Tasks: The reasoning capabilities generalize remarkably well to tasks not included in the training process, suggesting that DeepSeek R1 has developed fundamental cognitive abilities rather than task-specific heuristics.

Real-World Applications of DeepSeek R1's Reasoning

The advanced reasoning capabilities of DeepSeek R1 open up new possibilities across multiple domains:

  • Scientific Research: Formulating hypotheses and designing experiments based on existing literature
  • Education: Creating custom explanations that adapt to a student's level of understanding
  • Software Development: Designing algorithms and debugging complex code issues
  • Financial Analysis: Evaluating investment strategies through multi-step risk assessment
  • Medical Diagnosis: Working through differential diagnoses based on patient symptoms and test results

DeepSeek R1 represents a paradigm shift in AI development—moving from models that primarily predict text to systems that can truly reason through complex problems. This transition from pattern-recognition to genuine reasoning marks a significant step toward more general artificial intelligence.

DeepSeek R1 Reasoning in Action: A Demo!

To fully appreciate the reasoning capabilities of DeepSeek R1, seeing it in action provides the most compelling evidence. The following demonstration showcases how the DeepSeek R1 14b model tackles a research-oriented task requiring analytical thinking and information synthesis.

Demo of DeepSeek R1 14B with reasoning running locally on Mac

Understanding the Demonstration

In this video, we present DeepSeek R1 14b with a research-oriented query:

Query: Help me find information about deep research models (e.g. from OpenAI, Perplexity, Google, DeepSeek) for a blog post that I'm planning to write.

Watch as DeepSeek R1:

  1. Identifies the key components: Recognizing the different research organizations and the types of models to focus on
  2. Structures the information: Creates a logical framework to organize details about various research models
  3. Provides comprehensive analysis: Offers detailed information about each organization's cutting-edge models
  4. Compares and contrasts: Highlights the unique aspects and capabilities of different research models
  5. Analyzes technical details: Explains the architectural innovations that make each model distinctive
  6. Identifies practical applications: Connects model capabilities to real-world use cases
  7. Synthesizes a comprehensive overview: Pulls together the disparate information into a cohesive narrative

What's remarkable about this demonstration is not just the breadth of information DeepSeek R1 provides, but how it structures and reasons through the information. The model doesn't simply list facts; it approaches the task systematically, organizing information in a way that would be genuinely useful for writing a blog post.

Key Observations

Several aspects of this demonstration highlight DeepSeek R1's reasoning capabilities:

  • Structured Knowledge Organization: The model naturally organizes information into logical categories without explicit instructions to do so.
  • Analytical Comparison: Without being prompted, the model identifies meaningful points of comparison between different research models.
  • Nuanced Understanding: The model demonstrates a sophisticated grasp of the technical differences between various research approaches.
  • Contextual Relevance: Information is presented with awareness of what would be most useful for blog post creation.
  • Synthesized Insights: Rather than just presenting isolated facts, the model draws connections and identifies trends across the research landscape.

This demonstration exemplifies why DeepSeek R1 represents a significant advancement in AI reasoning capabilities, showing how even the 14b parameter model can tackle complex research tasks that require sophisticated information processing and organization—capabilities that extend far beyond simple pattern matching.


FAQ: Frequently Asked Questions About DeepSeek R1

Q: Which model size should I choose?

Answer: Consider these recommendations:

  • Basic tasks: 1.5B model
  • Balanced performance: 7B/8B models
  • Maximum capabilities: 14B+ models (if hardware supports it)
  • Start with smaller models and scale up if needed

Q: What are the minimum hardware requirements for DeepSeek R1?

Answer: Requirements vary by model size:

  • 1.5B model: M2/M3 MacBook Air with 8GB RAM
  • 7B/8B models: M2/M3 MacBook Pro with 16GB RAM
  • 14B model: M2/M3 Pro MacBook Pro with 32GB RAM
  • 32B/70B models: M2 Max/Ultra Mac Studio

Q: How much storage space do the models require?

Answer: Storage requirements per model:

  • 1.5B model: ~1.1 GB
  • 7B model: ~4.7 GB
  • 14B model: ~9.0 GB
  • 32B model: ~20 GB
  • 70B model: ~43 GB

Q: What is the response speed of different model sizes?

Answer: Speed varies by model and hardware:

  • 1.5B model: 1-2 seconds on M2/M3 Macs
  • 7B/8B models: 2-4 seconds on 16GB RAM machines
  • 14B+ models: 4-8 seconds on higher-end hardware
  • Quantized (q4_K_M) versions improve speed by 30-50%

Q: What tasks is DeepSeek R1 best suited for?

Answer: DeepSeek R1 excels at:

  • Mathematics and complex problem-solving
  • Coding and algorithm challenges
  • Scientific reasoning
  • Step-by-step explanations
  • Analytical thinking tasks

Q: How do different model sizes compare in performance?

Answer: Performance scales with model size:

  • 1.5B model: 28.9% AIME accuracy, outperforming GPT-4o on certain math benchmarks
  • 7B model: 55.5% AIME accuracy
  • 14B model: 69.7% AIME accuracy, close to larger models
  • All sizes demonstrate strong reasoning capabilities due to distillation technology

Q: How does DeepSeek R1 compare to other local LLMs?

Answer: DeepSeek R1 distinguishes itself by:

  • Outperforming larger models on math and coding tasks
  • Excelling at step-by-step problem solving
  • Strong scientific reasoning capabilities
  • Efficient resource usage through quantization
  • Competitive performance at smaller model sizes

Q: Can DeepSeek R1 run offline?

Answer: Yes, after downloading through Ollama, DeepSeek R1 runs completely offline, providing:

  • Complete privacy
  • No API costs
  • Reliable access
  • Lower latency

Q: Can multiple models run simultaneously?

Answer: While possible, it's recommended to run one model at a time for optimal performance, especially on machines with limited RAM.

Q: Does performance differ on Apple Silicon vs Intel Macs?

Answer: Yes, Apple Silicon Macs (M1/M2/M3) offer significantly better performance due to:

  • Unified memory architecture
  • Optimized ML capabilities
  • Specific optimizations for Apple Silicon
