English Readme | Chinese Readme
Unlock-DeepSeek is a series of works dedicated to interpreting, extending, and reproducing DeepSeek's innovative achievements on the path to AGI, for a wide audience of AI enthusiasts. It aims to disseminate these innovations and to provide hands-on, from-scratch projects for mastering cutting-edge large language model technologies.
This project is intended for:
- Beginners with a foundation in large language models and university-level mathematics.
- Learners who want a deeper understanding of reasoning models.
- Professionals looking to apply reasoning models in their work.
We break DeepSeek-R1 and its related works down into three major parts:
- MoE
- Reasoning Models
- Key Elements (Data, Infra, ...)
Rather than focusing merely on cost-effectiveness, we emphasize DeepSeek's innovative practices toward AGI, breaking its publicly available work down into digestible segments for a broader audience. We also introduce and compare similar works, such as Kimi-K1.5, to showcase different possibilities on the path to AGI.
Additionally, we will explore reproduction schemes for DeepSeek-R1 by integrating contributions from other communities, providing Chinese-language reproduction tutorials.
- MoE: The Architecture Upheld by DeepSeek
- Deployment of DeepSeek-R1 Distilled Model (Qwen) (self-llm/DeepSeek-R1-Distill-Qwen)
- A Retrospective on the Evolution of MoE
- Implementing MoE from Scratch (tiny-universe/Tiny MoE)
- [Multiple Subsections] Decoding MoE Design in DeepSeek Models (with Implementation)
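To give a first feel for what the MoE sections build up to, here is a minimal NumPy sketch of top-k expert routing: a gate scores the experts for each token, only the k best experts run, and their outputs are combined with softmax weights. This is an illustrative toy (all names and shapes are ours), not DeepSeek's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

# Each "expert" is a single weight matrix here; real MoE experts are FFNs.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_forward(x):
    """Route one token vector x through its top-k experts."""
    logits = x @ gate_w                    # gating score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the k highest-scoring experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                   # softmax over the selected experts only
    # Output: probability-weighted sum of the chosen experts' outputs.
    return sum(p * (x @ experts[i]) for p, i in zip(probs, top))

x = rng.standard_normal(d_model)
y = moe_forward(x)
```

Because only `top_k` of the `n_experts` matrices are ever multiplied per token, compute grows with k rather than with the total parameter count, which is the core economy of MoE.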
- Reasoning Models: The Critical Technology of DeepSeek-R1
- Introduction to Reasoning Models
- LLM and Reasoning
- Visualization of Reasoning Effects
- OpenAI-o1 and Inference Scaling Law
- Qwen-QwQ and Qwen-QVQ
- DeepSeek-R1 and DeepSeek-R1-Zero
- Kimi-K1.5
- Key Algorithmic Principles of Reasoning Models (covering as much as possible of the technology introduced in 2.1 Introduction to Reasoning Models)
- CoT, ToT, GoT
- Monte Carlo Tree Search (MCTS)
- Quick Overview of Reinforcement Learning Concepts
- DPO, PPO, GRPO
- ...
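As a taste of the algorithms listed above, GRPO's key departure from PPO is that it needs no learned value model: it samples a group of answers per prompt and normalizes each answer's reward against its own group's statistics. A minimal sketch (the function name and the 4-sample example are ours, not from the paper):

```python
import numpy as np

def grpo_advantages(rewards):
    """Group-relative advantages: normalize each sampled answer's reward
    by the mean and std of its own group (no critic network needed)."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# One prompt, four sampled completions scored by a reward function:
adv = grpo_advantages([1.0, 0.0, 0.5, 0.5])
```

Answers that beat their group's average get positive advantages and are reinforced; below-average answers are suppressed, which is what drives DeepSeek-R1-Zero's reasoning to emerge from RL alone.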
- [Experimental] Keys: Why DeepSeek Is Cost-Effective and Efficient
Given the limited public documentation, we can only cover this part on a best-effort basis.
- Data
- Infra
- Tricks
- Distillation
- ...
| Name | Role | Bio |
| --- | --- | --- |
XiuTao Luo | Project Leader | SiLiang Lab |
ShuFan Jiang | Project Leader | |
JiaNuo Chen | Responsible for Infra Part | Guangzhou University |
JingHao Lin | Interpreting GRPO Algorithm | Zhipu.AI |
Kaijun Deng | Kimi-K1.5 Paper Explanation | Shenzhen University |
- If you find any issues, please open an issue; if there is no response, contact the caretaker team.
- If you would like to contribute to this project, please submit a pull request; if it goes unanswered, reach out to the caretaker team.
- If you're interested in starting a new project with Datawhale, follow the guidelines provided in the Datawhale Open Source Project Guide.
Our heartfelt thanks go out to the following open-source resources and assistance that made this project possible: DeepSeek, Open-R1, trl, mini-deepseek-r1 (our initial codebase), TinyZero, flash-attn, modelscope, vllm.
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://img.shields.io/badge/license-CC%20BY--NC--SA%204.0-lightgrey" /></a><br />
This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>
.