English Readme | Chinese Readme
Unlock-DeepSeek is a series of works dedicated to interpreting, extending, and reproducing DeepSeek's innovative achievements on the path to AGI, for a wide audience of AI enthusiasts. It aims to disseminate these innovations and to provide hands-on, from-scratch projects for mastering cutting-edge large language model technologies.
This project is intended for:
- Beginners with a foundation in large language models and university-level mathematics.
- Learners who want a deeper understanding of reasoning models.
- Professionals looking to apply reasoning models in their work.
We break DeepSeek-R1 and its related works down into three major parts:
- MoE
- Reasoning Models
- Key Elements (Data, Infra, ...)
Rather than focusing merely on cost-effectiveness, we emphasize DeepSeek's innovative practices toward AGI, breaking its publicly available work down into digestible segments for a broader audience. We also introduce and compare similar works, such as Kimi-K1.5, to showcase different possibilities on the path to AGI.
Additionally, we will explore reproduction schemes for DeepSeek-R1 by integrating contributions from other communities, providing Chinese-language reproduction tutorials.
- MoE: The Architecture Upheld by DeepSeek
- Deployment of DeepSeek-R1 Distilled Model (Qwen) (self-llm/DeepSeek-R1-Distill-Qwen)
- A Retrospective on the Evolution of MoE
- Implementing MoE from Scratch (tiny-universe/Tiny MoE)
- [Multiple Subsections] Decoding MoE Design in DeepSeek Models (with Implementation)
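To give a first feel for what the MoE sections build up to, here is a minimal NumPy sketch of top-k expert routing: a gate scores the experts for each token, only the k best experts run, and their outputs are combined with softmax weights. This is an illustrative toy (all names and shapes are ours), not DeepSeek's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

# Each "expert" is a single weight matrix here; real MoE experts are FFNs.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_forward(x):
    """Route one token vector x through its top-k experts."""
    logits = x @ gate_w                    # gating score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the k highest-scoring experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                   # softmax over the selected experts only
    # Output: probability-weighted sum of the chosen experts' outputs.
    return sum(p * (x @ experts[i]) for p, i in zip(probs, top))

x = rng.standard_normal(d_model)
y = moe_forward(x)
```

Because only `top_k` of the `n_experts` matrices are ever multiplied per token, compute grows with k rather than with the total parameter count, which is the core economy of MoE.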
- Reasoning Models: The Critical Technology of DeepSeek-R1
- Introduction to Reasoning Models
- LLM and Reasoning
- Visualization of Reasoning Effects
- OpenAI-o1 and Inference Scaling Law
- Qwen-QwQ and Qwen-QVQ
- DeepSeek-R1 and DeepSeek-R1-Zero
- Kimi-K1.5
- Key Algorithmic Principles of Reasoning Models (covering as much as possible of the technology introduced in 2.1 Introduction to Reasoning Models)
- CoT, ToT, GoT
- Monte Carlo Tree Search (MCTS)
- Quick Overview of Reinforcement Learning Concepts
- DPO, PPO, GRPO
- ...
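As a taste of the algorithms listed above, GRPO's key departure from PPO is that it needs no learned value model: it samples a group of answers per prompt and normalizes each answer's reward against its own group's statistics. A minimal sketch (the function name and the 4-sample example are ours, not from the paper):

```python
import numpy as np

def grpo_advantages(rewards):
    """Group-relative advantages: normalize each sampled answer's reward
    by the mean and std of its own group (no critic network needed)."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# One prompt, four sampled completions scored by a reward function:
adv = grpo_advantages([1.0, 0.0, 0.5, 0.5])
```

Answers that beat their group's average get positive advantages and are reinforced; below-average answers are suppressed, which is what drives DeepSeek-R1-Zero's reasoning to emerge from RL alone.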
- [Experimental] Keys: Why DeepSeek Is Cost-Effective and Efficient
Given the limited public documentation, we can only cover this part on a best-effort basis.
- Data
- Infra
- Tricks
- Distillation
- ...
| Name | Role | Bio |
| --- | --- | --- |
XiuTao Luo | Project Leader | SiLiang Lab |
ShuFan Jiang | Project Leader | |
JiaNuo Chen | Responsible for Infra Part | Guangzhou University |
JingHao Lin | Interpreting GRPO Algorithm | Zhipu.AI |
Kaijun Deng | Kimi-K1.5 Paper Explanation | Shenzhen University |
- If you find any issues, please open an issue; if there is no response, contact the caretaker team.
- If you would like to contribute to this project, please submit a pull request; if it goes unanswered, reach out to the caretaker team.
- If you're interested in starting a new project with Datawhale, follow the guidelines provided in the Datawhale Open Source Project Guide.
Our heartfelt thanks go out to the following open-source resources and assistance that made this project possible: DeepSeek, Open-R1, trl, mini-deepseek-r1 (our initial codebase), TinyZero, flash-attn, modelscope, vllm.
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://img.shields.io/badge/license-CC%20BY--NC--SA%204.0-lightgrey" /></a><br />
This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>
.