
Commit 7d2a168

Add link to DeepSeek FastDraft notebook in the DeepSeek notebook (#2722)
Authored by @eaidova
1 parent 19284bf

2 files changed (+2 lines, -1 line)

notebooks/deepseek-r1/README.md (+1 line)

@@ -13,6 +13,7 @@ The tutorial supports different models, you can select one from the provided opt
 * **DeepSeek-R1-Distill-Qwen-7B** is a distilled model based on [Qwen-2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B). The model demonstrates a good balance between mathematical and factual reasoning and can be less suited for complex coding tasks. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) for more info.
 * **DeepSeek-R1-Distil-Qwen-14B** is a distilled model based on [Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) that has great competence in factual reasoning and solving complex mathematical tasks. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B) for more info.
 
+Learn how to accelerate **DeepSeek-R1-Distill-Llama-8B** with **FastDraft** and OpenVINO GenAI speculative decoding pipeline in this [notebook](../../supplementary_materials/notebooks/fastdraft-deepseek/fastdraft_deepseek.ipynb)
 ## Notebook Contents
 
 The tutorial consists of the following steps:

notebooks/deepseek-r1/deepseek-r1.ipynb (+1 line, -1 line)

@@ -106,7 +106,7 @@
 "\n",
 "The tutorial supports different models, you can select one from the provided options to compare the quality of LLM solutions:\n",
 "\n",
-"* **DeepSeek-R1-Distill-Llama-8B** is a distilled model based on [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B), that prioritizes high performance and advanced reasoning capabilities, particularly excelling in tasks requiring mathematical and factual precision. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) for more info.\n",
+"* **DeepSeek-R1-Distill-Llama-8B** is a distilled model based on [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B), that prioritizes high performance and advanced reasoning capabilities, particularly excelling in tasks requiring mathematical and factual precision. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) for more info. Note: this model can also be accelerated with [FastDraft](../../supplementary_materials/notebooks/fastdraft-deepseek/fastdraft_deepseek.ipynb).\n",
 "* **DeepSeek-R1-Distill-Qwen-1.5B** is the smallest DeepSeek-R1 distilled model based on [Qwen2.5-Math-1.5B](https://huggingface.co/Qwen/Qwen2.5-Math-1.5B). Despite its compact size, the model demonstrates strong capabilities in solving basic mathematical tasks, at the same time its programming capabilities are limited. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) for more info.\n",
 "* **DeepSeek-R1-Distill-Qwen-7B** is a distilled model based on [Qwen-2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B). The model demonstrates a good balance between mathematical and factual reasoning and can be less suited for complex coding tasks. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) for more info.\n",
 "* **DeepSeek-R1-Distil-Qwen-14B** is a distilled model based on [Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) that has great competence in factual reasoning and solving complex mathematical tasks. Check [model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B) for more info.\n",
