Here we provide examples of Torch-TensorRT compilation of popular computer vision and language models.

Torch-TensorRT provides a backend for the new ``torch.compile`` API released in PyTorch 2.0. In the following examples we describe a number of ways you can leverage this backend to accelerate inference.
Dependencies
------------------------------------
Please install the following external dependencies (assuming you already have the correct ``torch``, ``torch_tensorrt``, and ``tensorrt`` libraries installed; see `dependencies <https://github.com/pytorch/TensorRT?tab=readme-ov-file#dependencies>`_):

.. code-block:: sh

    pip install -r requirements.txt
Model Zoo
------------------------------------

* :ref:`torch_compile_resnet`: Compiling a ResNet model using the Torch Compile Frontend for ``torch_tensorrt.compile``
* :ref:`torch_compile_transformer`: Compiling a Transformer model using ``torch.compile``
* :ref:`torch_compile_advanced_usage`: Advanced usage including making a custom backend to use directly with the ``torch.compile`` API
* :ref:`torch_compile_stable_diffusion`: Compiling a Stable Diffusion model using ``torch.compile``
* :ref:`torch_export_cudagraphs`: Using the Cudagraphs integration with ``ir="dynamo"``
* :ref:`custom_kernel_plugins`: Creating a plugin to use a custom kernel inside TensorRT engines
* :ref:`refit_engine_example`: Refitting a compiled TensorRT Graph Module with updated weights
* :ref:`mutable_torchtrt_module_example`: Compiling, using, and modifying a TensorRT Graph Module with ``MutableTorchTensorRTModule``
* :ref:`vgg16_fp8_ptq`: Compiling a VGG16 model with FP8 and PTQ using ``torch.compile``
* :ref:`engine_caching_example`: Utilizing engine caching to speed up compilation times
* :ref:`engine_caching_bert_example`: Demonstrating engine caching on BERT
* :ref:`torch_export_gpt2`: Compiling a GPT2 model using the AOT workflow (``ir="dynamo"``)
* :ref:`torch_export_llama2`: Compiling a Llama2 model using the AOT workflow (``ir="dynamo"``)
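As a minimal illustration of the ``torch.compile`` backend workflow that the examples above build on, the sketch below compiles a small placeholder model. It assumes ``torch_tensorrt`` is installed and a CUDA-capable GPU is available; on CPU it simply falls back to eager execution so the shapes can still be checked.

```python
import torch

# A small placeholder model; any nn.Module is handled the same way.
model = torch.nn.Sequential(torch.nn.Linear(8, 4), torch.nn.ReLU()).eval()
x = torch.randn(1, 8)

if torch.cuda.is_available():
    import torch_tensorrt  # noqa: F401  (importing registers the "tensorrt" backend)

    model, x = model.cuda(), x.cuda()
    # Route torch.compile through the Torch-TensorRT backend
    model = torch.compile(model, backend="tensorrt")

out = model(x)
print(out.shape)  # torch.Size([1, 4])
```

The first call triggers compilation; subsequent calls with the same input shapes reuse the built TensorRT engine.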
This interactive script is intended as a sample of the Torch-TensorRT workflow with ``torch.compile`` on a GPT2 model. A sample output is featured below:

.. code-block:: python

    # Pytorch model generated text: The parallel programming paradigm is a set of programming languages that are designed to be used in parallel. The main difference between parallel programming and parallel programming is that

    # =============================

    # TensorRT model generated text: The parallel programming paradigm is a set of programming languages that are designed to be used in parallel. The main difference between parallel programming and parallel programming is that
This script illustrates the Torch-TensorRT workflow with the dynamo backend on the popular Llama2 model. The output sentences should look like the following:

.. code-block:: python

    # Prompt : What is dynamic programming?

    # =============================
    # Pytorch model generated text: Dynamic programming is an algorithmic technique used to solve complex problems by breaking them down into smaller subproblems, solving each subproblem only once, and

    # =============================
    # TensorRT model generated text: Dynamic programming is an algorithmic technique used to solve complex problems by breaking them down into smaller subproblems, solving each subproblem only once, and
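The definition in the generated text above (break a problem into subproblems and solve each subproblem only once) can be made concrete with a short memoized Fibonacci sketch. This is illustrative only and not part of the example scripts:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    # Each subproblem fib(k) is computed once, then reused from the cache.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(30))                    # 832040
print(fib.cache_info().currsize)  # 31 distinct subproblems cached (n = 0..30)
```

Without memoization the same call would recompute overlapping subproblems exponentially many times; caching reduces it to one evaluation per distinct ``n``.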