
Commit b189b61

Fix mistral v0.1 build instructions (#1373)
1 parent 309ab33 commit b189b61

File tree

1 file changed: +8 -8

examples/llama/README.md (+8 -8)
@@ -590,16 +590,16 @@ The `--max_attention_window_size` parameter is set to the `sliding_window` value
 python convert_checkpoint.py --model_dir ./mistral-7b-v0.1 \
                              --output_dir ./tllm_checkpoint_1gpu_mistral \
                              --dtype float16
-trtllm-build --checkpoint_dir ./tllm_checkpoint_2gpu_gptq \
-             --output_dir ./tmp/mistral/7B/trt_engines/fp16/1-gpu/ \
-             --gemm_plugin float16 \
-             --max_input_len 32256
+trtllm-build --checkpoint_dir ./tllm_checkpoint_1gpu_mistral \
+             --output_dir ./tmp/mistral/7B/trt_engines/fp16/1-gpu/ \
+             --gemm_plugin float16 \
+             --max_input_len 32256
 
 # Run Mistral 7B fp16 inference with sliding window/cache size 4096
-python3 run.py --max_output_len=50 \
-               --tokenizer_dir ./tmp/llama/7B/ \
-               --engine_dir=./tmp/llama/7B/trt_engines/fp16/1-gpu/ \
-               --max_attention_window_size=4096
+python ../run.py --max_output_len=50 \
+                 --tokenizer_dir ./mistral-7b-v0.1 \
+                 --engine_dir=./tmp/llama/7B/trt_engines/fp16/1-gpu/ \
+                 --max_attention_window_size=4096
 ```
 
 Note that if you are comparing TRT-LLM with Huggingface,
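For reference, the corrected command sequence as it reads after this change, restated in one place. This is only a restatement of the + lines above; the paths (./mistral-7b-v0.1, ./tllm_checkpoint_1gpu_mistral, ./tmp/mistral/7B/... and ./tmp/llama/7B/...) are reproduced verbatim from the diff and are not verified here.

```bash
# Convert the Mistral 7B v0.1 checkpoint to a TensorRT-LLM checkpoint
python convert_checkpoint.py --model_dir ./mistral-7b-v0.1 \
                             --output_dir ./tllm_checkpoint_1gpu_mistral \
                             --dtype float16

# Build the fp16 engine; the fix points --checkpoint_dir at the Mistral
# checkpoint produced above instead of the unrelated GPTQ checkpoint
trtllm-build --checkpoint_dir ./tllm_checkpoint_1gpu_mistral \
             --output_dir ./tmp/mistral/7B/trt_engines/fp16/1-gpu/ \
             --gemm_plugin float16 \
             --max_input_len 32256

# Run Mistral 7B fp16 inference with sliding window/cache size 4096;
# run.py is referenced one directory up, as in the changed line
python ../run.py --max_output_len=50 \
                 --tokenizer_dir ./mistral-7b-v0.1 \
                 --engine_dir=./tmp/llama/7B/trt_engines/fp16/1-gpu/ \
                 --max_attention_window_size=4096
```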
