|
23 | 23 | "- [Convert model with OpenVINO](#Convert-model-with-OpenVINO)\n",
|
24 | 24 | " - [Convert model using Optimum Intel](#Convert-model-using-Optimum-Intel)\n",
|
25 | 25 | " - [Compress model weights](#Compress-model-weights)\n",
|
| 26 | + " - [Use optimized models provided on HuggingFace Hub](#use-optimized-models-provided-on-huggingface-hub)\n", |
26 | 27 | "- [Run OpenVINO model inference](#Run-OpenVINO-model-inference)\n",
|
27 | 28 | "- [Interactive demo](#Interactive-demo)\n",
|
28 | 29 | "\n",
|
|
215 | 216 | "### Compress model weights\n",
|
216 | 217 | "[back to top ⬆️](#Table-of-contents:)\n",
|
217 | 218 | "\n",
|
218 |
| - "For reducing model memory consumption we will use weights compression. The [Weights Compression](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/weight-compression.html) algorithm is aimed at compressing the weights of the models and can be used to optimize the model footprint and performance of large models where the size of weights is relatively larger than the size of activations, for example, Large Language Models (LLM). Compared to INT8 compression, INT4 compression improves performance even more, but introduces a minor drop in prediction quality. We will use [NNCF](https://github.com/openvinotoolkit/nncf) integration to `optimum-cli` tool for weight compression.\n" |
| 219 | + "To reduce model memory consumption, we will use weight compression. The [Weights Compression](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/weight-compression.html) algorithm compresses model weights and can be used to optimize the footprint and performance of large models where the size of the weights is relatively larger than the size of the activations, for example, Large Language Models (LLMs). Compared to INT8 compression, INT4 compression improves performance even more, but introduces a minor drop in prediction quality. We will use the [NNCF](https://github.com/openvinotoolkit/nncf) integration with the `optimum-cli` tool for weight compression.\n", |
| 220 | + "\n", |
| 221 | + "### Use optimized models provided on HuggingFace Hub\n", |
| 222 | + "[back to top ⬆️](#Table-of-contents:)\n", |
| 223 | + "\n", |
| 224 | + "For a quick start, OpenVINO provides a [collection](https://huggingface.co/collections/OpenVINO/image-generation-67697d9952fb1eee4a252aa8) of optimized models that are ready to use with OpenVINO GenAI. You can download them using the following command:\n", |
| 225 | + "\n", |
| 226 | + "```bash\n", |
| 227 | + "huggingface-cli download <model_id> --local-dir <output_dir>\n", |
| 228 | + "```\n" |
219 | 229 | ]
|
220 | 230 | },
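The compression step above is driven through `optimum-cli`. A minimal sketch of how the export command described in this section could be assembled, assuming the standard `optimum-cli export openvino` interface with its `--weight-format` flag; the model id and output directory here are example placeholders, not values taken from the notebook:

```python
# Sketch: build the `optimum-cli export openvino` command that applies
# INT4 weight compression, as described in the section above.
model_id = "black-forest-labs/FLUX.1-schnell"  # example model (assumption)
output_dir = "FLUX.1-schnell-int4"             # example output dir (assumption)

cmd = [
    "optimum-cli", "export", "openvino",
    "--model", model_id,
    "--weight-format", "int4",  # "int8" trades speed for better quality
    output_dir,
]
print(" ".join(cmd))
```

Running the joined command in a shell performs the conversion and compression in one step.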
|
221 | 231 | {
|
222 | 232 | "cell_type": "code",
|
223 | 233 | "execution_count": 4,
|
224 | 234 | "id": "add19dd3",
|
225 |
| - "metadata": {}, |
| 235 | + "metadata": { |
| 236 | + "test_replace": { |
| 237 | + "use_preconverted = widgets.Checkbox(value=True": "use_preconverted = widgets.Checkbox(value=False" |
| 238 | + } |
| 239 | + }, |
226 | 240 | "outputs": [
|
227 | 241 | {
|
228 | 242 | "data": {
|
|
241 | 255 | }
|
242 | 256 | ],
|
243 | 257 | "source": [
|
| 258 | + "use_preconverted = widgets.Checkbox(value=True, description=\"Use preconverted model\", disabled=False)\n", |
| 259 | + "\n", |
244 | 260 | "to_compress = widgets.Checkbox(\n",
|
245 | 261 | " value=True,\n",
|
246 | 262 | " description=\"Weight compression\",\n",
|
247 | 263 | " disabled=False,\n",
|
248 | 264 | ")\n",
|
249 | 265 | "\n",
|
250 |
| - "to_compress" |
| 266 | + "visible_widgets = [to_compress]\n", |
| 267 | + "\n", |
| 268 | + "if \"schnell\" in model_selector.value:\n", |
| 269 | + " visible_widgets.append(use_preconverted)\n", |
| 270 | + "\n", |
| 271 | + "options = widgets.VBox(visible_widgets)\n", |
| 272 | + "\n", |
| 273 | + "options" |
251 | 274 | ]
|
252 | 275 | },
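The cell above only shows the "Use preconverted model" checkbox when the selected model is the `schnell` variant. Stripped of the ipywidgets plumbing, the visibility rule can be sketched as a plain function; the option labels and the substring check mirror the notebook's `"schnell" in model_selector.value` condition, and the model ids below are illustrative:

```python
# Sketch of the option-visibility logic: the preconverted-model option is
# offered only for FLUX.1-schnell, the variant with preconverted weights
# on the Hub (assumption mirrored from the notebook's check).
def visible_options(selected_model: str) -> list:
    options = ["Weight compression"]
    if "schnell" in selected_model:
        options.append("Use preconverted model")
    return options

print(visible_options("black-forest-labs/FLUX.1-schnell"))
print(visible_options("black-forest-labs/FLUX.1-dev"))
```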
|
253 | 276 | {
|
|
287 | 310 | "from cmd_helper import optimum_cli\n",
|
288 | 311 | "\n",
|
289 | 312 | "if not model_dir.exists():\n",
|
290 |
| - " optimum_cli(model_id, model_dir, additional_args=additional_args)" |
| 313 | + " if not use_preconverted.value:\n", |
| 314 | + " optimum_cli(model_id, model_dir, additional_args=additional_args)\n", |
| 315 | + " else:\n", |
| 316 | + " ov_model_id = f\"OpenVINO/{model_id.split('/')[-1]}-{model_dir.name.lower()}-ov\"\n", |
| 317 | + " !huggingface-cli download {ov_model_id} --local-dir {model_dir}" |
291 | 318 | ]
|
292 | 319 | },
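The download branch above derives the Hub repo id of the preconverted model from the original model id and the local directory name, following the `OpenVINO/<model-name>-<precision>-ov` naming used by the collection. A standalone sketch of that string construction, with example inputs (the model id and directory name are assumptions, matching the notebook's pattern):

```python
# Sketch: derive the preconverted OpenVINO repo id, as in the cell above.
model_id = "black-forest-labs/FLUX.1-schnell"  # example model (assumption)
dir_name = "INT4"                              # model_dir.name in the notebook

ov_model_id = f"OpenVINO/{model_id.split('/')[-1]}-{dir_name.lower()}-ov"
print(ov_model_id)  # OpenVINO/FLUX.1-schnell-int4-ov
```

The resulting id is what gets passed to `huggingface-cli download`.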
|
293 | 320 | {
|
|
367 | 394 | ]
|
368 | 395 | },
|
369 | 396 | {
|
| 397 | + "attachments": {}, |
370 | 398 | "cell_type": "markdown",
|
371 | 399 | "id": "cab6790e",
|
372 | 400 | "metadata": {},
|
|
397 | 425 | ]
|
398 | 426 | },
|
399 | 427 | {
|
| 428 | + "attachments": {}, |
400 | 429 | "cell_type": "markdown",
|
401 | 430 | "id": "75aeb184",
|
402 | 431 | "metadata": {},
|
|