# Multimodal assistant with Phi-4-mini-multimodal and OpenVINO
Phi-4-multimodal-instruct is a lightweight, open multimodal foundation model that processes text, image, and audio inputs and generates text outputs. It is a 5.6B-parameter multimodal transformer built on the pretrained Phi-4-mini backbone language model, extended with vision and speech encoders and adapters.
In this tutorial we will explore how to run the Phi-4-multimodal model using [OpenVINO](https://github.com/openvinotoolkit/openvino) and optimize it using [NNCF](https://github.com/openvinotoolkit/nncf).
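The optimization step relies on NNCF's weight compression, which stores LLM weights as low-bit integers plus per-group scales. The toy sketch below illustrates the idea behind group-wise symmetric INT4 quantization in pure Python; it is an illustration only, not the actual NNCF API.

```python
# Toy illustration of group-wise symmetric INT4 weight quantization,
# the kind of compression NNCF applies to LLM weights.

def quantize_int4(weights, group_size=4):
    """Quantize a flat list of float weights to signed INT4 per group.

    Each group of `group_size` values shares one float scale; quantized
    values are integers clamped to [-8, 7] (the signed 4-bit range).
    """
    quantized, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scale = max(abs(w) for w in group) / 7 or 1.0  # guard against all-zero groups
        scales.append(scale)
        quantized.append([max(-8, min(7, round(w / scale))) for w in group])
    return quantized, scales

def dequantize_int4(quantized, scales):
    """Reconstruct approximate float weights from INT4 values and group scales."""
    return [v * s for group, s in zip(quantized, scales) for v in group]

weights = [0.12, -0.56, 0.33, 0.07, 1.40, -0.95, 0.02, 0.61]
q, s = quantize_int4(weights)
restored = dequantize_int4(q, s)
```

The reconstruction is lossy but close: each weight is recovered to within roughly half a quantization step, while storage drops from 32 bits to about 4 bits per weight (plus one scale per group).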
## Notebook contents
The tutorial consists of the following steps:
- Install requirements
- Convert and optimize the model
- Run OpenVINO model inference
- Launch the interactive demo
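The conversion step typically uses the `optimum-cli export openvino` command from optimum-intel. The snippet below only assembles that command (model support for Phi-4-multimodal and the output directory name are assumptions); run it with `subprocess.run(cmd)` once the requirements are installed, noting that it downloads the full checkpoint.

```python
# Sketch of the "Convert and optimize" step: build the optimum-intel export
# command that converts a Hugging Face checkpoint to OpenVINO IR with INT4
# weight compression. The command is assembled but intentionally not executed.

model_id = "microsoft/Phi-4-multimodal-instruct"
output_dir = "phi4-multimodal-ov"  # hypothetical local directory name

cmd = [
    "optimum-cli", "export", "openvino",
    "--model", model_id,
    "--weight-format", "int4",   # NNCF INT4 weight compression
    "--trust-remote-code",       # the model uses custom modeling code
    output_dir,
]
print(" ".join(cmd))
```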
In this demonstration, you'll create an interactive chatbot that can answer questions about the content of the provided images and audio.