This project is a medical chatbot designed to assist users by answering medical queries using a retrieval-augmented generation (RAG) framework. The chatbot leverages the Pinecone vector database for efficient document retrieval, the Mistral LLM from Hugging Face for natural language generation, and the Gale Encyclopedia of Medicine as the primary knowledge base. It provides accurate, contextual, and relevant answers to medical queries in real time.
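At a high level, each question is embedded, the closest passages are retrieved from Pinecone, and the Mistral model answers from that retrieved context. The sketch below illustrates this flow without any framework; the index name, embedding model, and Mistral checkpoint are illustrative assumptions and may differ from what the project actually uses.

```python
# Minimal sketch of the retrieval-augmented answer flow (illustrative only).
import os

from huggingface_hub import InferenceClient
from pinecone import Pinecone
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")  # assumed embedding model
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("medical-chatbot")  # hypothetical index name
llm = InferenceClient(
    model="mistralai/Mistral-7B-Instruct-v0.3",  # assumed checkpoint
    token=os.environ["HUGGINGFACEHUB_API_TOKEN"],
)

def answer(question: str) -> str:
    """Embed the question, retrieve similar passages, and generate an answer."""
    # 1. Embed the user question and fetch the closest chunks from Pinecone.
    query_vector = embedder.encode(question).tolist()
    result = index.query(vector=query_vector, top_k=3, include_metadata=True)
    context = "\n\n".join(match.metadata["text"] for match in result.matches)

    # 2. Ask the LLM to answer strictly from the retrieved context.
    prompt = (
        "Answer the medical question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm.text_generation(prompt, max_new_tokens=300)

if __name__ == "__main__":
    print(answer("What are the symptoms of anemia?"))
```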
- Interactive Chat Interface: User-friendly chat interface built with HTML, Bootstrap, and jQuery.
- Retrieval-Augmented Generation (RAG): Combines the power of document retrieval with generative AI for accurate responses.
- Pinecone Vector Database: Ensures fast and efficient similarity searches within the knowledge base.
- Mistral LLM: A cutting-edge language model from Mistral AI, accessed through Hugging Face.
- Knowledge Base: Uses the Gale Encyclopedia of Medicine to provide authoritative information.
- Backend:
  - Python (Flask Framework)
  - Pinecone Vector Database
  - Mistral LLM (via Hugging Face)
- Frontend:
  - HTML5, CSS3 (Bootstrap Framework)
  - JavaScript (jQuery and AJAX)
- Environment Configuration:
  - `.env` file for API keys and other sensitive configuration
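The keys in `.env` are loaded into the environment at startup. Below is a minimal sketch, assuming `python-dotenv` is listed in `requirements.txt`; the variable names match the setup instructions later in this README.

```python
# Minimal sketch: load API keys from .env into the process environment
# (assumes the python-dotenv package is installed).
import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file in the current working directory

PINECONE_API_KEY = os.environ["PINECONE_API_KEY"]
HUGGINGFACEHUB_API_TOKEN = os.environ["HUGGINGFACEHUB_API_TOKEN"]
```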
Ensure you have the following installed or available:
- Python 3.10 or later
- pip (Python package manager)
- Pinecone API key
- Hugging Face API key
- Clone the Repository

  ```bash
  git clone <repository-url>
  cd <repository-directory>
  ```
- Create a Virtual Environment

  ```bash
  python -m venv env
  source env/bin/activate
  ```
- Install Dependencies

  ```bash
  pip install -r requirements.txt
  ```
- Set Environment Variables

  Create a `.env` file in the root directory and add:

  ```
  PINECONE_API_KEY=<your-pinecone-api-key>
  HUGGINGFACEHUB_API_TOKEN=<your-huggingface-api-token>
  ```
- Run the Scripts in the `src` Folder

  - Navigate to the `src` folder:

    ```bash
    cd src
    ```

  - Run the helper script to load and process the data:

    ```bash
    python helper.py
    ```

    It loads the PDF, splits the text into chunks, and downloads the Hugging Face embedding model (see the ingestion sketch after these steps).

  - Run the indexing script to build the vector database:

    ```bash
    python store_index.py
    ```

    It stores the embeddings in a vector index on the Pinecone server (see the indexing sketch after these steps).
- Run the Flask Application

  ```bash
  python app.py
  ```

  A minimal endpoint sketch appears after these steps.
- Interface

  Once the Flask server is running, open the chat interface in your browser.
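The ingestion sketch referred to above: this is the kind of pipeline `helper.py` implements, assuming LangChain's community loaders and a sentence-transformers embedding model. The data path, chunk sizes, and model name are illustrative; the actual script may differ.

```python
# Illustrative ingestion pipeline (what helper.py broadly does):
# load the PDF, split it into chunks, and download an embedding model.
# Library choices, the data/ path, and chunk sizes are assumptions.
from langchain_community.document_loaders import DirectoryLoader, PyPDFLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter

# 1. Load every PDF in the data/ directory.
documents = DirectoryLoader("data/", glob="*.pdf", loader_cls=PyPDFLoader).load()

# 2. Split the text into overlapping chunks suited to similarity search.
chunks = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=20
).split_documents(documents)

# 3. Download the Hugging Face embedding model used to encode the chunks.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
```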
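The indexing sketch referred to above: `store_index.py` stores the chunk embeddings in a Pinecone index along the following lines. The index name, cloud region, and the 384-dimensional MiniLM embeddings are assumptions.

```python
# Illustrative indexing step (what store_index.py broadly does):
# create a Pinecone index and upsert the chunk embeddings.
import os

from pinecone import Pinecone, ServerlessSpec
from sentence_transformers import SentenceTransformer

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Create the index once; 384 matches the MiniLM embedding size (assumption).
if "medical-chatbot" not in pc.list_indexes().names():
    pc.create_index(
        name="medical-chatbot",
        dimension=384,
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
index = pc.Index("medical-chatbot")

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
chunks = ["Anemia is a condition in which ...", "..."]  # chunks produced by helper.py
index.upsert(vectors=[
    {"id": str(i), "values": embedder.encode(text).tolist(), "metadata": {"text": text}}
    for i, text in enumerate(chunks)
])
```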
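The endpoint sketch referred to above: one possible shape for the Flask app behind the jQuery/AJAX chat interface. The route name, form field, template name, port, and the `answer()` helper (from the RAG sketch near the top of this README) are all hypothetical.

```python
# Minimal sketch of a Flask app serving the chat interface.
# Route, form field, template, and port are illustrative assumptions.
from flask import Flask, render_template, request

from rag import answer  # hypothetical module wrapping the RAG sketch above

app = Flask(__name__)

@app.route("/")
def home():
    # Serve the Bootstrap/jQuery chat page (hypothetical template name).
    return render_template("chat.html")

@app.route("/get", methods=["POST"])
def chat():
    # The AJAX call posts the user's message; return the RAG answer as text.
    question = request.form["msg"]
    return answer(question)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080, debug=True)
```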
- Conversation History:
  Support for maintaining conversation history, passed explicitly as part of the input, will be implemented. This will allow:
  - Seamless continuation of user-bot interactions.
  - Retrieval of past interactions for context-aware responses.
  - An enhanced user experience, particularly for long or multi-turn conversations.
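One possible, purely illustrative shape for this feature: the history is passed explicitly with each request and prepended to the question before retrieval and generation. The `answer()` helper is the one from the RAG sketch above; the final design may differ.

```python
# Illustrative only: pass the conversation history explicitly and fold it
# into the question so responses can be context-aware.
def answer_with_history(question: str, history: list[tuple[str, str]]) -> str:
    past_turns = "\n".join(f"User: {q}\nBot: {a}" for q, a in history)
    contextual_question = f"{past_turns}\nUser: {question}" if history else question
    reply = answer(contextual_question)  # answer() from the RAG sketch above
    history.append((question, reply))
    return reply
```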
This project is licensed under the MIT License.
You are free to use, modify, and distribute this software under the terms of the MIT License.
For more details, refer to the LICENSE file.