# Objective
This project builds a prediction model, using various regression models, to predict the score of a student based on various factors. The model is then served through a Flask app that is hosted and deployed with AWS Elastic Beanstalk and AWS CodePipeline.

The code for the app can be divided into the following parts:
- Components: data ingestion, data transformer, model trainer
- Pipelines: predict and train pipelines, which call the components above to carry out training and prediction
- Pickle files: the model and the preprocessor, saved after model building and training so they can be reused without retraining
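
The pickle files listed above can be written and read with Python's standard `pickle` module. The following is a minimal sketch, not the project's actual code: the helper names `save_object` and `load_object`, the `artifacts/model.pkl` path, and the dictionary standing in for a trained model are all assumptions made for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def save_object(file_path, obj):
    """Serialize obj to file_path, creating parent folders if needed (hypothetical helper)."""
    path = Path(file_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(obj, f)


def load_object(file_path):
    """Load a previously pickled object from file_path (hypothetical helper)."""
    with Path(file_path).open("rb") as f:
        return pickle.load(f)


# A plain dict stands in for the trained model so the sketch needs no ML library.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "artifacts" / "model.pkl"  # assumed location
    stand_in_model = {"name": "placeholder", "coefficients": [0.5, 1.5]}
    save_object(model_path, stand_in_model)
    restored_model = load_object(model_path)
```

Saving the preprocessor the same way lets the predict pipeline apply the same transformations that were fitted during training. Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.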

The code requires the following dataset: https://github.com/ritikdhame/Student_score_prediction_app_regression_models/tree/main/notebook/data

The dataset contains student scores (math_score, reading_score, writing_score) and corresponding variables such as gender, race_ethnicity, parental_level_of_education, lunch, and test_preparation_course.

To run the code, you need to install the following libraries:
- pandas
- numpy
- seaborn
- matplotlib
- scikit-learn
- catboost
- xgboost
- Flask

You can install them using pip or conda.

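
The following regression models are used:
- Lasso & Ridge
- Decision Tree
- Random Forest
- XGBoost
- AdaBoost
- CatBoost

R-squared (R²) is used as the evaluation metric for the models. It is a score rather than a loss: higher values indicate a better fit.
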
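
To make the R-squared comparison concrete, here is a small pure-Python sketch of the metric, R² = 1 - SS_res / SS_tot, and of picking the best-scoring candidate. The function name `r_squared`, the model labels, and all numbers are made-up toy values for illustration, not results from this project. In practice `sklearn.metrics.r2_score` computes the same quantity.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all true values are identical")
    return 1 - ss_res / ss_tot


# Toy values only: true scores and the predictions of two hypothetical models.
y_true = [60.0, 70.0, 80.0, 90.0]
candidate_predictions = {
    "model_a": [62.0, 69.0, 81.0, 88.0],  # close to the true scores
    "model_b": [75.0, 75.0, 75.0, 75.0],  # always predicts the mean
}

scores = {name: r_squared(y_true, preds) for name, preds in candidate_predictions.items()}
best_name = max(scores, key=scores.get)  # the highest R-squared wins
```

A model that always predicts the mean of the true values scores 0, and a perfect model scores 1, which is why the candidate with the highest value is the one to keep.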

To run the code, you need to execute the following steps:
- Install the required libraries (pip install -r requirements.txt).
- Run application.py to start the app locally, from a terminal suited to your device.
- To run it on AWS, fork the repo, host it on Elastic Beanstalk, and use CodePipeline to load the repo.

The code is commented and documented for better understanding and readability.