vladimir-abramov/data_science_projects

Yandex.Praktikum Data Science Projects

This repository contains a portfolio of data science projects completed by Vladimir Abramov during the training courses at Yandex Praktikum.

Certificate of Completion of the Course (English Version)

| Project | Description | Libraries used |
| --- | --- | --- |
| Churn-prediction model | Based on subscribers' contract, personal, and content-consumption data, a model was developed to predict customer churn for a telecommunications operator. | Python, Pandas, NumPy, Matplotlib, Seaborn, Phik, PipeLine, Time, StandardScaler, OneHotEncoder, SciKitLearn, CatBoost, LightGBM |
| Age-prediction model by photo | Based on a dataset of photos labeled with the person's real age, a model was developed to estimate a buyer's approximate age from a photo taken in the checkout zone, in order to target product offers and to monitor cashier compliance when selling alcohol. | Python, Pandas, NumPy, Matplotlib, Seaborn, Keras, ResNet, Adam |
| Startup Investments | Data on funds and investments is analyzed by writing SQL queries against a database. | SQL |
| Toxicity-detection model | Based on comments labeled for toxicity, a model was built to detect toxic comments on products in an online store. | Python, Pandas, NumPy, Time, Tqdm, SpaCy, NLTK, SciKitLearn, LightGBM, XGBoost |
| Forecasting taxi orders | Based on historical data on taxi orders at airports, a model was built that predicts the number of taxi orders for the next hour, so that more drivers can be attracted during peak demand. | Python, Pandas, NumPy, Time, Tqdm, Statsmodels, Matplotlib, Seaborn, SciKitLearn, CatBoost, LightGBM, XGBoost |
| A model for determining the cost of cars | Based on historical data on the technical characteristics, equipment, and prices of cars from a used-car sales service, a model for a mobile application was built that estimates the market value of a car. | Python, Pandas, NumPy, Time, Matplotlib, Seaborn, SciKitLearn, CatBoost, LightGBM |
| An algorithm for protecting customers' personal data | Based on the personal data of an insurance company's customers, an algorithm was developed that encrypts the data using matrix operations. | Python, Pandas, NumPy, SciKitLearn |
| Recovering gold from ore | Based on data on the parameters of gold extraction and purification from gold-bearing ore, a model was built that predicts the gold recovery coefficient, to optimize production and avoid launching an enterprise with unprofitable characteristics. | Python, Pandas, NumPy, Matplotlib, Seaborn, SciKitLearn, CatBoost |
| The selection of a location for the new oil well | Based on data from oil samples in three regions, each with 10,000 deposits with measured oil quality and reserve volume, a model was built that determines the region where extraction will bring the highest profit. | Python, Pandas, NumPy, SciKitLearn |
| Prediction of customer churn for the bank | Based on historical data on customer behavior and contract terminations, a model was built to predict the probability that a customer will leave the bank. | Python, Pandas, NumPy, Matplotlib, Seaborn, SciKitLearn |
| Recommendation of tariffs | Based on data on the activity of a mobile operator's subscribers, a model was built to select an appropriate tariff. | Python, Pandas, NumPy, Matplotlib, Seaborn, SciKitLearn |
| The forecast of sales in an online computer game store | Based on sales data from an online computer game store, the patterns that determine a game's success were identified, in order to bet on potentially popular products and plan advertising campaigns. | Python, Pandas, NumPy, SciPy, Matplotlib, Seaborn |
| Statistical analysis of data | Based on data on the activity of a mobile operator's subscribers, their behavior was investigated, and a conclusion was drawn about the profitability of the operator's tariff plans. | Python, Pandas, NumPy, SciPy, Matplotlib, Seaborn, Functools |
| Investigation of apartment sale announcements | Based on data from Yandex Real Estate (a multi-year archive of listings for apartments for sale in St. Petersburg and neighboring settlements), the dependence of the price per square meter on the characteristics of the apartment, the characteristics of the building, the distance to the center of St. Petersburg, and the distance to the nearest airport was studied. | Python, Pandas, NumPy, Geopy, Matplotlib, Seaborn |
| Investigation of borrowers' reliability | Based on bank statistics on customers' creditworthiness, the influence of marital status and number of children on timely loan repayment was studied. | Python, Pandas |
| Yandex.Music | Based on data from a music streaming service, hypotheses about musical preferences in St. Petersburg and Moscow were tested. | Python, Pandas |
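
The personal-data protection project in the list above is described only as "data encryption using matrix operations". As an illustration of what such a scheme might look like, and not a copy of the project's actual code, the sketch below multiplies a feature matrix by a random invertible key matrix. The synthetic data, the function names (`make_key`, `encrypt`, `decrypt`, `fit_predict`), and the choice of a least-squares model are all assumptions made for this example.

```python
import numpy as np


def make_key(n_features, rng):
    """Draw random square matrices until one is well-conditioned (invertible)."""
    while True:
        key = rng.normal(size=(n_features, n_features))
        if np.linalg.cond(key) < 1e6:
            return key


def encrypt(features, key):
    """Obfuscate the features by right-multiplying with the key matrix."""
    return features @ key


def decrypt(encrypted, key):
    """Recover the original features using the inverse of the key."""
    return encrypted @ np.linalg.inv(key)


def fit_predict(features, target):
    """Fit ordinary least squares (no intercept) and return in-sample predictions."""
    weights, *_ = np.linalg.lstsq(features, target, rcond=None)
    return features @ weights


# Synthetic stand-in for customer data (hypothetical, not the project's dataset).
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 4))
target = features @ np.array([1.0, -2.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=50)

key = make_key(features.shape[1], rng)
encrypted = encrypt(features, key)
restored = decrypt(encrypted, key)

# Least-squares predictions depend only on the column space of the features,
# which an invertible transformation leaves unchanged.
plain_pred = fit_predict(features, target)
encrypted_pred = fit_predict(encrypted, target)
```

Under these assumptions, `restored` matches `features` up to floating-point error, and a linear model trained on `encrypted` gives the same predictions as one trained on the raw features. A transformation like this obscures the values but is not cryptographically secure encryption.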