Heart Failure Prediction

Introduction

Heart failure is a common event caused by cardiovascular diseases (CVDs), and this dataset contains 12 features that can be used to predict mortality by heart failure.

Most cardiovascular diseases can be prevented by addressing behavioural risk factors such as tobacco use, unhealthy diet and obesity, physical inactivity and harmful use of alcohol using population-wide strategies.

People with cardiovascular disease or who are at high cardiovascular risk (due to the presence of one or more risk factors such as hypertension, diabetes, hyperlipidaemia or already established disease) need early detection and management using counselling and medicines, as appropriate.

Dataset

The dataset is taken from Kaggle and contains 918 observations, each with 12 features.

Some of the variable distributions show unreasonable values, which must be corrected before training the model.
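As an illustration of this kind of correction, here is a minimal Python sketch that flags unreasonable values and replaces them with a mean. The 1.5 × IQR rule, the function name `replace_outliers_with_mean`, and the sample numbers are all assumptions made for the example; the project does not state which rule it used to detect outliers, and the mean here is computed over the remaining (non-outlier) values.

```python
import statistics

def replace_outliers_with_mean(values, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with the mean of the inliers.

    The IQR rule and k=1.5 are assumptions for this sketch, not the project's rule.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    inliers = [v for v in values if low <= v <= high]
    fill = statistics.mean(inliers)
    return [v if low <= v <= high else fill for v in values]

# Made-up numbers for illustration only (not taken from heart.csv).
sample = [210, 195, 0, 230, 250, 180, 205, 603, 220, 240]
cleaned = replace_outliers_with_mean(sample)
print(cleaned)
```

In this toy column the values 0 and 603 fall outside the fences and are replaced; every other value is left unchanged.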

Data Processing

Statistical analysis was performed to find the outliers, which were corrected by replacing them with the mean of the column. The association between the categorical variables was then checked with Pearson's chi-squared test to find the most important features.
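To make the chi-squared check concrete, the following Python sketch computes Pearson's statistic and its degrees of freedom for two categorical columns. The function name `chi_square_statistic` and the toy data are hypothetical, and no continuity correction is applied, so the value can differ from what a statistics package reports for a 2×2 table.

```python
from collections import Counter

def chi_square_statistic(xs, ys):
    """Return (statistic, degrees_of_freedom) for Pearson's chi-squared test of independence."""
    n = len(xs)
    observed = Counter(zip(xs, ys))   # joint counts per (row, column) cell
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n   # count expected under independence
            stat += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Toy data for illustration only (not taken from heart.csv).
sex = ["M"] * 6 + ["F"] * 4
disease = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
stat, dof = chi_square_statistic(sex, disease)
print(stat, dof)
```

With one degree of freedom, a statistic above about 3.84 would reject independence at the 5% level; the toy data above stays well below that threshold.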

For the non-categorical variables, the relationship was checked using Welch's two-sample t-test.
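For reference, Welch's test compares two group means without assuming equal variances. The sketch below computes the t statistic and the Welch-Satterthwaite degrees of freedom in plain Python; the function name `welch_t` and the two sample groups are assumptions for illustration, and the p-value lookup is left out.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t, degrees_of_freedom) for Welch's two-sample t-test."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Squared standard error of each group mean (sample variance / group size).
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, dof

# Made-up values for a numeric variable, split by outcome group.
group_with_disease = [120, 130, 140, 150, 160]
group_without_disease = [150, 160, 170, 180]
t, dof = welch_t(group_with_disease, group_without_disease)
print(round(t, 3), round(dof, 2))
```

The sign of t only reflects the order of the two groups; the magnitude, together with the degrees of freedom, determines significance.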

Models

After the tests, the most relevant variables seem to be:

Sex;
ChestPainType;
MaxHR;
ST_Slope.

To validate the results of the statistical tests, two models are used for the prediction.

Logistic Regression

The logistic regression confirmed the relevance of some variables found with the statistical analysis:

Sex
ChestPainType
FastingBS
ExerciseAngina

However, it also found that the other variables are not relevant for the prediction.

Oldpeak is significant (the low tails of its distribution masked this in the earlier tests)
MaxHR is not very relevant (the difference in the means is not so important to the model)

ANOVA

The difference between the null deviance and the residual deviance shows how our model performs against the null (intercept-only) model.

The residual deviance drops as each variable is added, and a drop that is large relative to the degrees of freedom used indicates that the variable is significant for the model.
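The deviance comparison can be sketched in a few lines of Python. This example relies on one fact: a logistic regression with a single categorical predictor fits each group's observed proportion, so its residual deviance can be computed without an optimizer. The function names and the toy data are hypothetical, and this is not the project's own analysis code.

```python
import math

def binomial_deviance(y, p):
    """-2 * log-likelihood of 0/1 outcomes y under predicted probabilities p."""
    return -2 * sum(math.log(pi if yi == 1 else 1 - pi) for yi, pi in zip(y, p))

def null_and_residual_deviance(groups, y):
    """Deviance of the intercept-only model and of a model with one categorical predictor."""
    n = len(y)
    overall_rate = sum(y) / n
    null_dev = binomial_deviance(y, [overall_rate] * n)
    # Per-group event rate: the fitted probability for a single categorical predictor.
    counts = {}
    for g, yi in zip(groups, y):
        events, total = counts.get(g, (0, 0))
        counts[g] = (events + yi, total + 1)
    rate = {g: events / total for g, (events, total) in counts.items()}
    resid_dev = binomial_deviance(y, [rate[g] for g in groups])
    return null_dev, resid_dev

# Toy data for illustration only (not taken from heart.csv).
sex = ["M"] * 6 + ["F"] * 4
disease = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
null_dev, resid_dev = null_and_residual_deviance(sex, disease)
print(round(null_dev, 4), round(resid_dev, 4), round(null_dev - resid_dev, 4))
```

The drop `null_dev - resid_dev` is the likelihood-ratio statistic for the added variable, which is compared against a chi-squared distribution with as many degrees of freedom as the variable adds parameters.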

Bugs and feature requests

Have a bug or a feature request? Please first read and search for existing and closed issues. If your problem or idea is not addressed yet, please open a new issue.

Authors

@Torbidos7

Thanks

Thank you for coming 😝

Copyright and license

This project is licensed under the MIT License.
