This repository contains a collection of Jupyter notebooks focused on machine learning methods for causal inference. Each notebook provides applied examples using simulated and real data.
The notebooks are organized as follows:
- OLS and Overfitting: basic functionality of `statsmodels` and `pyfixest` for linear regression, including the case of overfitting.
- Regression with Lasso: prediction of wages using penalized linear regression.
- Classification Models: basics of classification and model evaluation using `sklearn`.
- Clustering: This notebook explores dimensionality reduction with Principal Component Analysis (PCA) and clustering with K-means.
- Imbalanced Data: techniques to handle imbalanced data using `imbalanced-learn`.
- Tree-based Methods: Basic introduction to tree-based methods like Decision Trees and Random Forest.
- Neural Networks: Basic introduction to neural networks using `sklearn` and `pytorch`.
- Regression using ML: This notebook is based on a lab from Chapter 9 of the book Causal ML. The goal is to predict wages using non-linear models and stacking.
- Double/Debiased ML: IN PROGRESS
- Heterogeneous Treatment Effects: IN PROGRESS
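
As a rough illustration of the overfitting theme in the first notebook, the sketch below fits OLS on simulated data with and without a large set of irrelevant regressors and compares in-sample and out-of-sample R². It is not taken from the notebooks: it uses plain `numpy` rather than `statsmodels` or `pyfixest`, and the function names, sample sizes, and data-generating process are all assumptions chosen for the example.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat for outcomes y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def ols_overfitting_demo(n=100, p_noise=80, seed=0):
    """Compare a small and an over-parameterized OLS model (hypothetical setup).

    Returns a dict mapping model name to (in-sample R^2, out-of-sample R^2).
    """
    rng = np.random.default_rng(seed)

    # Simulate 2n observations: the first n for fitting, the rest held out.
    x = rng.normal(size=2 * n)
    noise_regressors = rng.normal(size=(2 * n, p_noise))  # unrelated to y
    y = 1.0 + 2.0 * x + rng.normal(size=2 * n)

    X_small = np.column_stack([np.ones(2 * n), x])
    X_big = np.column_stack([X_small, noise_regressors])

    train, test = slice(0, n), slice(n, 2 * n)
    results = {}
    for name, X in [("small", X_small), ("big", X_big)]:
        # Least-squares fit on the training half only.
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        results[name] = (
            r_squared(y[train], X[train] @ beta),
            r_squared(y[test], X[test] @ beta),
        )
    return results


if __name__ == "__main__":
    for name, (r2_in, r2_out) in ols_overfitting_demo().items():
        print(f"{name}: in-sample R2 = {r2_in:.3f}, out-of-sample R2 = {r2_out:.3f}")
```

Because the models are nested, the larger one cannot have a lower in-sample R²; the held-out R² is what reveals the loss in predictive performance.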