This project analyzes prosecution outcomes within the Crown Prosecution Service (CPS). Using a dataset of CPS cases from 2014 to 2018, it examines the factors that influence prosecution outcomes and how cases are resolved. Outcomes fall into two categories: convictions and unsuccessful outcomes. Convictions comprise guilty pleas, convictions after trial, and cases proved in the absence of the defendant. Unsuccessful outcomes are all outcomes other than a conviction: discontinuances and withdrawals, discharged committals, dismissals and acquittals, and administrative finalizations. The objective is to extract insights from this data that give a deeper understanding of the CPS and its impact on the administration of justice.
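As a minimal sketch of how the two outcome categories roll up into a conviction rate, the base R snippet below groups outcome counts into convictions and unsuccessful outcomes. The counts and the names `outcomes` and `conviction_types` are hypothetical and chosen for illustration; they are not taken from the CPS dataset or from this project's code.

```r
# Hypothetical outcome counts for a single area and offence type.
# These numbers are illustrative only, not real CPS figures.
outcomes <- c(
  guilty_plea                 = 70,
  conviction_after_trial      = 10,
  proved_in_absence           = 5,
  discontinued_or_withdrawn   = 8,
  discharged_committal        = 1,
  dismissed_or_acquitted      = 4,
  administrative_finalization = 2
)

# Outcome types that count as convictions; everything else is unsuccessful.
conviction_types <- c("guilty_plea", "conviction_after_trial", "proved_in_absence")

convictions  <- sum(outcomes[conviction_types])
unsuccessful <- sum(outcomes[setdiff(names(outcomes), conviction_types)])

# Share of all finalized cases that ended in a conviction.
conviction_rate <- convictions / (convictions + unsuccessful)
print(conviction_rate)
```

With real data, the same grouping would be applied per area, offence type, and month before comparing rates.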
- Dataset Exploration: Explore an extensive dataset of CPS cases covering different crime types, geographical regions, and time periods.
- Statistical Analysis: Use statistical techniques to identify patterns and trends in prosecution outcomes and the factors that significantly affect case resolutions.
- Visualizations: Present the findings through charts, graphs, and maps to make them easier to understand and share.
- Machine Learning Models: Develop predictive models that forecast prosecution outcomes from case attributes to support decision-making.
- Comparative Analysis: Compare prosecution outcomes across jurisdictions, legal systems, or timeframes to identify similarities and differences.
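To illustrate the statistical analysis step listed above, the sketch below runs a chi-squared test of independence between offence type and outcome category using base R's `chisq.test`. The contingency table is made up for illustration, and the choice of this particular test is an assumption rather than the project's documented method.

```r
# Hypothetical contingency table: rows are offence types, columns are
# outcome categories. Counts are illustrative only.
counts <- matrix(
  c(165, 35,
    145, 55,
    178, 22),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    offence = c("burglary", "fraud", "theft"),
    outcome = c("conviction", "unsuccessful")
  )
)

# Chi-squared test of independence between offence type and outcome.
test <- chisq.test(counts)

print(test$statistic)  # test statistic
print(test$parameter)  # degrees of freedom: (rows - 1) * (columns - 1)
print(test$p.value)    # small values suggest outcome depends on offence type
```

A small p-value here only indicates that conviction rates differ between offence types in the table; it says nothing about why they differ.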

Tools and techniques:

- R programming language
- Data cleaning, data manipulation, data wrangling, and feature engineering
- Exploratory data analysis (univariate, bivariate, and multivariate analysis, and visualization)
- Statistical methods (hypothesis testing, covariance, and correlation)
- ggplot2
- tidyverse
- hash
- devtools, and others
- Machine Learning prediction models
- Linear Regression techniques
- K-Means Clustering techniques
- Classification techniques
- Logistic Regression models
- Random Forest techniques
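
As a rough sketch of the logistic regression approach listed above, the example below fits a binomial GLM to aggregated conviction and unsuccessful counts with base R's `glm`. The data frame `cases`, its column names, and all counts are hypothetical; the project's actual features and model specification may differ.

```r
# Hypothetical aggregated counts per offence type and year.
# Column names and numbers are illustrative, not from the CPS dataset.
cases <- data.frame(
  offence      = factor(c("burglary", "burglary", "theft", "theft", "fraud", "fraud")),
  year         = c(2014, 2018, 2014, 2018, 2014, 2018),
  convictions  = c(80, 85, 90, 88, 70, 75),
  unsuccessful = c(20, 15, 10, 12, 30, 25)
)

# Logistic regression on grouped data: the response is a two-column
# matrix of (successes, failures).
fit <- glm(
  cbind(convictions, unsuccessful) ~ offence,
  family = binomial,
  data   = cases
)

# Predicted probability of conviction for each offence type.
new_cases <- data.frame(
  offence = factor(levels(cases$offence), levels = levels(cases$offence))
)
new_cases$p_conviction <- as.numeric(
  predict(fit, newdata = new_cases, type = "response")
)
print(new_cases)
```

Because offence type is the only predictor, each predicted probability equals the pooled conviction rate for that offence. Adding predictors such as year or region would make the model more informative.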