Employee Attrition Analysis | Bineet's Portfolio

About the Project

This project addresses the critical issue of employee attrition using Data Science and Analytics. By analyzing IBM HR datasets, we identify patterns leading to attrition and predict potential cases, enabling proactive retention measures.

The solution achieves 99.8% accuracy in predicting attrition using advanced machine learning techniques.

Project Goals

Analyze employee profiles and identify attrition patterns
Build predictive models to flag at-risk employees
Provide actionable insights for HR decision-making
Compare performance of multiple ML algorithms

Dataset

IBM HR Analytics Dataset with 24,000 employee records containing:

Categorical Features: Department, JobRole, OverTime, BusinessTravel, etc.
Numerical Features: Age, MonthlyIncome, JobSatisfaction, YearsAtCompany, etc.
Target Variable: Attrition (Yes/No)

Dataset Link

Methodology

Data Preprocessing

Exploratory Data Analysis with visualization
Handling missing values and feature engineering
Label encoding and one-hot encoding for categorical variables
SMOTE for handling class imbalance

Machine Learning Models

Evaluated 8+ algorithms including Logistic Regression, Random Forest, and XGBoost
Feature selection using RFE and SequentialFeatureSelector
Hyperparameter tuning for optimal performance

Results

Train Accuracy: 99.98%
Test Accuracy: 99.81%
ROC-AUC Score: 0.99

XGBoost outperformed other models with near-perfect accuracy in identifying attrition cases.