Attrition Data Analysis

Created by Stuart Miller.

Project Summary

To support an effort to reduce employee attrition, data on employees has been provided. The objective is to analyze the data to determine which factors (if any) correlate with attrition and with monthly salary. The following will be reported:

  • Top 3 factors associated with employee turnover
  • A model for predicting employee attrition
  • A model for predicting monthly salary

Most Important Attrition Factors

The top three factors we identified as associated with attrition are:

  • Job Involvement - Lack of involvement
  • Work-Life Balance - Lack of work-life balance
  • Job Level - Low job levels

Attrition Model

Attrition was modeled with naive Bayes. The target was a model with at least 60% specificity and 60% sensitivity on an 80/20 data split. A large number of variables that appeared to be significant for attrition prediction were included. Based on the performance of the model, the goals for sensitivity and specificity were met.

Model Performance

Specificity   Sensitivity
62%           85%

Validation Results

Attrition Identified   Attrition Missed
23                     14
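
As a quick check on the validation results, the share of actual attrition cases that the model flagged can be computed directly. This is a minimal sketch, reading the validation counts as 23 identified and 14 missed; the function name `detection_rate` is hypothetical and not part of the original analysis.

```python
def detection_rate(identified: int, missed: int) -> float:
    """Return the fraction of actual attrition cases that the model flagged."""
    total = identified + missed
    if total == 0:
        raise ValueError("no attrition cases in the validation set")
    return identified / total


# Validation counts from the report (assumed split of the flattened "2314").
rate = detection_rate(identified=23, missed=14)
print(f"Attrition cases identified: {rate:.0%}")
```

The result, about 62%, matches the reported specificity. That would be consistent with attrition being treated as the negative class when the metrics were computed, although this is an inference and not something the report states.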

Income Model

Income was modeled with linear regression (OLS). From data exploration, we suspected that monthly income was correlated with total working years, age, years at the company, years in current role, and years with current manager. The model shown below was used. Because no interaction terms were included, the model assumes the same slope between MonthlyIncome and TotalWorkingYears for every job level and job role. The categorical variables, JobLevel and JobRole, only shift the intercept of the regression between MonthlyIncome and TotalWorkingYears. Estimates for the model parameters are shown in the table below.

\[\mu \{MonthlyIncome\}=\hat{\beta}_0+\hat{\beta}_1(JobLevel)+\hat{\beta}_2(JobRole)+\hat{\beta}_3(TotalWorkingYears)\]
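
The JobLevel and JobRole terms in the equation above are shorthand for sets of category offsets. As a sketch, assuming standard indicator (dummy) coding with job level 1 and healthcare representative as the reference categories, the model can be written out as follows. The symbols \(\hat{\alpha}_l\), \(\hat{\gamma}_r\), and the indicator function \(I(\cdot)\) are notation introduced here for illustration, and \(r\) ranges over the non-reference job roles.

\[\mu \{MonthlyIncome\}=\hat{\beta}_0+\sum_{l=2}^{5}\hat{\alpha}_l\,I(JobLevel=l)+\sum_{r}\hat{\gamma}_r\,I(JobRole=r)+\hat{\beta}_3(TotalWorkingYears)\]

Each indicator equals 1 when the employee is in that category and 0 otherwise, so every category changes only the intercept while the slope \(\hat{\beta}_3\) is shared.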

From the fitted model, an increase of one year in total working years is associated with an increase in mean monthly income of $44.46, holding job level and job role fixed. The intercept for each job level appears to be significantly different from that of the reference (level 1). The intercept for each job role also appears to be significantly different from that of the reference (healthcare representative), except for manufacturing director and sales executive. There is not sufficient evidence to suggest that the intercepts for manufacturing director and sales executive differ from that of the reference role.

Estimate of Model Parameters

Variable                             Estimate    p-value
(Intercept)                          3561.09     < 2e-16
Total Working Years                  44.46       8.04e-07
Job Level 2                          1742.46     < 2e-16
Job Level 3                          4893.21     < 2e-16
Job Level 4                          8191.81     < 2e-16
Job Level 5                          10960.61    < 2e-16
Job Role: Human Resources            -984.80     0.00084
Job Role: Laboratory Technician      -1163.75    1.75e-08
Job Role: Manager                    3436.58     < 2e-16
Job Role: Manufacturing Director     153.28      0.40279
Job Role: Research Director          3562.03     < 2e-16
Job Role: Research Scientist         -981.12     2.40e-06
Job Role: Sales Executive            -40.24      0.79983
Job Role: Sales Representative       -1220.67    1.71e-06
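
To illustrate how the estimates combine, the sketch below computes the predicted mean monthly income for a given job level, job role, and number of total working years. It assumes the estimates are read as listed above, with job level 1 and healthcare representative as the reference categories (offset 0). The function and constant names are hypothetical and are not taken from the original analysis code.

```python
# Estimates copied from the parameter table; reference categories get offset 0.
INTERCEPT = 3561.09
SLOPE_TOTAL_WORKING_YEARS = 44.46

JOB_LEVEL_OFFSETS = {1: 0.0, 2: 1742.46, 3: 4893.21, 4: 8191.81, 5: 10960.61}

JOB_ROLE_OFFSETS = {
    "Healthcare Representative": 0.0,  # reference role
    "Human Resources": -984.80,
    "Laboratory Technician": -1163.75,
    "Manager": 3436.58,
    "Manufacturing Director": 153.28,
    "Research Director": 3562.03,
    "Research Scientist": -981.12,
    "Sales Executive": -40.24,
    "Sales Representative": -1220.67,
}


def predicted_mean_income(job_level: int, job_role: str, total_working_years: float) -> float:
    """Predicted mean monthly income: intercept + level offset + role offset + slope * years."""
    return (
        INTERCEPT
        + JOB_LEVEL_OFFSETS[job_level]
        + JOB_ROLE_OFFSETS[job_role]
        + SLOPE_TOTAL_WORKING_YEARS * total_working_years
    )


# Example: a level 2 research scientist with 10 total working years.
example = predicted_mean_income(2, "Research Scientist", 10)
print(f"Predicted mean monthly income: ${example:,.2f}")
```

For this example the terms are 3561.09 + 1742.46 - 981.12 + 44.46 × 10, or about $4,767.03. Because the model has no interaction terms, each additional working year adds the same $44.46 regardless of job level or role.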

Additional Information