Is it possible to predict whether an employee will leave the company or not, before hiring?

_I built a Machine Learning model that predicts employee attrition with 91.85% accuracy_

One of the great challenges for companies is hiring and retaining talent. In an increasingly globalized and competitive world, a good and motivated team means, among other benefits, saving the money and time otherwise spent on dismissals, hiring, and training.

This project uses a fictional dataset created by IBM data scientists. It aims to investigate the factors that lead to employee attrition and to develop a Machine Learning model capable of predicting whether an employee is likely to leave the company. The dataset was taken from the Kaggle portal and can be found at the link:


https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset

I used Python in this project, and the complete code can be found on my GitHub.

EXPLORING THE DATA

This dataset has 35 features and 1,470 rows. I started by looking at the first five rows. About 84% of employees stayed at the company and 16% left. To better understand the characteristics and correlations in each profile, I divided the data into two groups: those who left and those who stayed. Comparing the two groups revealed some relevant patterns:

· Most employees who left were 31 or younger, and attrition is especially high between the ages of 18 and 21;
· The daily rate of those who stayed is higher;
· Attrition increases with the distance from home to work;
· Employees are most likely to leave during their first 7 years at the company. After that they tend to stay;
· Singles are more likely to leave than married or divorced employees;
· The lower the job level, the greater the attrition;
· The Sales Representative position has the highest attrition, with almost half of its employees leaving. On the other hand, Research Director and Manager are the positions with the lowest attrition, which is directly related to the salary range.

FINDING A CORRELATION BETWEEN VARIABLES

Analyzing the correlation between variables gives a broader view of the data and of how the variables relate to each other. I looked at the correlations through a heatmap, where the lighter the color, the more positive the correlation:

· Job level is strongly correlated with total working years;
· Monthly income is strongly correlated with job level and total working years;
· Age is strongly correlated with monthly income.
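The exploration and correlation steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the rows are hypothetical toy values, and the column names (`Age`, `JobLevel`, `MonthlyIncome`, `TotalWorkingYears`, `Attrition`) are assumed to match the Kaggle CSV. With the real file you would load it with `pd.read_csv` instead of building the frame by hand.

```python
import pandas as pd

# Hypothetical toy rows that only mimic a few columns of the Kaggle CSV.
df = pd.DataFrame({
    "Age":               [22, 29, 35, 41, 48, 53, 26, 38],
    "JobLevel":          [1, 1, 2, 3, 4, 5, 1, 2],
    "MonthlyIncome":     [2100, 2600, 5200, 9800, 15400, 19000, 2300, 6100],
    "TotalWorkingYears": [1, 5, 10, 18, 25, 30, 3, 12],
    "Attrition":         ["Yes", "Yes", "No", "No", "No", "No", "Yes", "No"],
})

print(df.head())  # first five rows

# Share of employees who stayed ("No") versus left ("Yes").
attrition_share = df["Attrition"].value_counts(normalize=True)
print(attrition_share)

# Split the data into the two groups: those who left and those who stayed.
left = df[df["Attrition"] == "Yes"]
stayed = df[df["Attrition"] == "No"]

# Compare the average numeric profile of each group.
numeric_cols = list(df.select_dtypes("number").columns)
profile = df.groupby("Attrition")[numeric_cols].mean()
print(profile)

# Pearson correlation matrix of the numeric columns; this is the table
# a heatmap would visualize.
corr = df[numeric_cols].corr()
print(corr.round(2))
```

Passing `corr` to a plotting function such as `seaborn.heatmap` would render the heatmap itself. The plot is left out here so the sketch runs without a display.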


PERFORMING DATA CLEANING

The main objective of this step is to ensure that the data is correct, consistent, and usable: identifying any errors and then correcting, deleting, or manually processing them as necessary so that they do not recur. This dataset has no null or missing values, which saved me a step in the cleaning process. After treating the categorical variables and normalizing the data with MinMaxScaler, the dataset was ready for modeling.

BUILDING, TRAINING AND EVALUATING MACHINE LEARNING MODELS

Since the objective of the project is to predict whether an employee will leave the company or not, this is a classification problem. Among the many Machine Learning techniques for classification, I chose four and compared their results: Logistic Regression Classifier, Random Forest Classifier, K-Nearest Neighbors Classifier, and Artificial Neural Network Classifier.

After training and testing the models, the Logistic Regression Classifier achieved the best performance, with 91.85% accuracy. For context, because about 84% of employees stayed, a model that always predicts "stays" would already reach roughly 84% accuracy on the full dataset, so the confusion matrix is worth reading alongside the headline number. The confusion matrix shows that the model correctly classified the vast majority of employees and misclassified only a small number of them.

RECOMMENDATIONS


After analyzing the data, we can make some recommendations to mitigate employee attrition at this company:

· Create a career plan for new employees, or improve the existing one, especially for the lowest positions;
· Pay special attention to Sales Representatives, as almost half of the employees in this position left the company. In addition to the career plan mentioned above, variable remuneration based on short-, medium-, and long-term goals, together with awards and close support from a manager with a motivating leadership profile, will help contain attrition;
· Offer home office shifts to mitigate the impact of the distance from home to work on attrition;
· Provide training and courses for employees, especially those working at lower levels.

> Thanks for reading!!
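For readers who want to try the cleaning and modeling steps described earlier, here is a minimal sketch using scikit-learn. Several things in it are assumptions rather than the project's actual choices: the toy rows are hypothetical, one-hot encoding stands in for the unspecified treatment of categorical variables, and the split ratio, `random_state`, and `max_iter` are arbitrary. It will not reproduce the 91.85% figure; it only shows how the pieces fit together.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy rows; column names are assumed to match the Kaggle CSV.
df = pd.DataFrame({
    "Age": [22, 24, 27, 29, 31, 33, 36, 38, 41, 44, 47, 50, 52, 55, 26, 30],
    "DistanceFromHome": [25, 20, 18, 22, 15, 3, 5, 2, 8, 4, 1, 6, 3, 2, 24, 19],
    "MaritalStatus": ["Single", "Single", "Single", "Married", "Single",
                      "Married", "Divorced", "Married", "Married", "Divorced",
                      "Married", "Married", "Divorced", "Married", "Single",
                      "Single"],
    "Attrition": ["Yes", "Yes", "Yes", "No", "Yes", "No", "No", "No", "No",
                  "No", "No", "No", "No", "No", "Yes", "Yes"],
})

# Target: 1 = left the company, 0 = stayed.
y = (df["Attrition"] == "Yes").astype(int)

# Assumed treatment of categorical variables: one-hot encoding.
X = pd.get_dummies(df.drop(columns="Attrition"),
                   columns=["MaritalStatus"], dtype=float)

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Fit the scaler on the training data only, then reuse it on the test data.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LogisticRegression(max_iter=1000)
model.fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)

accuracy = accuracy_score(y_test, y_pred)
# Rows are true classes (0 = stayed, 1 = left), columns are predictions.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(f"accuracy: {accuracy:.2%}")
print(cm)
```

Fitting `MinMaxScaler` on the training split only is a design choice made here to keep test data out of the preprocessing. The same pattern works for the other three classifiers by swapping the `model` line.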