Churn Analysis: why is it valuable?
I used an open dataset from Kaggle to explore how churn analysis with machine learning models can not only save a company real money but also provide insights for strategic decision making. Here are the three business questions I would like to discuss:
- What is the value of the model to a business?
- What types of customers are most likely to churn and why?
- What can we do about it?
Value of the model
Churn is hard to predict for two reasons. First, only a minority of customers churn; in this dataset, only 14.5% do. Second, so many variables can factor into a customer’s decision that the answer is rarely straightforward.
The model I picked predicts whether a customer will churn with 93% accuracy. The best part of this model is its low false-negative rate: if a customer will indeed churn, the model identifies that 98% of the time.
Because churners are the minority, out of every 1,000 customers only about 3 will churn without the model flagging them. Without the model, we would likely miss all of the roughly 145 churners in every 1,000 customers.
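To make this arithmetic concrete, here is a minimal Python sketch. The 98% figure is the share of actual churners the model catches (often called recall), so the share it misses is the false-negative rate. The confusion-matrix counts below are illustrative, picked to be roughly consistent with the figures quoted in this post (14.5% churn, 93% accuracy, 98% of churners caught) for 1,000 customers; they are not the model's actual output, and the function name is just a placeholder.

```python
def churn_metrics(tp, fn, fp, tn):
    """Return accuracy, recall, and error rates from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn),               # share of real churners we catch
        "false_negative_rate": fn / (tp + fn),  # share of real churners we miss
        "false_positive_rate": fp / (fp + tn),  # share of stayers wrongly flagged
    }

# Illustrative counts for 1,000 customers (assumed, not the model's real output):
# 145 churners (14.5%), of which 142 are caught and 3 are missed.
metrics = churn_metrics(tp=142, fn=3, fp=67, tn=788)

# Churners missed per 1,000 customers with the model vs. with no model at all.
churners_per_1000 = 1000 * 0.145
missed_with_model = churners_per_1000 * (1 - 0.98)
missed_without_model = churners_per_1000

print(metrics)
print(round(missed_with_model), round(missed_without_model))
```

Note that accuracy alone would be a weak yardstick here: with only 14.5% of customers churning, a model that always predicts "stay" would already be right 85.5% of the time while catching no churners at all.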
Reasons to churn
It is intuitive to say that customers leave because their bills are too high or because customer service is bad. Indeed, we see these differences. For example, Graph 2 shows that customers who churn pay, on average, 15 dollars more every month than customers who stay. Graph 1 shows that among customers who churn, 50% call customer service 1–4 times, whereas among customers who stay, 99% call fewer than 3 times.
However, I wanted to investigate what leads to a high monthly charge and why these customers need to call customer service so many times. I discovered that monthly charge has a positive relationship with data usage, day mins, and overage fee, so I looked at those variables more closely.
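One common way to check for this kind of positive relationship is the Pearson correlation coefficient. The sketch below is only an illustration, not necessarily how the full analysis computed it: the rows are made up, and the column names (`monthly_charge`, `data_usage`, `day_mins`, `overage_fee`) are hypothetical stand-ins for the dataset's fields.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up rows for illustration only; the column names are hypothetical.
customers = [
    {"monthly_charge": 40.0, "data_usage": 0.0, "day_mins": 150.0, "overage_fee": 8.0},
    {"monthly_charge": 55.0, "data_usage": 1.0, "day_mins": 180.0, "overage_fee": 10.0},
    {"monthly_charge": 62.0, "data_usage": 0.5, "day_mins": 240.0, "overage_fee": 12.0},
    {"monthly_charge": 80.0, "data_usage": 3.0, "day_mins": 210.0, "overage_fee": 11.0},
    {"monthly_charge": 95.0, "data_usage": 4.0, "day_mins": 260.0, "overage_fee": 14.0},
]

# Correlate the monthly charge with each candidate driver.
charges = [c["monthly_charge"] for c in customers]
for column in ("data_usage", "day_mins", "overage_fee"):
    values = [c[column] for c in customers]
    print(column, round(pearson(charges, values), 2))
```

A coefficient near +1 means the two columns rise together, near 0 means no linear relationship, and near -1 means one falls as the other rises. With the data in a pandas DataFrame, `DataFrame.corr()` gives the same numbers for all column pairs at once.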
Graph 3 shows that a majority of the customers who churn have 0 or very low data usage. On the other hand, Graph 4 shows that customers who churn use our calling service more.
One explanation is that customers who churn have different needs compared to customers who stay: they use more calling services. However, our phone plans give more discounts to data users, so people who use our calling service heavily end up with a higher monthly bill.
Graph 5 shows that customers who do not have a data plan call, on average, 55 mins more every day, while those who have a data plan call 20 mins less on average.
Another way to test this hypothesis is to collect more demographic information about our customers. For example, older customers tend to use less data but more calling services. It would be interesting to see whether they are also more likely to churn.
Because data usage did show a positive relationship with the monthly bill, I dug deeper into data usage. Based on experience, a high monthly bill should also relate to the data plan.
Graph 5 and Graph 6 show that among customers who do not have a data plan, 16.7% churn, while among those who do, only 8.7% churn (reminder: this dataset has a 14.5% baseline churn rate).
So should we sell our data plan more aggressively to reduce churn? It turns out that this is not the solution. Graph 7 shows that among customers who use data, the churn rate is about the same whether or not they have a data plan.
The answer lies in the roaming service. Graph 8 shows that if a customer uses our roaming service, having a data plan increases their rate of stay. In this dataset, 3,315 of the 3,333 customers use the roaming service.
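The segment comparisons above all boil down to computing a churn rate per group, first by one attribute and then by a combination of attributes. Here is a minimal Python sketch of that idea. The rows are made up for illustration, and the field names (`data_plan`, `uses_roaming`, `churn`) and the helper `churn_rate_by` are hypothetical, not taken from the dataset or the full analysis.

```python
from collections import defaultdict

def churn_rate_by(rows, *keys):
    """Churn rate for each combination of the given keys."""
    churned = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        group = tuple(row[k] for k in keys)
        total[group] += 1
        churned[group] += row["churn"]  # churn is 1 if the customer left, else 0
    return {group: churned[group] / total[group] for group in total}

# Made-up rows for illustration only; field names are hypothetical.
rows = [
    {"data_plan": False, "uses_roaming": True,  "churn": 1},
    {"data_plan": False, "uses_roaming": True,  "churn": 0},
    {"data_plan": False, "uses_roaming": True,  "churn": 0},
    {"data_plan": True,  "uses_roaming": True,  "churn": 0},
    {"data_plan": True,  "uses_roaming": True,  "churn": 0},
    {"data_plan": True,  "uses_roaming": False, "churn": 1},
]

# One attribute at a time, then the two-way split.
print(churn_rate_by(rows, "data_plan"))
print(churn_rate_by(rows, "uses_roaming", "data_plan"))
```

Splitting by more than one attribute matters because a gap that shows up for a single attribute can shrink, vanish, or grow once a second attribute is held fixed, which is what happened here with data usage and roaming.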
So where do we go from here?
We have concluded that there are two types of customers who are most likely to churn:
- customers who did not purchase a data plan, but still use our roaming service.
- customers who use our calling service above the average.
For the next step, we should collect more information about our customers, for example demographic information, to further verify my analysis.
Based on my analysis, I recommend that companies offer a roaming plan or a minutes plan in addition to the data plan. Customers who churn have shown that they have different needs. Therefore, we should provide more choices.
- Full Analysis: Here is the GitHub link to see the full analysis with code.
Note: This post is actually homework for Udacity’s Data Scientist Nanodegree.