Neural Networks

Churn Project

Diana Mbiad-Galan; March 2, 2024

Contents / Agenda

  • Executive Summary
  • Business Problem Overview and Solution Approach
  • EDA Results
  • Data Preprocessing
  • Model Performance Summary
  • Appendix

  • Executive Summary

    EDA Key Findings: Customer churn at the bank is influenced by a multifaceted set of factors rather than a single predominant cause. This is evidenced by the low correlation coefficients across individual variables, suggesting that a more nuanced model considering the interplay of various elements may be necessary for accurate churn prediction. The factors that may impact churn the most are:
  • Geographic location: a disproportionate number of customers from Germany cease their relationship with the bank compared to those from Spain and France, hinting at regional differences in customer satisfaction or service perception.
  • Gender: a higher churn rate is observed among female customers, which could point to differing financial needs or experiences that are not being equally met across genders.
  • Customer engagement: inactive members show a greater tendency to leave the bank, underscoring the importance of regular interactions and consistent value delivery to retain clientele.
  • Age: older customers are more likely to churn, possibly reflecting changing financial priorities or service expectations with age.

    Recommendation: Adopt a holistic approach to retention strategies, considering demographic, behavioral, and regional variables to effectively reduce customer attrition.

  • Executive Summary

    Model Key Findings: I conducted an analysis of various neural network configurations to develop a robust churn prediction model for the bank. The objective was to identify a model that not only accurately predicts customer churn but also generalizes well to unseen data. The following models were evaluated: Neural Network with SGD Optimizer; Neural Network with Adam Optimizer; Neural Network with Adam Optimizer and Dropout; Neural Network with Balanced Data (SMOTE) and SGD Optimizer; Neural Network with Balanced Data (SMOTE) and Adam Optimizer; Neural Network with Balanced Data (SMOTE), Adam Optimizer, and Dropout.
  • Models utilizing SMOTE to balance the dataset demonstrated superior recall, crucial for identifying at-risk customers. The introduction of the Adam optimizer significantly improved model recall compared to SGD. The inclusion of dropout regularization further enhanced the model's ability to generalize, as indicated by the reduced gap between training and validation performance.
  • Final Model Selection: The Neural Network with SMOTE, Adam Optimizer, and Dropout emerged as the top-performing model. It achieved the highest recall rates while maintaining a satisfactory balance between training and validation performance, indicating a good fit with controlled overfitting. This model's ability to generalize was further evidenced by its performance on the test data, with a relatively low rate of false negatives.

  • Recommendations: Deploy the Neural Network with SMOTE, Adam Optimizer, and Dropout for the churn prediction task. Monitor the model's performance regularly to ensure consistency and adjust as necessary.

  • Business Problem Overview and Solution Approach

    The objective is to develop a predictive model using neural networks to accurately forecast the likelihood of a customer discontinuing their relationship with the bank and switching to a competing service provider within the next six months. Understanding the key factors that influence a customer's decision to leave is crucial: this insight will enable the bank's management to focus on improving specific aspects of their service to enhance customer retention. The ultimate goal of this project is to leverage machine learning to identify at-risk customers proactively, allowing the bank to implement targeted retention strategies effectively.

    EDA: Univariate Analysis

    EDA Results – Credit Score
  • Median and Distribution: The median credit score is slightly above 650, indicating the central tendency of the dataset. The histogram shows a normal-like distribution with most of the data clustered around this median value.
  • Lower-End Outliers: There are outliers at the lower end of the credit score range, as indicated by individual points below the lower whisker of the box plot. This suggests that a small subset of customers have credit scores significantly lower than the majority.
  • High-End Peak: There is an unusual peak at the highest end of the credit score range on the histogram. This could indicate a large number of customers with excellent credit scores, or it could be a result of data collection peculiarities.

    EDA Results – Age
  • Age Distribution: The histogram illustrates that the age distribution is right-skewed, indicating a larger number of younger customers compared to older ones. The majority of customers are clustered in the 30-40 age range, with the count gradually decreasing for higher ages.
  • Median Age: The bank's median customer is in their late thirties.
  • Outliers: There are numerous outliers on the upper end of the age range in the box plot, suggesting that there are several customers who are significantly older than the main customer group.

    EDA Results – Balance
  • Substantial Zero Balance Count: There is a significant peak at the zero balance mark on the histogram. This indicates a large number of customers have a zero or very low current account balance. This could represent a specific customer segment, such as new accounts or dormant accounts.

  • EDA Results – Estimated Salary
  • Uniform Distribution: The histogram indicates a uniform distribution of estimated salaries across the entire range. This uniformity suggests that within this dataset, the salaries are spread out evenly from the lowest to the highest values.
  • Median Salary: The box plot shows that the median salary (indicated by the triangle inside the box) is centrally located, which is consistent with the uniform distribution observed in the histogram. This would imply that the median salary is roughly equal to the mean salary, a characteristic of a uniform distribution.
  • No Significant Outliers: Neither the box plot nor the histogram shows outliers.

    EDA Results – Exited, Geography, and Gender

    The bar chart indicates that 79.6% of customers have remained with the bank, while 20.4% have exited, suggesting that the bank maintains a majority of its customers but also faces a significant churn rate that could be of concern. The bar chart of the bank's customers by geography shows the majority located in France (5014 customers), with a relatively equal but smaller number of customers in Germany (2509 customers) and Spain (2477 customers). The bar chart shows a slight gender disparity among the bank's customers, with males (5457) slightly outnumbering females (4543).

    EDA Results – Tenure

    The bar chart presents the distribution of customers' tenure with the bank, showing a relatively even distribution for durations from 1 to 9 years, but significantly fewer customers with a tenure of 0 or 10 years.
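Counts and shares like these typically come straight from pandas `value_counts`. A minimal sketch on a small hypothetical stand-in frame (the column names follow the data dictionary in the appendix; the five rows are illustrative, not the bank's data):

```python
import pandas as pd

# Hypothetical mini-frame standing in for the bank's churn dataset;
# column names follow the appendix data dictionary.
df = pd.DataFrame({
    "Geography": ["France", "Germany", "Spain", "France", "France"],
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Exited": [0, 1, 0, 0, 1],
})

# Absolute counts behind a geography bar chart.
geo_counts = df["Geography"].value_counts()

# Percentage shares behind an Exited bar chart.
exit_share = df["Exited"].value_counts(normalize=True) * 100

print(geo_counts)
print(exit_share.round(1))
```

On the real dataset the same two calls would reproduce the 5014/2509/2477 geography counts and the 79.6%/20.4% retained/exited split quoted above.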

    EDA Results – Number of Products, Credit Card, and Active Members

    The majority of customers have between 1-2 products with the bank, and the majority of customers have a credit card. There is a nearly even split between active (5151) and inactive (4849) members, indicating a balanced distribution of engagement among the bank's customers.

    EDA: Bivariate Analysis

    EDA Results – Correlation
  • Most variables have very low correlation with each other, with no strong direct relationships evident.
  • The most notable correlation is between 'Age' and 'Exited'. Additionally, there is a negative correlation between 'NumOfProducts' and 'Balance', suggesting that customers with more products tend to have lower balances.
  • Overall, the lack of high correlation coefficients suggests that no single factor strongly predicts customer churn on its own, and a combination of factors may need to be considered for more accurate predictions.

    Exited & Location, Gender, Has Credit Card, and Active Member

    A higher proportion of the bank's customers in Germany have exited compared to those in Spain and France, indicating geographic variation in customer churn. A higher proportion of the bank's customers who have exited are female compared to male. Having a credit card does not significantly affect the proportion of customers who have exited the bank. Inactive members have a higher rate of exiting the bank compared to active members.

    Exited & Credit Score, Age, and Tenure

    The boxplot comparison of Credit Score and Exited shows that both customers who stayed and those who exited have a similar range of credit scores, with the median score being roughly equal. Customers who exited tend to be older, as indicated by a higher median age and more upper-age outliers, compared to those who stayed. Customers who exited the bank have a wider range of tenure with a similar median tenure compared to those who stayed.

    Exited & Balance and Number of Products

    Account balance does not appear to be a strong predictor of customer churn, since both median values and distribution ranges are quite similar across both groups. The boxplot shows that customers who stayed and those who exited typically have a similar number of products from the bank; however, there are a few outliers among customers who stayed, indicating some with a higher number of products.

    Data Preprocessing

  • Train-validation-test split
  • Dummy variable creation for Geography
  • Data normalization: since the numerical values are on different scales, I scaled all the numerical values to bring them to the same scale.
  • There were no missing values.
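A minimal sketch of this preprocessing pipeline, assuming scikit-learn and illustrative column names and split ratios (the report does not state the exact ratios; the tiny frame below is a stand-in, not the bank's data):

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy stand-in for the churn data; feature names follow the appendix.
df = pd.DataFrame({
    "CreditScore": [650, 720, 580, 699, 610, 750, 640, 700],
    "Geography": ["France", "Germany", "Spain", "France",
                  "Germany", "Spain", "France", "Germany"],
    "Age": [35, 42, 29, 51, 38, 45, 33, 60],
    "Exited": [0, 1, 0, 1, 0, 0, 1, 1],
})

# Dummy variables for the categorical Geography column.
X = pd.get_dummies(df.drop(columns="Exited"), columns=["Geography"], drop_first=True)
y = df["Exited"]

# Train / validation / test split (60 / 20 / 20 here; exact ratios assumed).
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=1)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=1)

# Scale numeric columns to a common scale; fit on the training set only
# so no information leaks from validation/test into the scaler.
num_cols = ["CreditScore", "Age"]
scaler = StandardScaler().fit(X_train[num_cols])
for part in (X_train, X_val, X_test):
    part[num_cols] = scaler.transform(part[num_cols])
```

Fitting the scaler on the training split alone is the standard way to keep the validation and test sets truly unseen.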

  • Model Performance Summary

    Model Evaluation Criteria:
  • Business impact of false negatives and false positives: False Negatives (FN): predicting a customer will not churn when they actually do can be more costly than false positives, because the bank loses the opportunity to retain the customer through targeted interventions. False Positives (FP): predicting a customer will churn when they actually won't can lead to unnecessary spending on retention efforts, which is less costly but still undesirable. I decided to use recall for model performance, as this metric indicates the proportion of actual positives that were correctly identified. My goal is to minimize the false negatives. High recall is crucial in this scenario because it means the model is effective at identifying customers at risk of churn, allowing the bank to intervene.
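Recall and the four confusion-matrix cells can be computed with scikit-learn. A small hypothetical example (the labels below are made up purely to illustrate the metric, where 1 = churned and 0 = stayed):

```python
from sklearn.metrics import confusion_matrix, recall_score

# Hypothetical labels: 1 = churned, 0 = stayed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]  # two churners missed (FN), one FP

# confusion_matrix returns [[TN, FP], [FN, TP]] for labels (0, 1).
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

recall = recall_score(y_true, y_pred)  # TP / (TP + FN)
print(tn, fp, fn, tp)  # 5 1 2 2
print(recall)          # 0.5 -> only half of the actual churners were caught
```

Minimizing FN is exactly maximizing this TP / (TP + FN) ratio, which is why recall is the metric tracked for every model below.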

  • Model Building: Neural Network with SGD Optimizer

    Model Loss: The loss function graph depicts the model's training and validation loss over 100 epochs.
  • Convergence: Both training and validation loss decrease sharply and begin to converge as the number of epochs increases, which suggests that the model is learning and improving its predictions over time.
  • Overfitting: There is no clear sign of overfitting, as the validation loss continues to decrease alongside the training loss without increasing or plateauing; this is typically indicative of a model that generalizes well to unseen data.
  • Stability: The model reaches stability relatively quickly, with both losses leveling out and showing minimal decrease past approximately 20 epochs. This indicates that additional training beyond this point does not significantly improve the model, and the chosen architecture and hyperparameters are effective for this task.

    Model Recall
  • Initial Spike: There is an initial spike in recall for both the training and validation datasets.
  • Training Recall: The recall on the training set rises sharply and then plateaus, indicating that the model has learned to identify most of the true positives in the training data. The relatively smooth line suggests that the model is stable after the initial learning phase.
  • Validation Recall: The recall on the validation set also increases, but with more variability compared to the training recall. Despite the fluctuations, it follows an upward trend, which indicates that the model is improving its ability to generalize to unseen data.

    Confusion Matrix for the Train Data
  • The confusion matrix on the train data indicates that the model has a high true negative rate (5005 correctly predicted non-churners), but it struggles with identifying churners, with a large number of false negatives (1136 customers who churned but were not identified by the model) and a very low true positive rate (168 correctly predicted churners). This suggests that while the model is conservative in predicting churn, it is missing a significant number of at-risk customers.

    Confusion Matrix for the Validation Data
  • The confusion matrix for the validation data shows that the model has a high true negative rate (78.31%) but a low true positive rate (1.88%), indicating it struggles to correctly identify customers who will churn. The model also has a relatively high false negative rate (18.50%), suggesting many customers who will churn are being incorrectly predicted as not churning, which could be critical in a churn prediction context. The false positive rate is low (1.31%), indicating the model is conservative in predicting churn.
  • Overall, the model may require adjustments to improve its recall.

    Model Performance Improvement: Neural Network with Adam Optimizer

    Model Loss: The model loss graph indicates that the neural network with the Adam optimizer is learning effectively, as shown by the steady decline of the training loss. However, the validation loss decreases initially but then fluctuates and generally trends upward, suggesting the model may be beginning to overfit to the training data, as it does not generalize as well to the validation set after around 20 epochs.

    Model Recall: The recall graph for the neural network model depicts the following:
  • Training Recall Improvement: The recall on the training data shows a consistent upward trend, indicating that the model is getting better at correctly identifying all relevant cases over the epochs.
  • Validation Recall Variability: The recall on the validation set is more volatile, with significant fluctuations.
  • Generalization Gap: There is a noticeable gap between the training and validation recall, with the training recall being higher. This gap could point to the model performing better on the training data than on the unseen validation data, which could be a sign of overfitting, especially since the validation recall does not reach the same level of performance.

  • Confusion Matrix for the Train Data
  • The confusion matrix for the training data shows that the model has a high true negative rate (76.19% of the total data) but a relatively lower true positive rate (13.81% of the total data), indicating it is better at identifying customers who will not churn than those who will.
  • The false negative rate is considerable (6.56%), which could be a concern, as these are customers who are predicted not to churn but actually do, potentially leading to missed opportunities for the bank to intervene.
  • The false positive rate is 3.44%, representing customers who are incorrectly predicted to churn, which could lead to unnecessary retention efforts.
  • Overall, the model may benefit from improvements to better capture the customers at risk of churn.

    Confusion Matrix for the Validation Data
  • The confusion matrix for the validation data shows that the model correctly identified 74.50% of the non-churners (true negatives) and 10.38% of the churners (true positives), but there are also false negatives (10.00%) and false positives (5.12%), indicating that while the model is fairly good at predicting non-churners, it struggles more with correctly identifying churners.
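The recall gain attributed to Adam above comes from its adaptive, momentum-based update rule, as opposed to SGD's fixed-size step along the raw gradient. A framework-free toy sketch of the two update rules on a simple quadratic (the learning rate, step count, and objective are illustrative assumptions, not the report's network):

```python
import numpy as np

# Toy comparison of SGD vs Adam on f(w) = w^2, whose gradient is 2w.
def sgd(w, lr=0.1, steps=50):
    """Plain SGD: step by a fixed fraction of the raw gradient."""
    for _ in range(steps):
        w -= lr * 2 * w
    return w

def adam(w, lr=0.1, steps=50, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: momentum (m) plus a per-parameter step-size scaling (v)."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = 2 * w
        m = b1 * m + (1 - b1) * g        # first-moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g    # second-moment estimate
        m_hat = m / (1 - b1 ** t)        # bias correction
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Both head toward the minimum at 0, but Adam's effective step size
# adapts to the gradient scale rather than shrinking with it.
print(sgd(5.0), adam(5.0))
```

On a real loss surface with features on different scales, that adaptive scaling is what typically lets Adam make faster progress than SGD, consistent with the recall jump reported in these slides.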

  • Model Performance Improvement: Neural Network with Adam Optimizer and Dropout

    Model Loss: The model loss graph with the Adam optimizer and dropout applied shows that both the training and validation loss decrease over time and converge well, suggesting that the model is learning and generalizing effectively. The introduction of dropout appears to have helped in managing overfitting, as the validation loss remains close to the training loss throughout the training process without any significant divergence. The relative stability of the validation loss also indicates that the model is likely to perform more consistently on unseen data.

    Model Recall
  • There is an increase in recall on the training data, indicating an improvement in the model's ability to correctly classify the positive class. The subsequent up-and-down pattern suggests that the model is experiencing some variance in its predictions as it continues to learn from the training data.
  • The validation recall converging with the training recall and following a similar pattern, albeit with more pronounced fluctuations, implies that the model is generalizing to the validation data. However, the greater amplitude of the peaks and valleys in the validation recall indicates less stability in the model's performance on unseen data.
  • There is not a big gap in performance between the training and validation data. However, since the training data continues to show less fluctuation in recall compared to the validation data, it may suggest that the model could be starting to overfit.

    Confusion Matrix for the Train Data
  • The model correctly predicted 'not exited' (True Negative) for 4934 cases, which is 77.09% of the total predictions.
  • It incorrectly predicted 'exited' (False Positive) for 162 cases, which is 2.53% of the total.
  • It incorrectly predicted 'not exited' (False Negative) for 621 cases, which is 9.70% of the total.
  • It correctly predicted 'exited' (True Positive) for 683 cases, which is 10.67% of the total.
  • This matrix suggests that the model is relatively conservative at predicting 'exited' and tends to predict 'not exited' more often. The relatively high number of False Negatives compared to True Positives indicates that the model's recall could be improved, as it is missing a significant number of actual 'exited' cases.

  • Confusion Matrix for the Validation Data
  • True Negative (TN): 1223 customers were correctly predicted as not exited, comprising 76.44% of predictions.
  • False Positive (FP): 51 customers were incorrectly predicted as exited, representing 3.19% of predictions.
  • False Negative (FN): 175 customers actually exited but were predicted as not exited, making up 10.94% of predictions.
  • True Positive (TP): 151 customers were correctly predicted to have exited, which is 9.44% of predictions.
  • This matrix indicates the model is better at predicting customers who will not exit than those who will. The model still misses a significant number of customers who are likely to churn; the model's precision and recall for predicting 'exited' customers could be improved.
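The dropout regularization used in this model can be illustrated without any deep-learning framework. This is a generic sketch of "inverted" dropout, the variant used by modern frameworks; the rate and array sizes are illustrative assumptions, not values from the report:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, rate=0.2, training=True):
    """Inverted dropout: during training, randomly zero a fraction `rate`
    of units and rescale the survivors by 1/(1-rate) so the expected
    activation is unchanged; at inference time, pass values through."""
    if not training:
        return activations
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

a = np.ones(10_000)
out = dropout(a, rate=0.2)

print((out == 0).mean())  # roughly 0.2 of units silenced
print(out.mean())         # roughly 1.0 thanks to the rescaling
```

Because a different random subset of units is silenced on every batch, the network cannot rely on any single co-adapted pathway, which is the mechanism behind the smaller train/validation gap observed above.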

  • Model Performance Improvement: Neural Network with Balanced Data (by applying SMOTE) and SGD Optimizer

    Model Loss: The model loss plot for the neural network trained on balanced data using the Synthetic Minority Over-sampling Technique (SMOTE) and the SGD optimizer shows both training and validation loss decreasing over epochs, which is a good sign of model learning. The convergence of training and validation loss indicates that the model is generalizing well to unseen data. There is no significant divergence between the two curves, which suggests that overfitting is not occurring. This is a positive outcome, especially given that balancing the dataset can sometimes lead to overfitting due to the synthetic nature of the oversampled data.

    Model Recall
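As a rough illustration of what SMOTE does to the minority class, here is a simplified, framework-free sketch of its interpolation idea. Real SMOTE (e.g. imbalanced-learn's implementation) samples among the k nearest minority neighbours; this toy version, with made-up points, uses only the single nearest neighbour:

```python
import numpy as np

rng = np.random.default_rng(42)

def smote_like(minority, n_new):
    """Sketch of SMOTE's core idea: each synthetic point is a random
    interpolation between a minority sample and a nearby minority
    neighbour (here simply its nearest neighbour)."""
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        x = minority[i]
        # nearest *other* minority sample by Euclidean distance
        d = np.linalg.norm(minority - x, axis=1)
        d[i] = np.inf
        neighbor = minority[np.argmin(d)]
        lam = rng.random()  # random position along the segment
        synthetic.append(x + lam * (neighbor - x))
    return np.array(synthetic)

# Three hypothetical minority-class (churner) points in feature space.
minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
new_pts = smote_like(minority, n_new=5)
print(new_pts.shape)  # (5, 2): five synthetic churners
```

Because every synthetic point lies between two real minority samples, the oversampled class fills out its region of feature space instead of merely duplicating rows, which is why SMOTE-balanced models can learn the churner class better.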
  • Training Recall: The recall on the training set starts high and shows a sharp drop, then gradually increases throughout the epochs. This pattern indicates an initial overfitting to the training set, which corrects itself as the model begins to generalize better with further training.
  • Validation Recall: The validation recall initially follows the training recall closely, indicating that the model is learning general patterns from the data.
  • Convergence and Stability: As the epochs increase, the recall on both sets shows stability, with the validation recall exhibiting a steady increase, albeit with some fluctuations.
  • Gap Between Training and Validation: There is a noticeable gap between the training and validation recall, which remains consistent. This gap does not widen, which is a positive sign, indicating that the model maintains its generalization ability as it learns.

    Confusion Matrix for the Train Data
  • True Negative (TN): 3741 cases were correctly predicted as non-churn (0), which is 36.71% of the predictions.
  • False Positive (FP): 1355 cases were incorrectly predicted as churn (1), which is 13.29% of the predictions.
  • False Negative (FN): 1367 cases were incorrectly predicted as non-churn (0), which is 13.41% of the predictions.
  • True Positive (TP): 3729 cases were correctly predicted as churn (1), which is 36.59% of the predictions.
  • This matrix indicates that the model has a balanced performance in terms of false positives and false negatives.

    Confusion Matrix for the Validation Data
  • The model on the validation data predicted non-churn correctly in 933 cases (58.31%).
  • It incorrectly predicted churn for 341 cases (21.31%).
  • It incorrectly predicted non-churn in 102 cases (6.38%).
  • It correctly identified churn in 224 cases (14.00%).
  • This matrix indicates the model has a higher accuracy for predicting non-churn than churn. The false positive rate is quite high, which may result in unnecessary retention efforts. The model's recall for the churn class could be improved, as it missed 102 actual churn cases.

    Model Performance Improvement: Neural Network with Balanced Data (by applying SMOTE) and Adam Optimizer

    Model Loss
  • Training Loss: The training loss shows a significant decrease initially, which levels off as epochs increase. This is indicative of the model quickly learning from the training data and then making incremental improvements as it begins to converge to a minimum loss value.
  • Validation Loss: The validation loss decreases alongside the training loss initially, suggesting that the model is generalizing well to the unseen data. However, the validation loss exhibits much more fluctuation, which can be indicative of the model's sensitivity to variation in the validation data set.
  • Divergence Between Losses: While both losses decrease, the gap between the training and validation loss suggests that the model is fitting the training data better than the validation data. Since the gap continues to widen, it could be an early sign of overfitting.

    Model Recall
  • Stable High Training Recall: The recall on the training set quickly rises to a high level and remains consistently high throughout the training process. This indicates that the model has a strong ability to correctly identify the relevant class in the training data.
  • Volatile Validation Recall: The validation recall is more volatile and substantially lower than the training recall. This suggests that while the model can recognize the positive class in the training data, it struggles to generalize this recognition to new, unseen data.
  • Significant Gap Between Training and Validation: The noticeable gap between training and validation recall, where the training recall is much higher, may indicate overfitting. The model is likely learning specific patterns in the training data that do not generalize well to the validation set.

    Confusion Matrix for the Train Data
  • True Negatives (TN): 4540 cases where the model correctly predicted the non-churn class, which makes up 44.54% of the total cases.
  • False Positives (FP): 556 cases where the model incorrectly predicted the churn class, which is 5.46% of the total cases.
  • False Negatives (FN): 987 cases where the model incorrectly predicted the non-churn class, accounting for 9.68% of the total cases.
  • True Positives (TP): 4109 cases where the model correctly predicted the churn class, representing 40.32% of the total cases.
  • This matrix indicates a good ability of the model to identify both classes, with a particularly strong performance in identifying true positives, which is crucial in churn prediction to enable timely intervention strategies.

    Confusion Matrix for the Validation Data
  • True Negative (TN): The model correctly predicted the 'non-churn' class for 1104 cases, accounting for 69.00% of the total predictions.
  • False Positive (FP): There were 170 cases where the model incorrectly predicted 'churn,' which is 10.62% of the predictions.
  • False Negative (FN): The model incorrectly labeled 130 cases as 'non-churn' when they actually did churn, comprising 8.12% of the predictions.
  • True Positive (TP): The model correctly identified 196 cases as 'churn,' which represents 12.25% of the predictions.
  • This matrix suggests that the model is relatively good at predicting non-churn but may be less effective at correctly identifying customers who will churn.

    Model Performance Improvement: Neural Network with Balanced Data (by applying SMOTE), Adam Optimizer, and Dropout

    Model Loss
  • Training and Validation Loss: The training loss continues to decline over the 100 epochs, which is a good sign that the model is learning from the training data. However, the validation loss decreases to a point and then fluctuates, which suggests that the model may not be generalizing as effectively to the validation set after a certain number of epochs.
  • Potential Overfitting: The divergence that begins to appear between the training and validation loss could be an early sign of overfitting. The model is performing better on the training data than the validation data, which might indicate it is starting to memorize the training data rather than learning generalizable patterns.

    Model Recall
  • Training Recall: The training recall begins at a higher value and shows a rapid increase, stabilizing at a high level of recall fairly early in the training process. This suggests that the model is effectively learning to identify the relevant patterns for the positive class in the training data.
  • Validation Recall: The validation recall starts lower than the training recall and increases with significant volatility. The pronounced fluctuations indicate that the model's ability to generalize to the validation data is less stable.
  • Convergence and Divergence: There is no convergence seen here; the training and validation recall do not come together, indicating a persistent gap. This gap suggests that while the model is learning well on the training data, it is not performing as well on the validation set, which could be a sign of overfitting.

    Confusion Matrix for the Train Data
  • True Negatives (TN): The model correctly predicted the majority of the non-churn class (0) with 4194 cases, equating to 41.15%.
  • False Positives (FP): There are 902 cases where the model incorrectly predicted churn, which is 8.85% of the cases.
  • False Negatives (FN): The model incorrectly predicted non-churn for 741 cases, which is 7.27%.
  • True Positives (TP): The model correctly predicted churn for a significant number of cases (4355), accounting for 42.73%.
  • This matrix suggests that the model is quite effective at identifying both churn and non-churn cases, with a particularly strong performance in detecting true positives (churn).

    Confusion Matrix for the Validation Data
  • True Negatives (TN): 1025 instances were correctly predicted as class 0, which is 64.06% of the total.
  • False Positives (FP): 249 instances were incorrectly predicted as class 1 when they were actually class 0, representing 15.56%.
  • False Negatives (FN): 96 instances were incorrectly predicted as class 0 when they were actually class 1, accounting for 6.00%.
  • True Positives (TP): 230 instances were correctly predicted as class 1, which is 14.37% of the total.
  • This suggests that the model is more conservative, tending to predict class 0.
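As a sanity check, the recall values for this final model can be recomputed directly from the confusion-matrix counts reported above, since recall = TP / (TP + FN):

```python
# Confusion-matrix counts for the SMOTE + Adam + Dropout model,
# as reported in the slides above.
train_tp, train_fn = 4355, 741   # train data
valid_tp, valid_fn = 230, 96     # validation data

train_recall = train_tp / (train_tp + train_fn)
valid_recall = valid_tp / (valid_tp + valid_fn)

print(round(train_recall, 6))  # 0.854592
print(round(valid_recall, 6))  # 0.705521
```

These two numbers match the recall figures quoted for this model in the performance summary, confirming the counts and the metric are internally consistent.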

    Model Performance Summary

    Recall comparison (train_metric_df vs valid_metric_df):

    Model                            Training recall   Validation recall   Difference
    NN with SGD                      0.128834          0.092025            0.036810
    NN with Adam                     0.677914          0.509202            0.168712
    NN with Adam & Dropout           0.523773          0.463190            0.060583
    NN with SMOTE & SGD              0.731750          0.687117            0.044634
    NN with SMOTE & Adam             0.806319          0.601227            0.205092
    NN with SMOTE, Adam & Dropout    0.854592          0.705521            0.149070

  • NN with SGD: Shows a relatively low recall difference between train and validation sets, indicating good generalization but lower overall recall scores, suggesting the model may not be as effective at identifying all positive cases.
  • NN with Adam: This model significantly improves recall on both sets compared to SGD, with a relatively large gap between training and validation performance, which might indicate some overfitting.
  • NN with Adam & Dropout: Adding dropout to the NN with Adam improves generalization (a smaller performance gap) and maintains a good recall rate, indicating better model robustness.
  • NN with SMOTE & SGD: The application of SMOTE with the SGD optimizer shows a high recall on both sets with a small performance gap, indicating effective generalization and a strong ability to identify positive cases.
  • NN with SMOTE & Adam: This model achieves a very high recall on the training set but shows a large drop on the validation set, suggesting potential overfitting despite the high recall capability.
  • NN with SMOTE, Adam & Dropout: This combination shows the highest recall on both training and validation, with a moderate performance gap. The use of dropout with SMOTE and Adam seems to provide a good balance of high recall and generalization.

    Model Performance Summary: Considering both recall and generalization, NN with SMOTE, Adam & Dropout stands out as the most balanced model. It achieves the highest recall while maintaining a reasonable gap between training and validation, indicating a good fit without significant overfitting.
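The comparison above can be reproduced as a small pandas table from the reported recall figures (recomputing the gap from the six-decimal values can differ from the slide's difference column in the last digit, a rounding artifact):

```python
import pandas as pd

# Recall figures as reported in the performance summary.
train = {"NN with SGD": 0.128834, "NN with Adam": 0.677914,
         "NN with Adam & Dropout": 0.523773, "NN with SMOTE & SGD": 0.731750,
         "NN with SMOTE & Adam": 0.806319,
         "NN with SMOTE, Adam & Dropout": 0.854592}
valid = {"NN with SGD": 0.092025, "NN with Adam": 0.509202,
         "NN with Adam & Dropout": 0.463190, "NN with SMOTE & SGD": 0.687117,
         "NN with SMOTE & Adam": 0.601227,
         "NN with SMOTE, Adam & Dropout": 0.705521}

comparison = pd.DataFrame({"train_recall": train, "valid_recall": valid})
comparison["gap"] = comparison["train_recall"] - comparison["valid_recall"]

# Ranking by validation recall surfaces the final model's advantage.
print(comparison.sort_values("valid_recall", ascending=False))
```

Sorting by `valid_recall` while eyeballing `gap` encodes exactly the selection logic used above: best generalized recall, with the gap as the overfitting check.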
    Confusion Matrix for the Test Data

  • True Negatives (TN): 1266 cases were correctly predicted as non-churn, comprising 63.30% of total cases.
  • False Positives (FP): 327 cases were incorrectly predicted as churn, which is 16.35% of the cases.
  • False Negatives (FN): 107 cases were incorrectly predicted as non-churn, accounting for 5.35% of the cases.
  • True Positives (TP): 300 cases were correctly predicted as churn, representing 15.00% of the cases.
  • This matrix shows that the model has a good rate of correct predictions for the non-churn class and a reasonable rate for the churn class. The false positive rate is a bit high, which could result in unnecessary retention efforts. However, the false negative rate is relatively low, meaning the model is quite good at catching most customers who are likely to churn.

    Actionable Insights and Recommendations

    After a thorough analysis of the exploratory data analysis (EDA) and the various neural network model configurations, the following actionable insights and recommendations can be provided.

    Actionable Insights:
  • Geographic Influence: There is a clear geographic pattern in customer churn, with Germany showing a higher churn rate. The bank should investigate regional service delivery and market competition to tailor strategies that address local customer needs.
  • Gender-Based Trends: The higher churn among female customers suggests potential gaps in service or product offerings that resonate with women. Tailored financial products or marketing strategies could be developed to better serve and retain female customers.
  • Engagement Levels: The link between customer activity and churn implies that engaging customers with regular, relevant interactions could be crucial in reducing churn. Programs to boost customer activity, such as personalized offers or financial advice, should be considered.
  • Age-Related Patterns: The older demographic is more prone to churn, possibly due to unmet service expectations or changing financial needs. The bank could introduce age-specific advisory services or retirement planning products to retain older customers.

    Recommendations:
  • Comprehensive Retention Strategy: Implement a data-driven retention strategy that integrates demographic, behavioral, and regional data to address the multifaceted nature of customer churn.
  • Model Deployment and Monitoring: The Neural Network with SMOTE, Adam Optimizer, and Dropout should be deployed for predicting churn. It is crucial to continuously monitor its performance over time, recalibrating the model if there is a shift in customer behavior patterns or market conditions.
  • Further Research and Development: Ongoing research into additional factors influencing churn should be conducted. Qualitative data, such as customer feedback and satisfaction surveys, could provide deeper insights into churn drivers and help refine the predictive models.
  • Cross-Functional Collaboration: Collaborate across marketing, product development, and customer service teams to implement the insights derived from the EDA and predictive models, ensuring that initiatives are aligned and synergistic in reducing churn.

  • APPENDIX: Data Background and Contents
  • CustomerId: Unique ID assigned to each customer
  • Surname: Last name of the customer
  • CreditScore: Defines the credit history of the customer
  • Geography: The customer's location
  • Gender: Gender of the customer
  • Age: Age of the customer
  • Tenure: Number of years for which the customer has been with the bank
  • NumOfProducts: Number of products that the customer has purchased through the bank
  • Balance: Account balance
  • HasCrCard: Categorical variable indicating whether the customer has a credit card or not
  • EstimatedSalary: Estimated salary
  • IsActiveMember: Categorical variable indicating whether the customer is an active member of the bank or not (active in the sense of using bank products regularly, making transactions, etc.)
  • Exited: Whether or not the customer left the bank within six months. It can take two values: 0 = No (customer did not leave the bank); 1 = Yes (customer left the bank)
