Unit 3: Stress of Data
Data Stress and Bias
In the hope of generating a churn model for my research, I will be acquiring customer data from an HMR system in South Africa. Not knowing the format of the data yet is very stressful, but I am hopeful that when a client leaves, the HMR company asks them to complete a survey on why they left.
My hope is that the data is predominantly quantitative, as quantitative data is, in my opinion, better suited to building a churn model.
A churn model is a predictive model that estimates how likely a customer is to leave. For each customer at any given time, it tells us how high the risk is of losing them in the future (Votava, 2021).
Machine learning works best with simple, structured data. Qualitative data could be used, but I would then need to transform it into variables a machine learning algorithm can understand. For instance, if the data contains only open written feedback, I would need to convert it into something like a 1-5 point scale for service. If a client wrote something along the lines of "horrible service will never work with them again," I would mark that as a 1 for service, and that seems a justified result. However, with a comment such as "The service was good," do I give a 3, 4 or 5? That choice is subjective, so it would introduce bias into my model and could therefore tarnish the finished research.
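To make the scoring problem concrete, here is a minimal Python sketch of how written feedback might be coded onto a 1-5 scale. The keyword table, the score ranges and the function names are all my own assumptions for illustration, not a validated coding scheme. Instead of forcing a single score, it records the range of scores a reasonable coder could defend, so that ambiguous comments are flagged and not silently guessed.

```python
import re

# Hypothetical coding scheme: each keyword maps to the range of 1-5 service
# scores a reasonable coder might assign. Words and ranges are illustrative only.
SCORE_RANGES = {
    "horrible": (1, 1),  # clearly the lowest score
    "good": (3, 5),      # could defensibly be a 3, 4 or 5
}

def plausible_scores(comment):
    """Return (low, high) bounds of defensible scores, or None if no keyword matches."""
    words = re.findall(r"[a-z]+", comment.lower())
    matches = [SCORE_RANGES[w] for w in words if w in SCORE_RANGES]
    if not matches:
        return None
    low = min(lo for lo, _ in matches)
    high = max(hi for _, hi in matches)
    return (low, high)

def ambiguity(comment):
    """Width of the plausible score range; 0 means the coding is unambiguous."""
    bounds = plausible_scores(comment)
    return None if bounds is None else bounds[1] - bounds[0]

comments = [
    "Horrible service will never work with them again.",
    "The service was good",
]
for c in comments:
    print(c, plausible_scores(c), ambiguity(c))
```

A comment with an ambiguity of 0 can be coded with confidence, while anything wider is a place where my own judgement would leak into the model. Those comments could be coded by a second person as well, or left out of the training data.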
If the data is unsuitable or not available, I may need to get permission from the HMR company and send a survey to customers myself. All of this is to be done in December.
References
VOTAVA, A. 2021. Churn prediction model [Online]. Towards Data Science: Towards Data Science. Available: https://towardsdatascience.com/churn-prediction-model-8a3f669cc760 [Accessed 2022].