Predicting probability of HELB loan defaulters

The Higher Education Loans Board (HELB) relies on loan repayments to fund the financial support it provides to needy Kenyan students. According to its 2013-2018 strategic plan, annual loan recovery nearly doubled from KES 2.5 billion in 2012 to KES 4.9 billion in 2018. However, 70,000 loanees, accounting for KES 6.8 billion, are in default.
This is the problem Dr. Collins Odhiambo and Dr. Lucy Muthoni, together with Pauline Nyathira Kamau (whose master’s thesis they co-supervised), set out to tackle. Through a quantitative analysis of the data supplied on HELB loan application forms, they sought to estimate the probability that an applicant will default.
The main purpose of their study was to identify the major factors that influence student loan default through a model that utilises structured outliers. “A bank, for instance, will evaluate the risks by looking at the applicant’s employment, credit risks, credit history, and property and then determine the likelihood of defaulting. When dealing with students, however, the dynamics are different as it is not easy to immediately determine their risks,” Dr. Odhiambo, a lecturer at Strathmore Institute of Mathematical Sciences, explains.
“HELB does not necessarily look at factors that influence or predict the risks of defaulting. So long as one is a student or a permanent employee, one can access the loan. However, some of the loanees end up defaulting. Typically, in credit risk, the nature of the data is different because some applicants default while others do not. The data usually includes outliers. Those at the extreme get the loan, utilise it and disappear; others repay for a few months and then disappear.”
The trio revisited the theoretical distributions used in the analysis of loan defaulters, particularly when outliers are significant, in order to identify the best model for their purposes. The main models they evaluated were the log-logistic, two-parameter Weibull, logistic, log-normal and Burr distributions.
They then ran simulations and generated prototypes to test how robust each model was in different settings. This analysis identified the log-logistic model as the best performer, and they applied it to real data from HELB. They also provided insights into the potential and limitations of the other models.
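For readers curious about what such a comparison can look like in practice, the following is a minimal sketch in Python and not the authors' actual procedure. It simulates times to default from a log-logistic distribution, fits two of the candidate models (log-logistic and log-normal) and scores them with the Akaike information criterion (AIC), where a lower value indicates a better fit. The sample size, the scale of 24 (read here as months) and the shape of 2 are hypothetical values chosen for illustration. The log-logistic fit uses simple moment matching on the log-times rather than maximum likelihood, and the sketch ignores the treatment of outliers that the study emphasises.

```python
import math
import random
import statistics


def simulate_loglogistic(n, scale, shape, seed=0):
    """Draw n positive times by inverting the log-logistic CDF."""
    rng = random.Random(seed)
    times = []
    for _ in range(n):
        # Keep u strictly inside (0, 1) so the odds u / (1 - u) stay finite and positive.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        times.append(scale * (u / (1.0 - u)) ** (1.0 / shape))
    return times


def fit_and_score(times):
    """Fit log-normal and log-logistic models to the times and return AIC values."""
    logs = [math.log(t) for t in times]
    mu = statistics.fmean(logs)
    sd = statistics.pstdev(logs)

    # Log-normal: the mean and standard deviation of the log-times
    # are the maximum-likelihood estimates.
    ll_lognormal = sum(
        -x - math.log(sd) - 0.5 * math.log(2.0 * math.pi)
        - (x - mu) ** 2 / (2.0 * sd ** 2)
        for x in logs
    )

    # Log-logistic: the log-times are logistic(mu, s) with variance (s * pi)^2 / 3,
    # so s is matched to the sample variance (a moment estimate, not the MLE).
    s = math.sqrt(3.0) * sd / math.pi
    ll_loglogistic = 0.0
    for x in logs:
        z = abs(x - mu) / s  # the logistic density is symmetric in z
        ll_loglogistic += -x - math.log(s) - z - 2.0 * math.log1p(math.exp(-z))

    k = 2  # both models have two parameters
    return {
        "loglogistic_scale": math.exp(mu),   # median time to default
        "loglogistic_shape": 1.0 / s,
        "aic_lognormal": 2 * k - 2 * ll_lognormal,
        "aic_loglogistic": 2 * k - 2 * ll_loglogistic,
    }


# Hypothetical settings: 5,000 simulated loanees, median default time of 24 months.
result = fit_and_score(simulate_loglogistic(5000, scale=24.0, shape=2.0, seed=1))
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

A fuller analysis along the lines described in the article would fit all five candidate distributions by maximum likelihood and repeat the simulation under different outlier settings.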
The results of this study were published in the International Journal of Statistical Distributions and Applications in 2019.
Dr. Odhiambo holds a PhD in Applied Statistics from Strathmore University. His thesis was titled “Smooth Goodness-of-Fit for composite hypothesis with recurrent events models.” He holds a Master of Science in Biometry from the University of Nairobi and a Bachelor of Science (Hons) in Applied Statistics from Maseno University, Kenya.