Research: Smooth Tests of Goodness-of-Fit – Modelling HIV Retention

Motivated by HIV retention, new research presents an application of the smooth test of goodness-of-fit, under both complete and right-censored data, to the time to first occurrence of loss to follow-up.

The smooth test applied is an extension of Neyman’s smooth test to probability distributions and to a class of hazard functions for the initial distribution of a recurrent failure-time event. The two-parameter Weibull distribution was fitted to HIV retention data and the fit was assessed using a smooth test of goodness-of-fit. The baseline hazard function of the time to first loss to follow-up was also estimated using a Block, Borges and Savits (BBS) minimal repair model. Extensive Monte Carlo simulations were conducted to compare the power of the smooth test with the power of empirical goodness-of-fit tests at various percentages of censoring. Results show that the smooth test performed well under both complete data and right-censoring.
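To make the fitting step concrete, here is a minimal sketch of maximum-likelihood estimation for a two-parameter Weibull distribution with right-censored observations. It is not the study's code. The function name `fit_weibull_censored`, the bisection bounds and the simulated data are all assumptions made for illustration. The shape is found by solving the profile score equation, after which the scale has a closed form.

```python
import math
import random

def fit_weibull_censored(times, events, lo=0.01, hi=50.0, tol=1e-10):
    """Maximum-likelihood shape and scale for right-censored Weibull data.

    times  : observed times (event or censoring time), all > 0
    events : 1 if loss to follow-up was observed, 0 if right-censored
    (hypothetical helper; bounds lo/hi for the shape are assumptions)
    """
    d = sum(events)  # number of observed (uncensored) events
    mean_log_event = sum(math.log(t) for t, e in zip(times, events) if e) / d

    def profile_score(k):
        # Derivative of the profile log-likelihood in the shape k, divided by d.
        s0 = sum(t ** k for t in times)
        s1 = sum(t ** k * math.log(t) for t in times)
        return 1.0 / k + mean_log_event - s1 / s0

    # profile_score is decreasing in k, so bisection finds its single root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if profile_score(mid) > 0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    # Given the shape, the scale estimate has a closed form.
    scale = (sum(t ** shape for t in times) / d) ** (1.0 / shape)
    return shape, scale

# Illustration on simulated (not study) data: true shape 1.5, scale 2.0,
# with administrative censoring at time 3.0.
rng = random.Random(42)
raw = [rng.weibullvariate(2.0, 1.5) for _ in range(2000)]  # (scale, shape)
times = [min(t, 3.0) for t in raw]
events = [1 if t <= 3.0 else 0 for t in raw]
shape_hat, scale_hat = fit_weibull_censored(times, events)
print(f"shape = {shape_hat:.3f}, scale = {scale_hat:.3f}")
```

A goodness-of-fit test would then be applied to the fitted model. Because the parameters are estimated rather than known, the null distribution of any such test statistic has to account for that estimation.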

The research, by Dr. Collins Odhiambo, a lecturer at the Strathmore Institute of Mathematical Sciences, Strathmore University, extends Neyman’s work to a practical HIV setting. Although several authors have studied the theoretical development of Neyman’s smooth tests, the main contribution of this study is modelling loss to follow-up in HIV retention, an issue that has not received its due share of coverage in the literature.

The research extends the methodology proposed by Rayner et al. (2009), Pena (1998a, b) and Kraus (2007a) to the HIV retention setting. While the modelling is approached using stochastic variates, normalized orthogonal polynomials play a key role as the basis of the test statistics.
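For readers unfamiliar with the role of the orthogonal polynomials, the classical form of Neyman’s smooth test for complete data and a fully specified null distribution can be written as follows. This is the textbook version, given here as background. The censored-data and hazard-based extensions used in the study differ in detail. The symbols $F_0$, $U_i$, $\pi_j$, $\theta_j$ and $k$ are notation introduced for this illustration.

```latex
% Probability-integral transform: U_i = F_0(T_i), with F_0 the hypothesized
% (for example, Weibull) distribution function. Under H_0, U_i is uniform on (0,1).

% Smooth alternative of order k, built on orthonormal polynomials \pi_j on [0,1]:
g_k(u;\theta) = C(\theta)\,
  \exp\!\Big\{ \sum_{j=1}^{k} \theta_j\, \pi_j(u) \Big\},
  \qquad 0 < u < 1,
\qquad
\int_0^1 \pi_i(u)\,\pi_j(u)\,du = \delta_{ij}.

% Testing H_0 : \theta = 0 with the score statistic:
\Psi_k^2 = \sum_{j=1}^{k}
  \Big\{ \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \pi_j(U_i) \Big\}^{2}
  \;\xrightarrow{\;d\;}\; \chi^2_k
  \quad \text{under } H_0 .
```

Each component of the sum responds to a different kind of departure from the null (roughly location, spread, skewness and so on for $j = 1, 2, 3, \dots$), which is why the choice of order $k$ matters and motivates data-driven versions of the test.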


Why HIV Retention?

HIV/AIDS has consistently been a major challenge in Kenya. The national prevalence is currently estimated at 6%, and at least 1.6 million Kenyans are living with HIV (PLHIV), of whom at least 800,000 are on antiretroviral therapy (ART).

In practice, the quality of ART services is measured by the rate of retention of PLHIV on ART. Programme data and national surveys show that the percentage of PLHIV who remain on ART after initiation declines over time: retention is highest in the first 12 months (about 92%) and falls to about 70% by month 60. These data point to a critical need for robust measures to reduce loss to follow-up (LTFU) among PLHIV who are on ART.

With the advent of the Joint United Nations Programme on HIV/AIDS (UNAIDS) targets in 2013, the focus has turned to interventions that speed up the elimination of HIV/AIDS at the global, regional, country, province, district and city levels. The strategy, popularly known as the 90:90:90 targets, is that by 2020, 90% of people living with HIV know their HIV status, 90% of people who know their status are receiving ART, and 90% of people on HIV treatment have a suppressed viral load, so that their immune system remains strong and the likelihood of their infection being passed to others is greatly reduced. This strategy is currently being implemented in Kenya, and this research focuses on a statistical innovation that supports one of its pillars: the third 90, i.e. viral suppression. Viral suppression is achievable by retaining patients on ART over the long term. There are potential benefits whenever a PLHIV’s viral load is reduced to an undetectable level.


Other Areas of Practical Application

The research also looked at the performance of smooth tests in two-sample problems in cancer survival studies. Most cancer studies that focus on identifying survival risk factors use models that assume proportional hazards. With over 100 different types of cancer known today, targeted research in this area is particularly important for the ten most prevalent cancers (lung and bronchus, prostate, breast, colon and rectum, pancreas, liver and intrahepatic bile duct, leukemia, urinary bladder, non-Hodgkin lymphoma, and brain and other nervous system).

Despite decades of cancer research, overall prognosis, recurrence and survival rates still attract huge research interest. Much of the research is specific to cancer type and benefits patients through advanced technologies and cancer treatment protocols. Cancer is a major public health problem and is the second leading cause of death in the United States. Prostate, lung and bronchus, and colorectal cancers account for 44% of all cases in men, with prostate cancer alone accounting for 20% of new diagnoses. For women, the three most commonly diagnosed cancers are breast, lung and bronchus, and colorectal, representing 50% of all cases. Breast cancer alone accounts for 29% of all new cancer diagnoses in women.

The smooth test applied in this context seeks to validate the score statistics under a variety of practical settings.


Key Findings

The research explored Neyman’s smooth test idea and its data-driven versions. The results demonstrate how the test can be extended to cover applications in survival analysis and recurrent events. The study was motivated by the importance of retaining patients on ART. It revisited discussions around retaining patients over the long term to allow provision of long-term ART, tracking of WHO staging and immunosuppression profiles, and evaluation of emerging medication toxicities. Because of significant drop-out, patients may not realize the benefits of ART. More innovation is therefore required to scale up ART further and improve retention in care, and understanding the underlying pattern and distribution of LTFU is necessary for making sound interventions that maintain adherence to ART.

Another motivation for this study is the need to validate the smooth test under different practical settings in a variety of cancer studies, as captured by the two-sample problem. Given the many variations in cancer studies, this research does not aim to provide an exhaustive assessment of the performance of smooth tests for proportionality across all types of cancer. Instead, it aims to statistically validate their performance in eight selected practical cancer settings. Ultimately, the author hopes that the issues and features he comments on will result in higher overall standards and quality of oncological research and limit the risk of using invalid models.



Because data on missed clinic visits only began to be collected after 2007, they are available for only a small sample of the population and must therefore be interpreted with caution. In addition, some individuals lost to follow-up may have died, moved or transferred to another health-care provider without informing the clinic, which would have led to some outcomes being misclassified. However, individuals were considered lost to follow-up only after the outreach programme had attempted to locate them. The duration between the point of first LTFU and the retracing of patients back to care may be considerable. This duration is a random time and requires further investigation.



The smooth test applied in this study is data-driven and can be adjusted to detect particular alternatives. Smooth tests of this kind are characterized by the following properties:

  • The test statistics are asymptotically distribution-free.
  • The asymptotic distribution of the test statistics can be determined under both the null and alternative hypotheses.
  • Nuisance parameter estimation can be performed without changing the test statistic and, since the tests rely on maximum likelihood techniques, they asymptotically meet the conditions of the Neyman-Pearson lemma against any simple alternative hypothesis.

The main results are:

  • The smooth test of GOF performs better than empirical GOF tests when fitting a parametric distribution to complete time-to-event data.
  • Results highlight the need to better understand LTFU among patients initiated on ART.
  • Further studies are required that address the fitting of hazard functions in the presence of censored data and the determinants of risk of LTFU.
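The kind of Monte Carlo power comparison reported in the study can be sketched on a small scale as follows. This is an illustrative setup under simplifying assumptions, not the study's design. It uses complete (uncensored) data, a fully specified Weibull null, an order-4 smooth statistic built on orthonormal shifted Legendre polynomials, and the Kolmogorov-Smirnov statistic as the empirical competitor. Asymptotic 5% critical values are used for both tests. All function names and parameter values are hypothetical.

```python
import math
import random

def legendre_orthonormal(j, u):
    """Orthonormal shifted Legendre polynomial of degree j >= 1 on [0, 1]."""
    x = 2.0 * u - 1.0
    p_prev, p_curr = 1.0, x
    for m in range(1, j):  # Bonnet recurrence for P_{m+1}(x)
        p_prev, p_curr = p_curr, ((2 * m + 1) * x * p_curr - m * p_prev) / (m + 1)
    return math.sqrt(2 * j + 1) * p_curr

def smooth_stat(u, order=4):
    """Neyman smooth statistic: sum of squared normalized polynomial scores."""
    n = len(u)
    return sum(
        (sum(legendre_orthonormal(j, ui) for ui in u) / math.sqrt(n)) ** 2
        for j in range(1, order + 1)
    )

def ks_stat(u):
    """sqrt(n) times the Kolmogorov-Smirnov distance from the uniform CDF."""
    n = len(u)
    s = sorted(u)
    d = max(max((i + 1) / n - s[i], s[i] - i / n) for i in range(n))
    return math.sqrt(n) * d

def estimate_power(true_shape, null_shape=1.0, scale=1.0, n=50, reps=400, seed=1):
    """Rejection rates (smooth, KS) at the nominal 5% level."""
    rng = random.Random(seed)
    reject_smooth = reject_ks = 0
    for _ in range(reps):
        times = [rng.weibullvariate(scale, true_shape) for _ in range(n)]
        # Probability-integral transform under the hypothesized Weibull null.
        u = [1.0 - math.exp(-((t / scale) ** null_shape)) for t in times]
        reject_smooth += int(smooth_stat(u) > 9.488)  # chi-square(4) 95% quantile
        reject_ks += int(ks_stat(u) > 1.358)          # asymptotic KS 95% quantile
    return reject_smooth / reps, reject_ks / reps

# Size check (data generated under the null) and power under one alternative.
print("null        :", estimate_power(true_shape=1.0))
print("alternative :", estimate_power(true_shape=3.0))
```

Extending this sketch towards the study's setting would mean adding a censoring mechanism, estimating the Weibull parameters from each simulated sample, and calibrating critical values accordingly.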


The research generated three manuscripts, all of which have been accepted. Two have been published in the International Journal of Statistics in Medical Research: “Validation of the Smooth Test of Goodness-of-Fit for Proportional Hazards in Cancer Survival Studies” and “A Smooth Test of Goodness-of-Fit for the Weibull Distribution: An Application to an HIV Retention Data”.