Primary biliary cholangitis (PBC), previously know as primary biliary cirrhosis, is a chronic inflammatory autoimmune disease of the liver. Characterized by clinical homogeneity and an overwhelming female predominance, the disease progresses to cirrhosis and finally to liver failure over the span of 10-20 years [1]. The prevalence being less than 1/2000, PBC is predominant in postmenopausal females. In fact, it has an estimated female-to-male ratio of 10 to 1 [2]. In the span of ten years between 1974 and 1984, the Mayo Clinic conducted a double-blinded randomized trial containing survival information for 312 patients. The effect of a drug, D-penicillamine (DPCA), on PBC was compared with a placebo. Liver transplantation is considered—to date—as the only effective therapy for PBC. As liver transplant success rate has increased and is now commonly used to treat PBC, the Mayo Clinic data is one of the last containing natural survival data for the disease [3]. In this review, I provide a survival analysis of the Mayo Clinic data, first by evaluating the effect of DPCA and thereafter by using demographic, clinical, biochemical and histologic measurements to create a survival regression model.
A total of 312 patients participated in the randomized trial. An additional 106 cases did not participate, but had basic measurements recorded. Therefore, the complete dataset contains survival information for 418 patients. The observed time variable is considered as the number of days between registration and the earlier of death, transplantation, or study analysis time in July, 1986. The randomized trial data contains the following demographic, clinical, biochemical and histologic measurements:
The parameters marked with a star are also present for the additional 106 patients that did not participate in the randomized trial.
The Kaplan-Meier estimator, often called product limit estimator, is calculated as follows. Survival times \(t_{(1)}, ..., t_{(n)}\) are ordered, \(r_j\) is the number of individuals at risk just before \(t_{(j)}\), and \(d_j\) is the number of individuals experiencing the event at time \(t_{(j)}\). The estimator relies on the following assumptions:
\[\begin{equation} \hat S (t) = \prod_{j:t_{(j)} \leq t} \left( 1 - \frac{d_j}{r_j} \right) \end{equation}\]
The log-rank test, also called Mantel-Cox test is used to carry out a formal hypothesis test to compare two survival distributions. The test is nonparametric, widely used in clinical trials and relies on the same assumptions as the Kaplan-Meier estimator described above [6]. The expected number of deaths for each unique death time in the data is computed and subsequently compared to the observed number of deaths using a Chi-squared test.
In this analysis, the Kaplan-Meier survival function is estimated for both treated and control patients. The patients having undergone liver transplantation are excluded from this estimation. The survival functions are then tested using the log-rank hypothesis test to evaluate the effectiveness of DCPA.
The above mentioned methods—Kaplan-Meier estimator and log-rank test—are for univariate analysis. In contrast, the Cox Proportional-Hazards (Coxph) model [7] is a multivariate analysis tool that creates a regression model for survival time using both quantitative and categorical predictor variables. The model can be useful to asses the effect of potential risk factors on survival time. It can be written as follows, where \(t\) is the survival time, \(h(t)\) is the hazard function determined by \(n\) covariates \((x_1, x_2, ..., x_n)\)—each carrying a coefficient \(\beta_i \forall i \in [1,n]\), and \(h_0\) is the baseline hazard.
\[\begin{equation} h(t) = h_0 (t) \cdot \text{exp}(\beta_1 x_1 + \beta_2 x_2 + ... + \beta_n x_n) \end{equation}\]
In this report, a Cox proportional-hazards model is computed in order to find a predictor function for hazard parameters with regards to PBC.
To extract the main characteristics of the dataset and get a visual representation of the different parameters measured, an exploratory data analysis is performed. The first step is summarizing the data. As can be observed below, the dataset consists of \(89.5\%\) females. Out of 418 observed patients, 25 received liver transplant and 161 passed away during the study. Furthermore, \(71.5\%\) of the studied patients’ PBC was qualified as stage III or IV. The discrete clinical measurement are additionnally summarized and grouped by stage.
Table 1: Summary of demographic variables.
## DAYS AGE SEX DRUG STAGE
## Min. : 41 Min. :26.28 Male : 44 NA :106 NA : 6
## 1st Qu.:1093 1st Qu.:42.83 Female:374 DPCA :158 I : 21
## Median :1730 Median :51.00 Placebo:154 II : 92
## Mean :1918 Mean :50.74 III:155
## 3rd Qu.:2614 3rd Qu.:58.24 IV :144
## Max. :4795 Max. :78.44
## STATUS
## cens.:232
## tx : 25
## death:161
##
##
##
Table 2: Summary of discrete clinical, biochemical and histologic measurements.
| Stage | NA, N = 61 | I, N = 211 | II, N = 921 | III, N = 1551 | IV, N = 1441 |
|---|---|---|---|---|---|
| ASCITES | |||||
| NA | 6 (100%) | 5 (24%) | 25 (27%) | 35 (23%) | 35 (24%) |
| Absence | 0 (0%) | 16 (76%) | 65 (71%) | 119 (77%) | 88 (61%) |
| Presence | 0 (0%) | 0 (0%) | 2 (2.2%) | 1 (0.6%) | 21 (15%) |
| HEPATOM | |||||
| NA | 6 (100%) | 5 (24%) | 25 (27%) | 35 (23%) | 35 (24%) |
| Absence | 0 (0%) | 16 (76%) | 48 (52%) | 67 (43%) | 21 (15%) |
| Presence | 0 (0%) | 0 (0%) | 19 (21%) | 53 (34%) | 88 (61%) |
| SPIDERS | |||||
| NA | 6 (100%) | 5 (24%) | 25 (27%) | 35 (23%) | 35 (24%) |
| Absence | 0 (0%) | 15 (71%) | 58 (63%) | 90 (58%) | 59 (41%) |
| Presence | 0 (0%) | 1 (4.8%) | 9 (9.8%) | 30 (19%) | 50 (35%) |
| EDEMA | |||||
| Absence | 6 (100%) | 20 (95%) | 86 (93%) | 138 (89%) | 104 (72%) |
| Resolved | 0 (0%) | 1 (4.8%) | 5 (5.4%) | 14 (9.0%) | 24 (17%) |
| Presence | 0 (0%) | 0 (0%) | 1 (1.1%) | 3 (1.9%) | 16 (11%) |
|
1
Statistics presented: n (%)
|
|||||
Next, the distribution of age is visualized and grouped by treatment, sex, stage and status. As can be observed in Figure C, severe stage of the disease is distributed more prominently for higher ages. It can also be observed in Figure D that death is more prominent for older patients.
Age distribution grouped by treatment (A), sex (B), stage (C), and status (D).
The distributions of biochemical and histological measurements are visualized.
Combined histograms and boxplots of the following biochemical and histologic measurements: (A) serum cholesterol, (B) albumin, (C) copper, (D) alkaline phosphatase, (E) serum glutamic-oxaloacetic transaminase, (F) triglicerides, (G) platelets, and (H) prothrombin time.
The next step consists of plotting some histological measurements while comparing distributions of censored versus dead patients.
Prothrombin time (A), Serum bilirubin (B), Platelet (C), and Serum albumin (D) distributions grouped by survival status.
The Kaplan-Meier survival function estimation is shown in Figure . It can be observed that treatment does not seem to have a significant effect on survival.
Survival Probability of treated (DCPA) vs. control (Placebo).
The log-rank hypothesis test below validates the visual observation. This results allows us to merge the measurements from treated and non-treated patients in order to define a proportional-hazards model. Such a model can be extremely useful to evaluate the risk factors of different parameters with regards to PBC.
## Call:
## survdiff(formula = sPBCwotx ~ DRUG, data = PBCwotx)
##
## N Observed Expected (O-E)^2/E (O-E)^2/V
## DRUG= 1 148 65 64.4 0.00514 0.00858
## DRUG= 2 245 96 96.6 0.00343 0.00858
##
## Chisq= 0 on 1 degrees of freedom, p= 0.9
Concerning the Cox proportional-hazards model, a first model is calculated using all available parameters. The model is subsequently modified by iteratively removing the variable with the lowest Wald statistic. An intermediate model is obtained when all remaining variables have Wald statistic higher than \(2.5\). The summaries of first and final steps of this procedure can be observed in Table 3.
Table 3: Summary of first and intermediate Cox proportional-hazards models. The model using all available parameters is summarized in the 3 left-first columns and the intermediate model with selected parameters is summarized in the 3 right-most columns.
| Characteristic | First step | Intermediate model | ||||
|---|---|---|---|---|---|---|
| HR1 | 95% CI1 | p-value | HR1 | 95% CI1 | p-value | |
| AGE | 1.04 | 1.01, 1.06 | 0.001 | 1.03 | 1.01, 1.05 | 0.002 |
| BILI | 1.07 | 1.02, 1.12 | 0.005 | 1.11 | 1.07, 1.14 | <0.001 |
| CHOL | 1.00 | 1.00, 1.00 | 0.3 | |||
| ALBUMIN | 0.45 | 0.25, 0.81 | 0.007 | 0.34 | 0.21, 0.54 | <0.001 |
| COPPER | 1.00 | 1.00, 1.01 | 0.005 | 1.00 | 1.00, 1.01 | <0.001 |
| PHOS | 1.00 | 1.00, 1.00 | 0.8 | |||
| SGOT | 1.00 | 1.00, 1.01 | 0.11 | |||
| TRIG | 1.00 | 1.00, 1.00 | >0.9 | |||
| PLATELET | 1.00 | 1.00, 1.00 | 0.6 | |||
| PROTIME | 1.26 | 1.05, 1.51 | 0.013 | 1.29 | 1.11, 1.52 | 0.001 |
| EDEMA | ||||||
| Absence | — | — | — | — | ||
| Resolved | 1.26 | 0.68, 2.37 | 0.5 | 1.10 | 0.63, 1.93 | 0.7 |
| Presence | 2.91 | 1.32, 6.40 | 0.008 | 2.55 | 1.37, 4.74 | 0.003 |
| HEPATOM | ||||||
| Absence | 0.68 | 0.43, 1.07 | 0.10 | |||
| Presence | — | — | ||||
| SPIDERS | ||||||
| Absence | 0.86 | 0.53, 1.39 | 0.5 | |||
| Presence | — | — | ||||
| ASCITES | ||||||
| Absence | 0.86 | 0.40, 1.86 | 0.7 | |||
| Presence | — | — | ||||
|
1
HR = Hazard Ratio, CI = Confidence Interval
|
||||||
As can be seen in Figure , serum bilirubin, albumin, urine copper and prothrombin time have a large amount of outliers. Moreover, the histograms show that for low values of these variables, the impact on survival is high. For these reasons, the next step consists of transforming serum bilirubin, albumin, urine copper and prothrombin time to their logarithmic values. Furthermore, the log-transformation will make these variables more normally distributed, which will increase the power of the model.
The last step consists of making \(log(\text{PROTIME})\) interact with time. In fact, residual testing (c.f. Table 5 and 6) showed that there was a strong correlation between prothrombin time and time. This result was maintained when using the log-transformation. Adding an interaction with time is a common approach to making sure the Cox model assumptions remain valid.
The final model is summarized in Figure . It has a concordance of \(0.852\) and a Wald test of \(201.2\) on 7 degrees of freedom. The model indicates that the risk factors for PBC are the following:
It also indicates that albumin seems to decrease risk. Another observation is that age seems to decrease the risk (\(\text{HR}_{\text{age}} = 0.93\)), which is not expected.
Finally, martingale residuals were tested to assess the model and visualized in Figure . The martingale residuals indicate that the cox proportional-hazards assumptions are valid for this model.
Hazard ratio of Cox Proportional Hazards model.
Table 4: Proportional hazards assumption test for the intermediate Cox regression model fit.
## chisq df p
## AGE 0.265 1 0.60652
## BILI 12.628 1 0.00038
## ALBUMIN 0.429 1 0.51259
## COPPER 0.404 1 0.52518
## PROTIME 4.429 1 0.03533
## EDEMA 4.386 2 0.11157
## GLOBAL 22.926 7 0.00176
Table 5: Proportional hazards assumption test for the final Cox regression model fit.
## chisq df p
## EDEMA 5.4168 2 0.067
## log(ALBUMIN) 0.2065 1 0.650
## log(BILI) 0.8992 1 0.343
## AGE 0.0277 1 0.868
## log(COPPER) 1.2035 1 0.273
## AGE:log(PROTIME) 0.3635 1 0.547
## GLOBAL 14.5331 7 0.042
Martingale residuals of the final model.
[1] R. Poupon, “Primary biliary cirrhosis: A 2010 update,” Journal of Hepatology, vol. 52, no. 5, pp. 745–758, May 2010.
[2] G. M. Hirschfield and M. E. Gershwin, “The Immunobiology and Pathophysiology of Primary Biliary Cirrhosis,” Annual Review of Pathology: Mechanisms of Disease, vol. 8, no. 1, pp. 303–330, Jan. 2013.
[3] T. R. Fleming and D. P. Harrington, Counting Processes and Survival Analysis: Fleming/Counting. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2005.
[4] J. Ludwig, E. R. Dickson, and G. S. A. McDonald, “Staging of chronic nonsuppurative destructive cholangitis (syndrome of primary biliary cirrhosis),” Virchows Archiv A Pathological Anatomy and Histology, vol. 379, no. 2, pp. 103–112, 1978.
[5] B. Efron, “Logistic Regression, Survival Analysis, and the Kaplan-Meier Curve,” Journal of the American Statistical Association, vol. 83, no. 402, pp. 414–425, Jun. 1988.
[6] J. M. Bland and D. G. Altman, “The logrank test,” BMJ, vol. 328, no. 7447, p. 1073, May 2004.
[7] D. R. Cox, “Regression Models and Life-Tables,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 34, no. 2, pp. 187–220, 1972.