Funding: This study was supported by the National Research Foundation (NRF) of Korea, Korea University, and the US National Institutes of Health (Grants RS-2024-00340298, K2201231, 2022M3J6A1063595, 2022R1A2C1008514, R21DE031879, and R01DE031134).
ABSTRACT
This paper introduces a novel approach to estimating censored quantile regression using inverse probability of censoring weighted (IPCW) methodology, specifically tailored for data sets featuring partially interval-censored data. Such data sets, often encountered in HIV/AIDS and cancer biomedical research, may include doubly censored (DC) and partly interval-censored (PIC) endpoints. DC responses involve either left-censoring or right-censoring alongside some exact failure time observations, while PIC responses are subject to interval-censoring. Despite the existence of complex estimating techniques for interval-censored quantile regression, we propose a simple and intuitive IPCW-based method, easily implementable by assigning suitable inverse-probability weights to subjects with exact failure time observations. The resulting estimator exhibits asymptotic properties, such as uniform consistency and weak convergence, and we explore an augmented-IPCW (AIPCW) approach to enhance efficiency. In addition, our method can be adapted for multivariate partially interval-censored data. Simulation studies demonstrate the new procedure's strong finite-sample performance. We illustrate the practical application of our approach through an analysis of progression-free survival endpoints in a phase III clinical trial focusing on metastatic colorectal cancer.
1 Introduction
Partially interval-censored data arise in a variety of medical registries and biomedical studies, including HIV/AIDS and cancer trials, where failure times are precisely observed for some patients but interval-censored for others (Sun 2007; Bogaerts, Komárek, and Lesaffre 2017). Of particular interest in this paper, within the context of the interval-censored setup, are doubly censored (DC) data and partly interval-censored (PIC) data. In addition to a specific number of exact observations (failures/events), DC data are characterized by either left-censoring or right-censoring, while PIC data entail additional interval-censoring. In cases with DC endpoints, the exact event status can only be determined when measurements fall within a specific range. For instance, in HIV/AIDS treatment trials, the efficacy of antiretroviral therapy is often assessed using HIV-1 RNA levels, which are deemed reliable only within a certain measurement limit. Measurements outside of this range are treated as censored, resulting in a DC structure, where the censoring mechanism is administrative or possibly random. Conversely, PIC endpoints may arise when the onset of AIDS or the time to progression-free survival (PFS) in cancer studies can be precisely observed in some patients, while others experience interval-censoring due to periodic hospital visits (Gao, Zeng, and Lin 2017; Pan, Cai, and Wang 2020). In the absence of exact failure/event times, DC and PIC data are reduced to “case-1” and “case-2” interval-censored data, respectively, which have been extensively studied in the survival analysis literature. It is worth noting that the DC sampling scheme under discussion here differs from doubly-interval-censoring (DIC, Kim, De Gruttola, and Lagakos 1993), where both the originating time and the failure time are subject to interval-censoring.
As an example, we analyze a data set derived from a phase III clinical trial focusing on metastatic colorectal cancer (mCRC, Peeters et al. 2010), which was conducted between June 2006 and March 2008 and involved a total of 1186 patients. Patients with mCRC underwent initial treatment based on their KRAS status and subsequently received either panitumumab plus fluorouracil, leucovorin, and irinotecan (FOLFIRI) or FOLFIRI alone as a secondary treatment, administered biweekly. Patients demonstrating disease progression at the first evaluation were considered left-censored. Subsequent instances of disease progression at later evaluations were classified as interval-censored. Patients who survived without disease progression at the end of the study were classified as right-censored. Deaths occurring during the study provided exact observation points. As a result, this data set exhibits a blend of DC and PIC data owing to the periodic administration of treatment. See Section 6 for a more detailed description and analysis of this data set.
A variety of statistical models and methods exist to conduct precise inference for partially interval-censored data. Under a single-sample scenario, a nonparametric distribution estimation was rigorously studied based on the self-consistent equation (Turnbull 1974; Gu and Zhang 1993). Hypothesis testing procedures have been developed to compare survival functions with partially or fully interval-censored data in two-sample cases (Pan 2000; Yuen, Shi, and Zhu 2006). For regression analysis, several authors considered a class of semiparametric transformation models using expectation-maximization (EM), or direct maximum likelihood (ML) estimation (Cai and Cheng 2004; Li et al. 2018; Choi and Huang 2021). This class, which includes proportional hazards (PH) and proportional odds (PO) models as special cases, enjoys a statistically efficient likelihood-based inferential framework that yields hazard-based probabilistic interpretation. Although asymptotically efficient, these ML estimation approaches are generally difficult to implement as they may require simultaneous estimation of regression parameters and the nonparametric hazard function. Another modeling approach to this problem is to use an accelerated failure time (AFT) model, which provides a direct evaluation of the association between the event time and covariates. In the context of partial interval-censoring, various methods have been proposed for statistical inference within the AFT modeling framework. These include the Buckley–James method (Choi, Kim, and Choi 2021; Gao, Zeng, and Lin 2017), M-estimation (Zhang and Li 1996; Ren and Gu 1997), kernel-based nonparametric ML estimation (Groeneboom and Hendrickx 2018), and Bayesian methods (Komárek and Lesaffre 2008).
This paper proposes a linear censored quantile regression (CQR) framework (Koenker 2005; Peng and Huang 2008; Wang and Wang 2009; Son et al. 2022) for partially interval-censored data. This approach remains a popular alternative to classical mean-based models in both theoretical and applied statistics. While mean-based models can only characterize the central behavior of the data, CQR allows the analyst to investigate how the entire distribution of the survival time depends on a set of covariates. In addition, this model can be more robust to heterogeneity, outliers, or extreme values by focusing on a few informative quantile levels. These attractive features have stimulated many investigators to study various right-CQR methods. Under interval-censoring, a weighted estimating equation approach can be used to fit quantile regression (QR) models (Frumento 2022; Choi et al. 2024). Several authors have proposed quantile estimation procedures for PIC data, drawing on the recursive weighting method (Cai and Cheng 2004; Lin, He, and Portnoy 2012) and martingale processes (Ji et al. 2012). However, these methods typically assume that censoring times, along with failure times, are known, a condition that is often impractical outside of administrative censoring scenarios. For instance, Lin, He, and Portnoy (2012) assumed knowledge of both failure time and censoring times , while Ji et al. (2012) required at least to be known. In contrast, our method is applicable even when only is known, and hence more general. Moreover, one can employ an adaptive quantile loss function for the analysis of case-2 interval-censored data (Zhou, Feng, and Du 2017); however, this approach may experience a significant loss of efficiency because its implementation only utilizes partial information from the quantile order-deterministic cases. A more recent work (Yang, Narisetty, and He 2018) involves parallel estimation algorithms that use data augmentation methods from imputed latent event times to fit multiple quantile estimators.
The simplest and most popular weighting scheme to adjust for right-censoring in survival analysis is the so-called inverse probability of censoring weighting (IPCW) method, which was also adapted to CQR through different versions (Bang and Tsiatis 2002; Peng and Fine 2009). In survival or incomplete data analysis, the inverse-probability weighting method has been widely used as a simple quasi-experimental statistical approach to obtain unbiased results in observational studies. Its simplicity of use and ease of interpretation have engendered considerable research in many areas, such as competing risks (Choi, Kang, and Huang 2018; Choi et al. 2022). Our strategy involves reweighting the complete-case data based on the respective probability estimates of their occurrence when interval-censoring is present. To address DC and PIC structures, our IPCW procedure entails estimating nonparametric left-censored survival functions, employing the “backward” Kaplan–Meier (KM) estimator (Gómez, Julià, and Utzet 1994). Then, we propose a weighted quantile loss function for parameter estimation, whose estimator is shown to satisfy strong consistency and asymptotic normality. For variance estimation, we use the induced smoothing technique (Chiou, Kang, and Yan 2015) that approximates the nonsmooth estimating equation with an asymptotically equivalent smooth estimating function. We further discuss an augmented-IPCW (AIPCW) estimation approach to gain more efficiency, and show that the proposed method can also be readily adapted to handle multivariate interval-censored data. We perform comprehensive simulation studies to assess the finite-sample performance of our approach. Furthermore, we demonstrate its practical utility by applying it to data obtained from a phase III clinical trial involving mCRC.
The rest of the paper is organized as follows. Section 2 introduces the statistical model, the proposed IPCW estimation procedure, along with asymptotic results of the proposed estimator, and the induced smoothing procedure for variance estimation. Section 3 presents an augmentation-based estimator for efficiency gain, while Section 4 extends our framework to multivariate clustered partially interval-censored data. Sections 5 and 6 summarize our simulation study findings and the illustration using the phase III mCRC data set, respectively. Finally, discussion and concluding remarks are presented in Section 7. All technical details are relegated to the Web appendix. R code to implement our method is available at https://github.com/yejikim1202/ipcwqrPIC.
2 Model and Estimation
2.1 Statistical Model for DC and PIC Data
Suppose that there are random subjects. For the th subject , let be a dependent variable of interest, such as log-transformed survival time, and be a -vector of covariates. The first element of is set to 1 to include the intercept term. Our main objective is to estimate the -dimensional quantile coefficient vector for some in the following linear model:
(1)
where is the random error whose th quantile conditional on equals 0. If the quantile assumption on is replaced by and log-transformed survival time is used, model (1) corresponds to the familiar AFT model (Chiou, Kang, and Yan 2015). The th conditional quantile function of given is defined as , where is the cumulative distribution function of conditional on . Correspondingly, model (1) amounts to assuming
(2)
which suggests a new estimation strategy that differs from conventional mean-based approaches to analyzing survival data. Unlike the traditional Cox PH and AFT models, the CQR model (1) relaxes the proportionality constraint on the hazard, and allows for modeling data heterogeneity by evaluating the covariate effects at any level of . In this paper, we are primarily interested in the CQR modeling of (i) DC, and (ii) PIC data, which can be formulated as follows.
DC Data:
DC data arise when random censoring can occur from either the left or right side, alongside exact observations. Let denote a tuple of exact failure time, left-censoring, and right-censoring variables, respectively, with . Under the DC structure, we can only observe , where is the observed failure time and is the censoring indicator with , , and . Here, we use and . Notice that is observable only when , that is, , otherwise right-censored or left-censored . Due to the fact that , three censoring statuses should be disjoint. If for all subjects (i.e., without any exact observations), DC data reduce to so-called “current status” or “case-1” interval-censored data (Groeneboom and Hendrickx 2018), in which all subjects are either left- or right-censored.
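As an illustration of the DC observation scheme just described, the following minimal Python sketch maps one latent triple (failure time, left-censoring time, right-censoring time) to what the analyst observes. The function name, the string status labels, and the boundary convention (exact when the failure time lies in the half-open interval above the left limit and up to the right limit) are assumptions made here for illustration; they are not the paper's coding of the censoring indicator.

```python
def observe_dc(t, left, right):
    """Return (observed_time, status) for one subject under double censoring.

    Assumes left < right. The status labels 'exact', 'left', and 'right'
    are illustrative only.
    """
    if left >= right:
        raise ValueError("left-censoring time must be below right-censoring time")
    if t <= left:
        return left, "left"    # failure occurred at or before the lower limit
    if t > right:
        return right, "right"  # failure not yet observed at the upper limit
    return t, "exact"          # failure time falls between the two limits


# The three statuses are mutually exclusive for any single subject.
print(observe_dc(2.0, 1.0, 5.0))  # (2.0, 'exact')
print(observe_dc(0.5, 1.0, 5.0))  # (1.0, 'left')
print(observe_dc(7.0, 1.0, 5.0))  # (5.0, 'right')
```

Only subjects with the exact status carry the failure time itself; the other two statuses reveal only on which side of a censoring limit it lies.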
PIC Data:
Unlike DC data, PIC data are a mixture of exact and interval-censored failure times, where can be observed only when it is not interval-censored. Suppose that is the tightest interval that might contain , that is, if it is interval-censored. Let be the censoring indicator that takes 1 when is observed, and 0, otherwise. The PIC data can be represented as . It can also be summarized as , where and . Hence, can be right-censored at , and left-censored at . When for all subjects, PIC data reduce to the conventional case-2 interval-censored data.
Remark 1. Note that DC data can be translated to PIC data and vice versa. For example, when is left-censored at or right-censored at , it implies , or , respectively. Therefore, left- and right-censoring can also be seen as interval-censoring, if we allow and . Conversely, PIC data can be taken as DC because and can be right- and left-censored by . In fact, DC complements PIC in the sense that, under DC, is observable only when falls in some interval, while, under PIC, is observable only when lies outside some interval. Due to this similarity, a unified estimation approach is applicable for analyzing both DC and PIC data.
Remark 2. Throughout the paper, it is assumed that the visit process that generates the censoring structure is independent of given . To be specific, let us denote a sequence of examination times by that gives rise to the interval for PIC data, where and . Therefore, the choice of depends on , although the joint distribution of is independent of . Conversely, it follows that and for DC data. We assume that the proportion of obtaining exact observations is not negligible, and the joint distribution of is independent of given for censored subjects. We shall express this independence situation as for DC data and for PIC data. This implies that the paired censoring variables do not provide any additional information regarding the distribution of , other than the fact that it is bracketed (Zhang and Heitjan 2006).
2.2 Estimation
Without censoring, one may directly apply the standard estimating technique for QR, which locates as the minimizer of , where is the check loss function, or equivalently the solution to the estimating equation
(3)
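To make the check loss concrete, the following minimal Python sketch evaluates the standard quantile check loss, u times (level minus the indicator that u is negative), and minimizes its weighted sum for an intercept-only model by scanning the observed values. The function names and the weights argument are illustrative assumptions; in practice one would rely on established QR software for models with covariates.

```python
def check_loss(u, tau):
    """Quantile check loss: u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def weighted_quantile(values, weights, tau):
    """Minimize sum_i w_i * check_loss(y_i - b, tau) over b (intercept-only model).

    The objective is piecewise linear and convex in b, so a minimizer is
    attained at one of the observed values; scanning them is sufficient.
    """
    def objective(b):
        return sum(w * check_loss(y - b, tau) for y, w in zip(values, weights))

    return min(sorted(values), key=objective)


y = [1, 2, 3, 4, 5]
print(weighted_quantile(y, [1, 1, 1, 1, 1], 0.5))  # 3, the sample median
print(weighted_quantile(y, [0, 0, 1, 1, 1], 0.5))  # 4, median of the positively weighted points
```

Setting a weight to zero removes that subject from the objective, which is how a weighting scheme can restrict the fit to subjects with exact observations.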
To handle the complex interval-censoring problem, we propose to modify the estimating Equation (3) by using an IPCW technique. For the DC data type, let and be the survival function of the right-, and left-censoring variables, and , respectively, given , and and denote their consistent estimates. Similarly, we can define , , , and their estimates for PIC data. Under DC or PIC, we propose to solve the following IPCW estimating function:
(4)
where
(5)
For PIC data, we may define or , since calculation of is needed only when , for which .
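The idea behind such weights can be sketched for DC data as follows. If the left-censoring time is always below the right-censoring time and both are independent of the failure time, then the probability that a failure at time t is observed exactly equals P(right-censoring time at least t) minus P(left-censoring time at least t). The minimal Python sketch below assigns the inverse of that probability to exact observations and zero to censored ones. The function name, the status labels, and the convention that both survival functions are passed as callables returning P(variable at least t) are assumptions made for this illustration and need not match the exact form of (5).

```python
def ipcw_weight_dc(y, status, surv_right, surv_left):
    """Inverse-probability weight for one subject with doubly censored data.

    surv_right(t) and surv_left(t) are assumed to return P(R >= t) and
    P(L >= t); in practice these would be estimated from the data.
    Censored subjects receive weight zero.
    """
    if status != "exact":
        return 0.0
    p_exact = surv_right(y) - surv_left(y)  # P(L < y <= R) when L < R
    if p_exact <= 0.0:
        raise ValueError("estimated probability of an exact observation must be positive")
    return 1.0 / p_exact


# Hypothetical censoring probabilities, chosen only for illustration.
print(ipcw_weight_dc(2.0, "exact", lambda t: 0.75, lambda t: 0.25))  # 2.0
print(ipcw_weight_dc(2.0, "right", lambda t: 0.75, lambda t: 0.25))  # 0.0
```

An exact observation that had only a 50% chance of being seen exactly thus stands in for two subjects in the weighted estimating function.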
Unbiasedness of the weighting schemes in (5) follows easily using a conditioning argument. For DC data, we have
under the independent assumption between and given . Similarly, for PIC data, it can be seen that
Although Equation (4) is monotone, an exact zero-crossing of usually does not exist. Instead, Equation (4) can be viewed as the gradient of the L1-type convex function (Peng and Fine 2009)
(6)
where is a sufficiently large value that bounds both and from above for any in the compact parameter space for . Minimization of (6) can be easily implemented using standard software for L1-type optimization, such as the rq() function in the R package quantreg (Koenker 2005). Therefore, we define the proposed IPCW estimator as .
In the rest of the paper, we will focus on DC data for ease of presentation, since very similar techniques can be employed to analyze PIC endpoints. To solve Equation (4) with DC data, we need and , some reasonable estimates of and , which can be obtained via various methods. For example, if the censoring mechanism depends on a set of discrete covariates, they can be estimated nonparametrically within each data stratum defined by the values of these discrete covariates. In the case that the underlying censoring mechanism involves continuous covariates, we might use parametric or semiparametric models, such as Cox models. See Remark 3 below for available nonparametric approaches. In the sequel, we assume (for simplicity) unconditional independence between and , such that and may be replaced by simple KM-type estimators, and , respectively. Then, the IPCW estimating Equation (4) for DC data is given by
(7)
The calculation of the KM estimator for the right-censored survival function is straightforward, that is, , where and . However, the nonparametric estimation of the left-censored survivor function is not so simple, and a number of approaches have been proposed (Gómez, Julià, and Utzet 1994). The most cited and intuitive approach is to use the “backward” KM estimator, that is, to transform left-censored data into right-censored data by multiplying each datum by −1, and then to apply the KM method. On the original scale, the estimator of is then given by , where denotes a KM estimate based on the left-censored data multiplied by −1. More specifically, , where . Notice that the conventional KM method can be used to estimate with PIC data, whereas the backward KM method should be applied for .
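The forward and backward KM constructions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, `events[i]` is taken to be True when `times[i]` is an observed value of the variable whose distribution is being estimated (and False when that variable is only known to lie beyond `times[i]`), and the step function is taken to be right-continuous.

```python
def kaplan_meier(times, events):
    """Return a step function S(t) = product over event times u <= t of (1 - d_u / n_u)."""
    steps, surv = [], 1.0
    for u in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for t in times if t >= u)                   # n_u
        d = sum(1 for t, e in zip(times, events) if e and t == u)   # d_u
        surv *= 1.0 - d / at_risk
        steps.append((u, surv))

    def survival(t):
        value = 1.0
        for u, s in steps:
            if u > t:
                break
            value = s
        return value

    return survival


def backward_kaplan_meier(times, events):
    """Estimate P(X < t) for left-censored data by running KM on the negated times."""
    forward = kaplan_meier([-t for t in times], events)
    return lambda t: forward(-t)


surv = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
print(surv(2.5))  # 0.75
below = backward_kaplan_meier([1, 2, 3, 4], [True, True, True, True])
print(below(5))   # 1.0
```

Negating the data turns left-censoring into right-censoring, so the same routine serves both directions; only the interpretation of the returned function changes.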
Remark 3. In the case that the visit process generating is independent of given , one might use Beran's local KM estimator (Beran 1981), that is,
(8)
and
where is a sequence of nonnegative weights adding up to 1. For example, we can employ the commonly used Nadaraya–Watson-type weight, that is, , where is a kernel density function and is the bandwidth converging to zero as . By plugging these local KM estimators into the estimating function (7), we can obtain a nonparametric covariate-adjusted IPCW estimator. Another viable alternative is to employ random forest approaches for nonparametric survival prediction (Ishwaran et al. 2008). This recursive partitioning method is effective, computationally feasible, and accommodates the dependence of censoring on covariates, even in higher dimensions.
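For a single continuous covariate, Nadaraya–Watson-type weights can be sketched as below. The Gaussian kernel, the function name, and the scalar covariate are assumptions chosen for illustration; any kernel density function could be substituted.

```python
import math


def nadaraya_watson_weights(x0, xs, bandwidth):
    """Weights K((x0 - x_i) / h) / sum_j K((x0 - x_j) / h) with a Gaussian kernel K.

    The weights are nonnegative and add up to 1; subjects whose covariate
    is close to x0 receive the largest weights.
    """
    kernel = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    total = sum(kernel)
    return [k / total for k in kernel]


w = nadaraya_watson_weights(0.0, [-1.0, 0.0, 1.0, 3.0], bandwidth=1.0)
print(w)       # largest weight on the subject with covariate 0.0
print(sum(w))  # adds up to 1 up to floating-point rounding
```

A local KM estimator then replaces the counts of at-risk subjects and events with sums of these weights, so the censoring distribution is estimated mainly from subjects with similar covariate values.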
2.3 Asymptotic Results
This section provides asymptotic results of the proposed IPCW estimator for DC endpoints. Denote the Euclidean norm by , and let for a vector . We first impose the following regularity conditions:
(C1) The joint distribution function of is continuous. There exists , such that . There also exist such that and .
(C2) The covariate is uniformly bounded, that is, .
(C3) (i) The quantile coefficient is Lipschitz continuous for ; (ii) is bounded above uniformly in and , where .
(C4) For some and , , where and . Here, eigmin denotes the minimum eigenvalue of a matrix.
Note that condition (C1) simplifies theoretical arguments and is satisfied in many clinical settings with administrative censoring. Conditions (C2) and (C3) are typical assumptions in many QR methods for the boundedness of covariates, the smoothness of coefficient processes, and the uniform boundedness of the density function . Condition (C4) is imposed so that the asymptotic limit of is strictly convex in a neighborhood of for . This condition implies that at any other than is far from its minimum as goes to infinity. Thus, this condition ensures not only the identifiability of , but also the consistency of . In addition, it should also be noted that condition (C4) holds when is positive-definite and is bounded below by a positive constant. We then claim the consistency of in Theorem 1.
Theorem 1. Under regularity conditions (C1)–(C4), , assuming model (2) holds for .
To study the asymptotic normality of the proposed estimator, we use the counting process and associated martingale theory (Fleming and Harrington 1991). Based on the natural filtration , we define , where . Likewise, we define the reversed filtration and the martingale process , where . The definitions of and are somewhat hypothetical due to their dependence on future information, but standard martingale theory may also be used by reading the data backward in time. The following theorem states the asymptotic normality of .
Theorem 2. Under regularity conditions (C1)–(C4), converges weakly to a zero-mean Gaussian process for , with covariance
where the expression for is given in the Appendix.
Detailed proofs of Theorems 1 and 2 and associated lemma are relegated to the Appendix.
2.4 Variance Estimation via Induced Smoothing
For variance estimation, we may use an induced smoothing approach (Chiou, Kang, and Yan 2015; Choi, Kang, and Huang 2018) by approximating the nonsmoothed estimating equation in (7) with an asymptotically equivalent smoothed function. Let be an random vector independent of the data, where denotes the identity matrix, and be a symmetric, positive-definite matrix with . Let and be the cumulative distribution and density function of the standard multivariate normal variable . Since with , which is implied by Theorem 2, we can approximate by , which gives
(9)
The partial derivative of this formulation can be explicitly expressed as
Moreover, can be approximated by
where and . The inferential procedure with induced smoothing proceeds iteratively as follows:
Step 1. Let and .
Step 2. Given and from the th step, update and as
Step 3. Increment the step index and repeat Step 2 until convergence.
Let and denote the smoothed estimators at convergence. Note that the variance estimator is obtained as a byproduct while performing this iterative procedure. Since is consistent for and , may be used as a variance estimator for . In practice, the induced smoothing procedure converges very quickly, and our simulation results confirm that variance estimates are fairly accurate and stable.
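The core of induced smoothing can be illustrated with a short Python sketch. Averaging an indicator of the form 1{residual ≤ 0}, with residual equal to the response minus the linear predictor, over a Gaussian perturbation of the coefficient vector with covariance matrix `cov` yields a normal distribution function evaluated at minus the residual divided by the square root of the quadratic form of the covariate vector in `cov`. The function name and the plain-list representation of vectors and matrices are assumptions for illustration, and the sketch shows only this smoothing step, not the full iterative procedure.

```python
from math import sqrt
from statistics import NormalDist


def smoothed_indicator(residual, x, cov):
    """Smooth surrogate for 1{residual <= 0}, where residual = y - x'b.

    Replacing b by b + cov^{1/2} Z with Z standard normal and taking the
    expectation over Z gives Phi(-residual / sqrt(x' cov x)).
    cov is assumed positive-definite, so the quadratic form is positive.
    """
    p = len(x)
    quad = sum(x[i] * cov[i][j] * x[j] for i in range(p) for j in range(p))
    return NormalDist().cdf(-residual / sqrt(quad))


cov = [[0.01, 0.0], [0.0, 0.01]]  # hypothetical covariance of order 1/n
print(smoothed_indicator(0.0, [1.0, 2.0], cov))   # 0.5 at a zero residual
print(smoothed_indicator(1.0, [1.0, 2.0], cov))   # close to 0
print(smoothed_indicator(-1.0, [1.0, 2.0], cov))  # close to 1
```

Because the surrogate is differentiable in the coefficient vector, the slope matrix of the smoothed estimating function can be computed in closed form, which is what makes the sandwich-type variance update available at each iteration.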
3 Augmentation-Based Estimation
The proposed IPCW estimator is generally statistically inefficient because Equation (7) involves only expressions with noncensored data. The only information obtained from the censored observations is in estimating and . This section suggests an AIPCW estimator based on both censored and uncensored data. To implement the AIPCW approach, we need two procedures, (i) positing a working statistical model for and (ii) defining two expectations, and , where denotes the observed data. Let be a posited model for the distribution of , and let be the ML estimator for this model, with as its limit, satisfying . Then, following Robins and Rotnitzky (1992), we define the AIPCW estimator as the solution to the augmented estimating equation
(10)
where and with and . Here, denotes a working model for , .
The advantages of the AIPCW estimator are generally twofold. First, the estimator is consistent (see proof of Theorem 3, presented in Section B of the Supporting Information), when either the censoring distribution does not depend on the covariates, or the posited model for is correct. For this reason, this estimator is often referred to as a doubly-robust (DR) estimator. Second, when the aforementioned conditions for double robustness are met, the AIPCW estimator can have a smaller asymptotic variance than . In order to solve the above augmented estimating equation, we may use the dfsane() function in the R package BB (Varadhan and Gilbert 2010), which is a derivative-free spectral solver for nonlinear systems of equations. To precisely estimate the standard errors, bootstrapping would be the method of choice, which turns out to be computationally expensive for the AIPCW estimator. As earlier, we employ the induced smoothing method for statistical inference.
4 Extension to Multivariate DC Data
We further extend our CQR method to multivariate clustered DC data. Suppose that there are clusters with the th cluster having members, and that the th subject of the th cluster , can distinctly experience an event of interest, that is subject to double-censoring. It is assumed that is relatively small compared to . For the th member of the th cluster, let be the failure, left-censoring and right-censoring time variables in order, and is the corresponding -vector of covariates. As before, it is assumed that the visit process that generates is independent of and . The observed data consist of , where, and is the censoring indicator with , , and .
Suppose that the marginal regression model satisfies
(11)
where is a -vector of unknown regression parameters common to all clusters. Under the working independence assumption, we may obtain the estimator for by solving the following weighted estimating function:
(12)
where and is a known weight to calibrate for the possible informativeness of cluster sizes (Cong, Yin, and Shen 2007; Wang and Zhao 2008).
For the marginal analysis of clustered survival data, we conventionally use , which tends to overweight the large clusters because each individual observation contributes equally to the estimating equation. When cluster sizes are informative to the outcome of interest, we can incorporate the inverse of cluster sizes as a weight in the estimating function, letting, for example, for some , which is also known to improve the efficiency of the resulting estimator (Wang and Zhao 2008). By assuming common censoring distributions independent of covariates, we may put together data across clusters and use the KM method to estimate and because finite cluster sizes preclude consistent estimation of the censoring distributions. We estimate with , where and , and with , where and .
For variance estimation, we again use the induced smoothing approach. Let be the solution to
at convergence. Then, the variance–covariance matrix of the limiting normal distribution of can be approximated by at , where
and
with and . The iterative procedure, described in Section 2.4, can also be used to approximate the variance of .
5 Simulation Results
5.1 Univariate Partially Interval-Censored Data
This section presents extensive simulation results under various partial interval-censoring scenarios to evaluate the finite-sample properties of the proposed IPCW and AIPCW estimators. All simulations here involve two covariates, , where and . The data-generating model is
where , and . The error term follows a standard normal, N(0,1), or extreme-value, EV(0,1), distribution and is adjusted by its quantile level and 0.5, satisfying . To create DC data, the left- and right-censoring variables are generated as and + , respectively, where two constants are varied to yield the desired rates of exact, left-censored and right-censored observations approximately as and . To generate PIC data, the censoring time is first simulated from . For each subject, a sequence of examination times is generated as , where is the largest integer that satisfies . The interval that contains is defined as and . To mix exact and interval-censored data, we generate from , where is set to yield approximately 65%, or 75% exact observations of the failure time data as before. Predicted log survival times may be negative, in which case the estimated survival times are still strictly positive. Tables 1–4 demonstrate that the observed biases are predominantly negative, implying that our procedure tends to slightly underestimate the regression parameters. However, this tendency is not severe and becomes negligible as the sample size increases. In fact, this pattern is quite common with IPCW-based methods under nonparametric estimation of right-censored data.
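The quantile adjustment of the error term can be sketched for the normal case as follows: subtracting the standard normal quantile at the chosen level from each draw gives errors whose quantile at that level is zero. The function name, the seed argument, and the use of Python's standard library generator are assumptions for illustration; the extreme-value case would subtract the corresponding quantile of that distribution instead.

```python
import random
from statistics import NormalDist


def centered_normal_errors(n, tau, seed=0):
    """Draw n standard normal errors shifted so that their tau-th quantile is zero."""
    rng = random.Random(seed)
    shift = NormalDist().inv_cdf(tau)  # tau-th quantile of N(0, 1)
    return [rng.gauss(0.0, 1.0) - shift for _ in range(n)]


errors = centered_normal_errors(20000, tau=0.3, seed=1)
share_nonpositive = sum(e <= 0 for e in errors) / len(errors)
print(share_nonpositive)  # close to 0.3
```

With errors centered this way, the linear predictor in the data-generating model coincides with the conditional quantile of the response at the chosen level, so the true coefficients are known for evaluating bias.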
TABLE 1.
Simulation results summarizing the finite-sample properties of the proposed IPCW QR estimator at and 0.5, under univariate DC and PIC data, with errors distributed as N(0,1) or EV(0,1), where the proportion of exact failure times observed is 65%, or 75%. Here, Par = parameters, Bias = empirical bias, SSE = sampling standard error, ASE = average of standard errors, and CP = 95% coverage probability.
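| Data | Error | Exact (%) | Quantile level | Par | Bias | SSE | ASE | CP | Bias | SSE | ASE | CP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DC | N(0,1) | 75% | 0.3 | | −0.003 | 0.132 | 0.142 | 0.956 | −0.009 | 0.093 | 0.099 | 0.947 |
| DC | N(0,1) | 75% | 0.3 | | −0.001 | 0.173 | 0.181 | 0.953 | −0.002 | 0.115 | 0.127 | 0.959 |
| DC | N(0,1) | 75% | 0.5 | | −0.003 | 0.117 | 0.145 | 0.967 | −0.006 | 0.085 | 0.101 | 0.965 |
| DC | N(0,1) | 75% | 0.5 | | −0.001 | 0.159 | 0.184 | 0.967 | 0.000 | 0.107 | 0.129 | 0.979 |
| DC | N(0,1) | 65% | 0.3 | | −0.005 | 0.140 | 0.153 | 0.959 | −0.013 | 0.101 | 0.108 | 0.948 |
| DC | N(0,1) | 65% | 0.3 | | −0.003 | 0.188 | 0.194 | 0.957 | −0.008 | 0.121 | 0.137 | 0.959 |
| DC | N(0,1) | 65% | 0.5 | | −0.004 | 0.123 | 0.157 | 0.978 | −0.008 | 0.088 | 0.109 | 0.974 |
| DC | N(0,1) | 65% | 0.5 | | 0.000 | 0.163 | 0.198 | 0.974 | −0.004 | 0.112 | 0.138 | 0.985 |
| DC | EV(0,1) | 75% | 0.3 | | −0.003 | 0.132 | 0.137 | 0.938 | −0.002 | 0.087 | 0.098 | 0.956 |
| DC | EV(0,1) | 75% | 0.3 | | −0.001 | 0.162 | 0.175 | 0.951 | −0.001 | 0.116 | 0.125 | 0.953 |
| DC | EV(0,1) | 75% | 0.5 | | −0.006 | 0.143 | 0.165 | 0.965 | −0.001 | 0.097 | 0.117 | 0.972 |
| DC | EV(0,1) | 75% | 0.5 | | 0.004 | 0.173 | 0.210 | 0.977 | 0.000 | 0.128 | 0.149 | 0.967 |
| DC | EV(0,1) | 65% | 0.3 | | −0.010 | 0.140 | 0.149 | 0.946 | −0.008 | 0.093 | 0.106 | 0.964 |
| DC | EV(0,1) | 65% | 0.3 | | −0.006 | 0.171 | 0.189 | 0.957 | −0.007 | 0.127 | 0.135 | 0.955 |
| DC | EV(0,1) | 65% | 0.5 | | −0.011 | 0.155 | 0.182 | 0.972 | −0.005 | 0.106 | 0.130 | 0.972 |
| DC | EV(0,1) | 65% | 0.5 | | −0.002 | 0.183 | 0.232 | 0.976 | −0.002 | 0.138 | 0.164 | 0.961 |
| PIC | N(0,1) | 75% | 0.3 | | 0.027 | 0.121 | 0.134 | 0.958 | 0.023 | 0.090 | 0.093 | 0.938 |
| PIC | N(0,1) | 75% | 0.3 | | −0.040 | 0.158 | 0.166 | 0.945 | −0.036 | 0.113 | 0.117 | 0.940 |
| PIC | N(0,1) | 75% | 0.5 | | 0.057 | 0.127 | 0.142 | 0.948 | 0.050 | 0.092 | 0.099 | 0.936 |
| PIC | N(0,1) | 75% | 0.5 | | −0.047 | 0.168 | 0.176 | 0.946 | −0.042 | 0.117 | 0.123 | 0.944 |
| PIC | N(0,1) | 65% | 0.3 | | 0.027 | 0.133 | 0.143 | 0.948 | 0.020 | 0.092 | 0.100 | 0.935 |
| PIC | N(0,1) | 65% | 0.3 | | −0.049 | 0.163 | 0.177 | 0.953 | −0.046 | 0.119 | 0.125 | 0.946 |
| PIC | N(0,1) | 65% | 0.5 | | 0.060 | 0.146 | 0.157 | 0.943 | 0.052 | 0.101 | 0.107 | 0.933 |
| PIC | N(0,1) | 65% | 0.5 | | −0.054 | 0.178 | 0.192 | 0.945 | −0.050 | 0.128 | 0.132 | 0.937 |
| PIC | EV(0,1) | 75% | 0.3 | | 0.024 | 0.123 | 0.131 | 0.948 | 0.027 | 0.086 | 0.092 | 0.941 |
| PIC | EV(0,1) | 75% | 0.3 | | −0.039 | 0.152 | 0.163 | 0.954 | −0.041 | 0.107 | 0.115 | 0.940 |
| PIC | EV(0,1) | 75% | 0.5 | | 0.059 | 0.159 | 0.167 | 0.948 | 0.056 | 0.106 | 0.115 | 0.939 |
| PIC | EV(0,1) | 75% | 0.5 | | −0.055 | 0.190 | 0.208 | 0.961 | −0.057 | 0.140 | 0.144 | 0.934 |
| PIC | EV(0,1) | 65% | 0.3 | | 0.026 | 0.137 | 0.147 | 0.953 | 0.029 | 0.099 | 0.104 | 0.943 |
| PIC | EV(0,1) | 65% | 0.3 | | −0.046 | 0.169 | 0.183 | 0.948 | −0.055 | 0.121 | 0.129 | 0.930 |
| PIC | EV(0,1) | 65% | 0.5 | | 0.068 | 0.180 | 0.198 | 0.958 | 0.068 | 0.125 | 0.133 | 0.931 |
| PIC | EV(0,1) | 65% | 0.5 | | −0.067 | 0.226 | 0.244 | 0.957 | −0.082 | 0.157 | 0.165 | 0.943 |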
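The summary measures reported in Table 1 can be computed from simulation replicates as in the following minimal Python sketch. The function name is hypothetical, and the coverage probability is computed here from Wald-type intervals (estimate plus or minus 1.96 standard errors), which is an assumption about how the intervals are formed.

```python
from statistics import mean, stdev


def summarize_replicates(estimates, std_errors, truth, z=1.96):
    """Return (Bias, SSE, ASE, CP) for one parameter across simulation replicates.

    Bias: mean estimate minus the true value.
    SSE:  sample standard deviation of the estimates.
    ASE:  average of the estimated standard errors.
    CP:   share of Wald-type intervals (estimate +/- z * se) covering the truth.
    """
    bias = mean(estimates) - truth
    sse = stdev(estimates)
    ase = mean(std_errors)
    cp = mean(1.0 if abs(est - truth) <= z * se else 0.0
              for est, se in zip(estimates, std_errors))
    return bias, sse, ase, cp


# Three made-up replicates, used only to show the calling convention.
print(summarize_replicates([0.9, 1.0, 1.1], [0.1, 0.1, 0.1], truth=1.0))
```

An ASE close to the SSE indicates that the variance estimator tracks the true sampling variability, and a CP near 0.95 indicates that the resulting intervals are well calibrated.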
TABLE 2.
Simulation results comparing the finite-sample properties of the IPCW estimator () to the augmented IPCW estimator () at and 0.5, under univariate DC data with errors distributed as N(0,1) or EV(0,1), where the proportion of exact failure times is 65%, or 75%. Here, Par = parameters, Bias = empirical bias, SSE = sampling standard error, MSE = mean-squared error, and RE = relative efficiency.
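| Error | Exact (%) | Quantile level | Par | IPCW Bias | IPCW SSE | IPCW MSE | AIPCW Bias | AIPCW SSE | AIPCW MSE | RE |
|---|---|---|---|---|---|---|---|---|---|---|
| N(0,1) | 75% | 0.3 | | −0.003 | 0.132 | 0.017 | −0.003 | 0.130 | 0.017 | 0.970 |
| N(0,1) | 75% | 0.3 | | −0.001 | 0.173 | 0.030 | 0.000 | 0.171 | 0.029 | 0.977 |
| N(0,1) | 75% | 0.5 | | −0.003 | 0.117 | 0.014 | −0.003 | 0.117 | 0.014 | 1.000 |
| N(0,1) | 75% | 0.5 | | −0.001 | 0.159 | 0.025 | −0.001 | 0.157 | 0.025 | 0.975 |
| N(0,1) | 65% | 0.3 | | −0.005 | 0.140 | 0.020 | −0.005 | 0.139 | 0.019 | 0.986 |
| N(0,1) | 65% | 0.3 | | −0.003 | 0.188 | 0.035 | −0.002 | 0.185 | 0.034 | 0.968 |
| N(0,1) | 65% | 0.5 | | −0.004 | 0.123 | 0.015 | −0.005 | 0.122 | 0.015 | 0.984 |
| N(0,1) | 65% | 0.5 | | 0.000 | 0.163 | 0.027 | 0.000 | 0.162 | 0.026 | 0.988 |
| EV(0,1) | 75% | 0.3 | | −0.003 | 0.132 | 0.017 | −0.002 | 0.132 | 0.017 | 1.000 |
| EV(0,1) | 75% | 0.3 | | −0.001 | 0.162 | 0.026 | −0.002 | 0.160 | 0.026 | 0.976 |
| EV(0,1) | 75% | 0.5 | | −0.006 | 0.143 | 0.020 | −0.005 | 0.141 | 0.020 | 0.972 |
| EV(0,1) | 75% | 0.5 | | 0.004 | 0.173 | 0.030 | 0.002 | 0.173 | 0.030 | 1.000 |
| EV(0,1) | 65% | 0.3 | | −0.010 | 0.140 | 0.020 | −0.010 | 0.139 | 0.019 | 0.986 |
| EV(0,1) | 65% | 0.3 | | −0.006 | 0.171 | 0.029 | −0.008 | 0.170 | 0.029 | 0.989 |
| EV(0,1) | 65% | 0.5 | | −0.011 | 0.155 | 0.024 | −0.011 | 0.153 | 0.024 | 0.974 |
| EV(0,1) | 65% | 0.5 | | −0.002 | 0.183 | 0.033 | −0.002 | 0.184 | 0.034 | 1.011 |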
TABLE 3.
Simulation results comparing the finite-sample properties of the proposed IPCW QR estimators for univariate DC data at and 0.5, where Beran's (1981) local Kaplan–Meier (“IPCW-KM”) and survival random forests (“IPCW-RF”) methods are used to approximate the left- and right-censoring distributions given covariates.
IPCW-KM
IPCW-RF
Error
Exact (%)
Par
Bias
SSE
MSE
Bias
SSE
MSE
RE
200
N(0,1)
75%
0.3
0.004
0.131
0.017
−0.045
0.142
0.022
1.292
−0.060
0.170
0.033
−0.030
0.181
0.034
1.036
0.5
−0.004
0.120
0.014
−0.049
0.124
0.018
1.233
−0.076
0.160
0.031
−0.031
0.164
0.028
0.888
65%
0.3
0.014
0.142
0.020
−0.039
0.154
0.025
1.240
−0.067
0.185
0.039
−0.027
0.195
0.039
1.001
0.5
0.002
0.126
0.016
−0.060
0.131
0.021
1.307
−0.073
0.170
0.034
−0.038
0.172
0.031
0.906
EV(0,1)
75%
0.3
0.000
0.126
0.016
−0.035
0.131
0.018
1.158
−0.056
0.154
0.027
−0.026
0.156
0.025
0.931
0.5
−0.004
0.141
0.020
−0.057
0.144
0.024
1.205
−0.082
0.172
0.036
−0.039
0.167
0.029
0.810
65%
0.3
0.005
0.138
0.019
−0.045
0.143
0.022
1.179
−0.073
0.167
0.033
−0.034
0.169
0.030
0.895
0.5
−0.001
0.151
0.023
−0.064
0.156
0.028
1.247
−0.082
0.182
0.040
−0.047
0.179
0.034
0.860
400
N(0,1)
75%
0.3
0.007
0.093
0.009
−0.093
0.112
0.021
2.437
−0.053
0.113
0.016
−0.055
0.124
0.018
1.181
0.5
0.002
0.086
0.007
−0.082
0.096
0.016
2.154
−0.069
0.112
0.017
−0.034
0.116
0.015
0.844
65%
0.3
0.014
0.102
0.011
−0.083
0.115
0.020
1.898
−0.064
0.120
0.018
−0.049
0.131
0.020
1.058
0.5
0.009
0.091
0.008
−0.110
0.097
0.022
2.572
−0.067
0.117
0.018
−0.071
0.118
0.019
1.043
EV(0,1)
75%
0.3
0.008
0.084
0.007
−0.062
0.095
0.013
1.807
−0.052
0.112
0.015
−0.028
0.118
0.015
0.965
0.5
0.007
0.098
0.010
−0.082
0.094
0.016
1.612
−0.074
0.130
0.022
−0.039
0.123
0.017
0.744
65%
0.3
0.013
0.089
0.008
−0.079
0.099
0.016
1.983
−0.066
0.121
0.019
−0.046
0.122
0.017
0.895
0.5
0.014
0.101
0.010
−0.099
0.100
0.020
1.904
−0.075
0.140
0.025
−0.067
0.130
0.021
0.848
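TABLE 4.
Simulation results comparing the finite-sample properties of the IPCW estimator under the unadjusted and adjusted cluster-size weighting methods for multivariate DC data at quantile levels 0.3 and 0.5, where the numbers of clusters are 50 and 100.
| Cluster | Error | Exact (%) | Quantile level | Par | Unadjusted Bias | Unadjusted SSE | Unadjusted ASE | Unadjusted CP | Adjusted Bias | Adjusted SSE | Adjusted ASE | Adjusted CP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 50 | N(0,1) | 75% | 0.3 | | −0.008 | 0.221 | 0.241 | 0.955 | −0.001 | 0.237 | 0.239 | 0.936 |
| 50 | N(0,1) | 75% | 0.3 | | −0.034 | 0.271 | 0.301 | 0.962 | −0.025 | 0.294 | 0.297 | 0.938 |
| 50 | N(0,1) | 75% | 0.5 | | −0.011 | 0.261 | 0.282 | 0.960 | −0.002 | 0.287 | 0.279 | 0.939 |
| 50 | N(0,1) | 75% | 0.5 | | −0.040 | 0.324 | 0.350 | 0.973 | −0.031 | 0.343 | 0.345 | 0.957 |
| 50 | N(0,1) | 65% | 0.3 | | −0.014 | 0.246 | 0.261 | 0.945 | −0.006 | 0.262 | 0.260 | 0.940 |
| 50 | N(0,1) | 65% | 0.3 | | −0.036 | 0.297 | 0.326 | 0.965 | −0.028 | 0.319 | 0.323 | 0.941 |
| 50 | N(0,1) | 65% | 0.5 | | −0.012 | 0.300 | 0.319 | 0.961 | 0.000 | 0.324 | 0.317 | 0.947 |
| 50 | N(0,1) | 65% | 0.5 | | −0.046 | 0.368 | 0.396 | 0.967 | −0.034 | 0.392 | 0.392 | 0.955 |
| 50 | EV(0,1) | 75% | 0.3 | | −0.023 | 0.232 | 0.255 | 0.955 | −0.016 | 0.246 | 0.252 | 0.945 |
| 50 | EV(0,1) | 75% | 0.3 | | −0.011 | 0.309 | 0.321 | 0.948 | −0.004 | 0.325 | 0.317 | 0.934 |
| 50 | EV(0,1) | 75% | 0.5 | | −0.033 | 0.299 | 0.323 | 0.950 | −0.024 | 0.320 | 0.319 | 0.938 |
| 50 | EV(0,1) | 75% | 0.5 | | −0.013 | 0.380 | 0.406 | 0.965 | −0.003 | 0.399 | 0.397 | 0.946 |
| 50 | EV(0,1) | 65% | 0.3 | | −0.024 | 0.252 | 0.273 | 0.953 | −0.018 | 0.267 | 0.270 | 0.937 |
| 50 | EV(0,1) | 65% | 0.3 | | −0.014 | 0.335 | 0.344 | 0.946 | −0.003 | 0.356 | 0.339 | 0.932 |
| 50 | EV(0,1) | 65% | 0.5 | | −0.032 | 0.341 | 0.374 | 0.943 | −0.021 | 0.361 | 0.371 | 0.942 |
| 50 | EV(0,1) | 65% | 0.5 | | −0.019 | 0.438 | 0.469 | 0.968 | −0.009 | 0.462 | 0.461 | 0.957 |
| 100 | N(0,1) | 75% | 0.3 | | −0.015 | 0.149 | 0.168 | 0.956 | −0.008 | 0.162 | 0.167 | 0.943 |
| 100 | N(0,1) | 75% | 0.3 | | −0.020 | 0.200 | 0.211 | 0.954 | −0.016 | 0.210 | 0.210 | 0.940 |
| 100 | N(0,1) | 75% | 0.5 | | −0.015 | 0.173 | 0.194 | 0.977 | −0.005 | 0.185 | 0.193 | 0.952 |
| 100 | N(0,1) | 75% | 0.5 | | −0.026 | 0.227 | 0.244 | 0.965 | −0.021 | 0.241 | 0.242 | 0.945 |
| 100 | N(0,1) | 65% | 0.3 | | −0.021 | 0.163 | 0.182 | 0.964 | −0.014 | 0.179 | 0.181 | 0.946 |
| 100 | N(0,1) | 65% | 0.3 | | −0.025 | 0.221 | 0.229 | 0.956 | −0.017 | 0.233 | 0.227 | 0.937 |
| 100 | N(0,1) | 65% | 0.5 | | −0.017 | 0.197 | 0.218 | 0.969 | −0.010 | 0.212 | 0.218 | 0.949 |
| 100 | N(0,1) | 65% | 0.5 | | −0.029 | 0.264 | 0.275 | 0.965 | −0.020 | 0.275 | 0.273 | 0.956 |
| 100 | EV(0,1) | 75% | 0.3 | | −0.019 | 0.164 | 0.178 | 0.959 | −0.016 | 0.179 | 0.178 | 0.946 |
| 100 | EV(0,1) | 75% | 0.3 | | −0.014 | 0.216 | 0.226 | 0.958 | −0.010 | 0.229 | 0.224 | 0.941 |
| 100 | EV(0,1) | 75% | 0.5 | | −0.019 | 0.206 | 0.226 | 0.961 | −0.016 | 0.221 | 0.225 | 0.941 |
| 100 | EV(0,1) | 75% | 0.5 | | −0.022 | 0.267 | 0.283 | 0.954 | −0.016 | 0.280 | 0.280 | 0.941 |
| 100 | EV(0,1) | 65% | 0.3 | | −0.023 | 0.174 | 0.191 | 0.957 | −0.018 | 0.191 | 0.190 | 0.936 |
| 100 | EV(0,1) | 65% | 0.3 | | −0.018 | 0.230 | 0.242 | 0.961 | −0.015 | 0.246 | 0.239 | 0.930 |
| 100 | EV(0,1) | 65% | 0.5 | | −0.021 | 0.238 | 0.259 | 0.961 | −0.015 | 0.257 | 0.257 | 0.947 |
| 100 | EV(0,1) | 65% | 0.5 | | −0.019 | 0.315 | 0.326 | 0.954 | −0.012 | 0.333 | 0.323 | 0.931 |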
The simulation results for DC and PIC data are summarized in Table 1, which reports the empirical bias (Bias), sampling standard error (SSE), average of the standard error estimates (ASE), and coverage probability (CP) of the confidence intervals for the regression parameters, based on 1000 random data sets with sample sizes of 200 and 400. Overall, the proposed estimator is virtually unbiased, and the standard error estimates from induced smoothing are close to their empirical counterparts. The empirical CPs of the normal-approximation confidence intervals agree well with the nominal level. The estimated standard errors are slightly larger than the sampling standard errors, but the gap appears to narrow as the sample size increases. Next, the performance of the IPCW and AIPCW estimators is compared for DC data. In addition to Bias and SSE, Table 2 presents the mean-squared error (MSE) of the IPCW and AIPCW estimators, along with their relative efficiency (RE), defined as the ratio of the MSE of AIPCW to that of IPCW. We observe that the AIPCW estimator is virtually unbiased and generally at least as efficient as the IPCW estimator, with RE values between 0.968 and 1.011. The efficiency gain in this setting is modest and could potentially increase with the availability of further time-dependent or longitudinal information (Gorfine, Goldberg, and Ritov 2017).
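As a rough check on how the MSE and RE columns of Table 2 relate to the Bias and SSE columns, the sketch below assumes the usual decomposition MSE = Bias² + SSE² and takes RE as the MSE of AIPCW divided by the MSE of IPCW. Both conventions are assumptions made for illustration, and the function names are hypothetical.

```python
def mse(bias, sse):
    """Mean-squared error from bias and sampling standard error (assumed decomposition)."""
    return bias ** 2 + sse ** 2


def relative_efficiency(mse_aipcw, mse_ipcw):
    """RE taken as MSE(AIPCW) / MSE(IPCW); values below 1 favor AIPCW (assumed convention)."""
    return mse_aipcw / mse_ipcw


# Rounded entries from the first row of Table 2 (N(0,1), 75% exact, quantile level 0.3).
mse_ipcw = mse(-0.003, 0.132)
mse_aipcw = mse(-0.003, 0.130)
re_first_row = relative_efficiency(mse_aipcw, mse_ipcw)
print(round(mse_ipcw, 3), round(mse_aipcw, 3), round(re_first_row, 3))
```

Under these assumptions the rounded inputs give MSE values of 0.017 for both estimators and an RE that rounds to 0.970, matching the first row of Table 2. Because the inputs are themselves rounded, other rows may differ from the reported RE in the last digit.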
Table 3 reports additional simulation results under univariate DC data when the censoring distributions also involve covariates. The left- and right-censoring times are generated so that their distributions depend on the covariates, while the other simulation configurations remain the same as before. To account for this covariate-dependent censoring, we apply the local KM (Beran 1981) and survival random forests (Ishwaran et al. 2008) methods, as mentioned in Remark 2, to approximate the left- and right-censoring distributions given covariates. The corresponding estimators are referred to as IPCW-KM and IPCW-RF, respectively. Overall, both estimators produce virtually unbiased results that are robust to the effect of covariates on the censoring distributions. Table 3 also presents RE, defined as the ratio of the MSE of IPCW-RF to that of IPCW-KM. In the present setting, the IPCW-KM estimator appears to be slightly more efficient than the IPCW-RF estimator. However, if the censoring distributions involve so many covariates that nonparametric kernel smoothing is not feasible, the random forests method would be a more viable and reliable alternative.
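To make the local KM idea concrete, the following is a minimal sketch of a Beran-type estimator of a conditional survival function with a single covariate. The Gaussian kernel, the bandwidth values, the toy data, and the function name `beran_survival` are all assumptions for illustration and are not taken from the simulation design; the sketch also assumes no tied event times.

```python
import math


def beran_survival(t, x, times, events, covariates, bandwidth):
    """Local Kaplan-Meier (Beran-type) estimate of S(t | x) with a Gaussian kernel.

    Assumes no tied event times; events[i] is True when times[i] is an event.
    """
    kernel = [math.exp(-0.5 * ((x - z) / bandwidth) ** 2) for z in covariates]
    total = sum(kernel)
    weights = [k / total for k in kernel]
    surv = 1.0
    for i in sorted(range(len(times)), key=lambda j: times[j]):
        if times[i] > t:
            break
        if events[i]:
            # Kernel-weighted size of the risk set at this event time.
            at_risk = sum(w for w, u in zip(weights, times) if u >= times[i])
            surv *= 1.0 - weights[i] / at_risk
    return surv


toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [True, False, True, True]
toy_covariates = [0.1, 0.4, 0.6, 0.9]
s_local = beran_survival(3.5, 0.5, toy_times, toy_events, toy_covariates, bandwidth=0.2)
s_global = beran_survival(3.5, 0.5, toy_times, toy_events, toy_covariates, bandwidth=1e6)
```

With a very large bandwidth the kernel weights become equal and the estimate reduces to the ordinary Kaplan–Meier value. To estimate a censoring distribution rather than the event-time distribution, one would pass the censoring indicator as `events`.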
5.2 Multivariate Partially Interval-Censored Data
Next, we present the simulation results under clustered multivariate DC data. We set the number of clusters as either or . The cluster size is determined by , if satisfies for , otherwise we let , where represents the th percentile of . In this setup, the cluster size ranges from 3 to 11 members and the total number of members is about 200 when , and 400 when . The data-generating model is given by for subject in cluster , where , and follows N(0,1) and EV(0,1) distribution, satisfying . As in the first simulation, we let and + . We consider (unadjusted) and (adjusted); the latter approach may calibrate possible informativeness of cluster sizes on event time. Table 4 shows that the cluster size adjustment with could lead to slightly lower biases but a bit more inflated standard errors. When the cluster size is adjusted, and censoring rates are higher, the estimated standard errors are closer to the empirical standard errors, resulting in more stabilized CPs. When cluster sizes are highly informative to time-to-event, letting for some would be beneficial to achieve a more efficient and robust estimation (Wang and Zhao 2008).
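The contrast between unadjusted and cluster-size-adjusted weighting can be sketched as follows. The sketch assumes that the adjusted weight is the inverse of the cluster size raised to a power `gamma` (with `gamma = 0` giving the unadjusted case and `gamma = 1` the inverse-cluster-size weight used in the informative-cluster-size literature). This functional form and the name `cluster_weights` are assumptions for illustration, not details taken from this section.

```python
from collections import Counter


def cluster_weights(cluster_ids, gamma=1.0):
    """Per-subject weight n_i ** (-gamma), where n_i is the size of the subject's cluster.

    gamma = 0 gives equal weights (unadjusted); gamma = 1 gives inverse cluster size.
    This functional form is an assumption made for illustration.
    """
    sizes = Counter(cluster_ids)
    return [sizes[c] ** (-gamma) for c in cluster_ids]


toy_ids = ["a", "a", "b", "c", "c", "c", "c"]
unadjusted = cluster_weights(toy_ids, gamma=0.0)
adjusted = cluster_weights(toy_ids, gamma=1.0)
```

With `gamma = 1`, the weights within each cluster sum to one, so every cluster contributes equally regardless of its size; with `gamma = 0`, larger clusters dominate in proportion to their membership.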
6 Application: mCRC Data
In this section, we apply the proposed method to a data set from a multicenter, randomized, phase III mCRC clinical trial (Peeters et al. 2010). This study aimed to investigate the efficacy and safety of second-line panitumumab plus FOLFIRI versus FOLFIRI alone with respect to patients' survival after the failure of initial treatment for mCRC. Panitumumab is a fully human anti-epidermal growth factor receptor monoclonal antibody that improves PFS in chemotherapy-refractory mCRC. It was often prescribed with FOLFIRI because it does not provide clinical benefit on its own. From June 2006 to March 2008, 1186 patients who failed first-line treatment of mCRC were randomly assigned (1:1) to panitumumab 6.0 mg/kg plus FOLFIRI or FOLFIRI alone every 2 weeks. The coprimary end points of PFS and overall survival (OS) were independently tested and prospectively analyzed by KRAS status.
Our analysis of PFS focused on the 855 patients for whom treatment and KRAS status were available: 428 (50.1%) and 427 (49.9%) patients received FOLFIRI (coded as 0) and panitumumab + FOLFIRI (coded as 1), respectively, while 474 (55.4%) had wild-type (WT) KRAS tumors (coded as 1) and 381 (44.6%) had mutant (MT) KRAS tumors (coded as 0). Eligible patients were aged 18 or older, were diagnosed with adenocarcinoma of the colon or rectum, and had an Eastern Cooperative Oncology Group (ECOG) performance status of 0, 1, or 2. They had received only one prior chemotherapy regimen for mCRC, with radiographically confirmed disease progression occurring during or within 6 months of the prior first-line chemotherapy. Patients meeting these criteria underwent central analysis of EGFR and biomarkers with approval from an independent ethics committee before any study-related procedures were initiated. Patients in this study were followed for safety for 30 days after the last study drug administration and for survival every 3 months. Because of this follow-up schedule, the progression-free period of each patient was subject to various types of censoring: 168 (19.6%), 329 (38.5%), and 306 (35.8%) patients were left-censored, interval-censored, and right-censored, respectively. Exact disease progression times were known for only 52 (6.1%) patients.
Since this data set was collected from 185 clinical centers with 1–23 patients in each center, it can be regarded as general multivariate PIC data. Figure 1, computed using a modified self-consistency approach for general interval-censored data (Choi, Kim, and Choi 2021), displays the nonparametric PFS curves for the panitumumab + FOLFIRI and FOLFIRI-alone groups. We observe that panitumumab + FOLFIRI achieves higher PFS rates than FOLFIRI alone during the first year, but the two KM curves become almost identical about 1.25 years into treatment. Previously, a Bayesian evaluation using a standard univariate PH model (Pan, Cai, and Wang 2020), which ignored cluster effects, found that the treatment effect is statistically significant (Coef = –0.215; CI = –0.384, –0.046), while the KRAS status is not significant (Coef = 0.163; CI = –0.006, 0.332).
FIGURE 1.
Nonparametric Kaplan–Meier curves (based on a self-consistency equation) estimating progression-free survival probabilities for the “panitumumab+FOLFIRI” versus “FOLFIRI” groups in the mCRC data.
By considering the potential correlation within each clinical site, we alternatively fitted the following multivariate CQR model for the log-transformed PFS with two covariates:
via the proposed IPCW approach, either without (unadjusted) or with (adjusted) the cluster-size adjustment weights. As a preliminary analysis, we first fitted Cox's PH model separately to the left and right endpoints of the observed time intervals to check whether their distributions depend on any covariates. We found that the effects of the two covariates on both endpoints were significant at the 0.1 level. We therefore used Beran's (1981) local KM estimates to compute the required individual weights given the covariates. Standard errors in this case were computed via a cluster-wise bootstrap with 100 bootstrap samples.
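A cluster-wise bootstrap resamples whole clusters (here, clinical sites) with replacement, so that within-site correlation is preserved in each replicate. The sketch below is a generic illustration under that reading; the function name, the toy data, and the use of the median as the estimator are hypothetical, and the IPCW quantile regression fit would take the place of `estimator` in practice.

```python
import random
import statistics


def cluster_bootstrap_se(clusters, estimator, n_boot=100, seed=0):
    """Bootstrap standard error obtained by resampling whole clusters with replacement.

    clusters maps a cluster id to the list of observations in that cluster;
    estimator is any function mapping a pooled list of observations to a number.
    """
    rng = random.Random(seed)
    cluster_ids = list(clusters)
    estimates = []
    for _ in range(n_boot):
        # Draw as many clusters as in the original sample, keeping each cluster intact.
        drawn = [rng.choice(cluster_ids) for _ in cluster_ids]
        pooled = [obs for cid in drawn for obs in clusters[cid]]
        estimates.append(estimator(pooled))
    return statistics.stdev(estimates)


toy_clusters = {"site_a": [1.2, 0.8], "site_b": [2.5], "site_c": [0.4, 0.9, 1.1]}
se_median = cluster_bootstrap_se(toy_clusters, statistics.median)
```

The default of 100 replicates mirrors the number of bootstrap samples mentioned above; a larger number would reduce Monte Carlo error at higher computational cost.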
Figure 2 presents the point estimates and 95% Wald-type confidence intervals for the two covariates at different quantile levels, with and without the cluster-size adjustment. Overall, panitumumab + FOLFIRI does not improve PFS significantly at most quantile levels, and the effect of KRAS status is likewise not statistically significant, with or without the cluster-size adjustment. Panitumumab + FOLFIRI appears to be more effective than FOLFIRI alone in controlling disease progression only at low quantile levels. This observation is consistent with the KM plot in Figure 1, which shows that panitumumab + FOLFIRI improves PFS only during the first year after treatment. Note that the conclusions do not change materially whether or not the cluster size is adjusted. However, comparing the panels for treatment (a vs. b) and KRAS status (c vs. d) in Figure 2 reveals that the quantile coefficients vary much more smoothly across quantile levels when the cluster size is adjusted. This suggests that some heterogeneity may exist across clinical sites and that the cluster-size adjustment helps to obtain more stable results.
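For reference, a Wald-type interval of the kind plotted in Figure 2 combines a point estimate with its standard error through a normal quantile. The sketch below is a generic illustration; the function names and the numeric inputs in the usage lines are hypothetical and are not estimates from the mCRC analysis.

```python
from statistics import NormalDist


def wald_ci(estimate, std_error, level=0.95):
    """Two-sided Wald-type confidence interval: estimate +/- z * std_error."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_error, estimate + z * std_error


def excludes_zero(estimate, std_error, level=0.95):
    """True when the Wald interval does not contain zero."""
    lower, upper = wald_ci(estimate, std_error, level)
    return not (lower <= 0.0 <= upper)


# Hypothetical inputs, for illustration only.
lower_demo, upper_demo = wald_ci(0.0, 1.0)
flag_strong = excludes_zero(0.5, 0.1)
flag_weak = excludes_zero(0.1, 0.1)
```

A coefficient is read as statistically significant at a given quantile level when its interval excludes zero, which is how the curves in Figure 2 are interpreted in the text.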
FIGURE 2.
Estimated QR coefficients (black curves) of treatment and KRAS status, with corresponding 95% CI estimates (light gray dashed curves), when the cluster size is adjusted or unadjusted.
A drawback of the proposed IPCW estimator is that the estimation procedure uses only the exact failure times; the censored observations are used to compute the inverse weights but do not otherwise contribute to estimation or statistical efficiency. Since the proportion of individuals with an exact PFS time is only 6.1% in this data set, the IPCW approach is expected to produce unbiased results but with low statistical precision. This may partly explain why our results for treatment differ slightly from those of the univariate Cox PH analysis. One might consider an augmentation-based estimation method, but when the censoring variables depend on baseline covariates (as in our case), its derivation is complicated and not practically feasible. Nevertheless, our CQR procedure is computationally reliable and not very sensitive to the sample size.
7 Discussion
This paper proposes an IPCW-based estimation method for conducting QR on partially interval-censored data, primarily focusing on DC and PIC endpoints. We demonstrate that the nonparametric left-censored survivor function can be estimated with conventional KM approaches by reading survival data backward in time. Furthermore, we develop an augmentation-based estimator and extend the method to accommodate multivariate partially interval-censored data. The proposed methods can easily be implemented with existing computation packages for QR or L1-type linear programming. Although we restrict our attention to interval-CQR at a single quantile level, the proposed weighting scheme can be readily applied to other settings with interval-censoring, such as medical costs (Bang and Tsiatis 2002), competing risks (Choi, Kang, and Huang 2018), time-dependent covariates (Gorfine, Goldberg, and Ritov 2017), AFT models (Komárek and Lesaffre 2008; Gao, Zeng, and Lin 2017; Choi, Kim, and Choi 2021), and composite QR (Zou and Yuan 2008).
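The idea of reading survival data backward in time can be illustrated with a small sketch: negating left-censored times turns them into right-censored times, to which the ordinary product-limit estimator applies. This is a minimal illustration with hypothetical function names and toy data, not the authors' implementation.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (event time, S(t)) pairs for right-censored data."""
    steps, surv = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1.0 - deaths / at_risk
        steps.append((t, surv))
    return steps


def left_censored_cdf(times, exact):
    """Estimate P(T < t) from left-censored data by running KM on negated times.

    times[i] is the exact time when exact[i] is True, otherwise an upper bound on T.
    """
    reversed_steps = kaplan_meier([-t for t in times], exact)
    # KM on -T gives P(-T > -t) = P(T < t); flip the sign back and sort by t.
    return sorted((-s, p) for s, p in reversed_steps)


obs_times = [1.0, 2.0, 3.0, 4.0]
is_exact = [False, True, True, True]  # the subject recorded at 1.0 is left-censored
cdf_steps = left_censored_cdf(obs_times, is_exact)
```

In this toy example the estimate of P(T < t) is 0.25, 0.5, and 0.75 at t = 2, 3, and 4, reflecting that one of the four subjects is only known to have failed before time 1. Swapping the roles of the event and censoring indicators would give the analogous estimate for the left-censoring variable.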
As pointed out by a reviewer, the quantile level of interest is to be determined by the investigators. Depending on the nature of a study, one can choose any desired level, but quantiles at extreme levels (e.g., 0.99) may not be well estimated unless the sample size is sufficiently large. Even though the data are subject to a certain type of censoring, the underlying distribution function can be identified, and the quantile corresponding to each quantile level can be estimated, unless the censoring rate is too high or other complications, such as competing risks, are present.
A necessary requirement of the IPCW-based approach is a nonnegligible proportion of exact failure time observations, which is also crucial for establishing the asymptotic results of the proposed estimator and constructing computational algorithms. This is because the IPCW approach compensates for censored subjects by giving more weight to subjects with similar characteristics who are not censored. Thus, the proposed method is not applicable to purely interval-censored or current status data, which contain no exact failure times. In our experience, the QR procedure is not very sensitive to the censoring rate, but high censoring rates may lead to a loss of statistical precision. In this case, the augmentation-based estimator can perform better than the IPCW estimator in both estimation and statistical inference, although the improvement is modest. In our data example, the proportion of “effective” observations was only 6.1%, and as a result, the IPCW-based estimators are less statistically efficient than the Cox PH estimators.
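The weighting idea can be illustrated in the simplest intercept-only case, where the IPCW estimate of a quantile is a weighted sample quantile of the exact observations. In the sketch below, `prob_exact` stands for an estimated probability that each subject's failure time is observed exactly; how that probability is obtained is outside the sketch, and the function name and toy numbers are hypothetical.

```python
def ipcw_quantile(times, exact, prob_exact, tau):
    """Weighted tau-quantile using only exact observations, each weighted by 1 / prob_exact.

    This minimizes the weighted check loss in the intercept-only case.
    """
    weighted = sorted((t, 1.0 / p) for t, e, p in zip(times, exact, prob_exact) if e)
    if not weighted:
        raise ValueError("IPCW needs at least one exact failure time")
    total = sum(w for _, w in weighted)
    running = 0.0
    for t, w in weighted:
        running += w
        if running >= tau * total:
            return t
    return weighted[-1][0]


toy_times = [1.0, 2.0, 3.0, 4.0]
toy_exact = [True, True, False, True]
q_weighted = ipcw_quantile(toy_times, toy_exact, [1.0, 1.0, 1.0, 0.5], tau=0.6)
q_unweighted = ipcw_quantile(toy_times, toy_exact, [1.0, 1.0, 1.0, 1.0], tau=0.6)
```

In the toy data, the subject at time 4 has only a 50% chance of being observed exactly, so it receives double weight and pulls the weighted 0.6-quantile up to 4, whereas the unweighted quantile of the exact observations is 2. The `ValueError` branch reflects the requirement stated above: without exact failure times there is nothing to weight.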
Recently, De Backer, Ghouch, and Van Keilegom (2019) proposed an alternative estimating approach for CQR with an adapted quantile loss function. For right-censored data, they argued that a consistent estimator of the QR coefficient could be obtained by minimizing the following objective function:
(13)
Notice that formulation (13) allows us to extract the information of every observation at hand, even if confronted with incompleteness from right-censoring. Therefore, one could expect that the solution to (13) will be more efficient than the basic IPCW estimator, especially when the censoring proportion is large. In the same spirit, we may construct an alternative quantile loss function for DC data as
(14)
where we leverage the fact that under the independence assumption of , the following equality holds:
for . The theoretical and empirical properties of the new estimator from the adapted quantile loss function (14) deserve further investigation and will be studied in future research.
Acknowledgments
The authors thank the anonymous AE and two reviewers, whose insightful comments led to a substantially improved presentation of the manuscript. The colorectal cancer data were derived based on raw data sets obtained from www.projectdatasphere.org, which is maintained by Project Data Sphere, LLC. Neither Project Data Sphere, LLC nor the owner(s) of any information from the website have contributed to, approved, or are in any way responsible for the contents of this publication. The research of Dr. T. Choi was supported by a grant from the National Research Foundation (NRF) of Korea (RS-2024-00340298). The research of Dr. S. Choi was supported by a Korea University grant (K2201231) and a grant from the National Research Foundation (NRF) of Korea (2022M3J6A1063595, 2022R1A2C1008514). Dr. Bandyopadhyay acknowledges partial funding support from the grants awarded by the US National Institutes of Health (R21DE031879, R01DE031134).
Conflicts of Interest
The authors declare no conflicts of interest.
Appendix A: Asymptotic Results
This section provides asymptotic results of the proposed estimator for doubly censored (DC) data. We first impose the following regularity conditions:
(C1) The joint distribution function of is continuous. There exists such that . There also exist such that and .
(C2) The covariate is uniformly bounded, that is, .
(C3) (i) The quantile coefficient is Lipschitz continuous for ; (ii) is bounded above uniformly in and , where .
(C4) For some and , , where and where eigmin denotes the minimum eigenvalue of a matrix.
In the following, we omit in for notation simplicity but bear in mind that coefficients are all -specific. To avoid tail instability, we restrict the possible range of as . We first fix several notations for establishing our asymptotic results. For right-censoring, define , and . Then, we observe the corresponding Martingale process , where . For left-censoring, define , with the corresponding Martingale process , where .
Theorem A.1. Under regularity conditions (C1)–(C4), , assuming model (2) holds for .
Proof. Define and , where and . In the sequel, we use and to denote supremum taken over and , respectively.
First, by condition (C1), for every , we have and . This, allied with condition (C2), implies that
Define . This function class is Donsker and thus Glivenko–Cantelli (van der Vaart and Wellner 1996), because the class of indicator functions is Donsker and three , , and are uniformly bounded. Therefore, from the Glivenko–Cantelli theorem, we have that . Combining these two results, we obtain
(A.1)
Second, note that for any satisfying , is a nondecreasing function in . Then, for , . By the Cauchy–Schwarz inequality and condition (C4),
for some . Since , the last above inequality follows from condition (C4). Therefore, we have .
By using the fact , , and (A.1), we can easily show that
(A.2)
and thus there exists an such that for , . Consequently, for , belongs to the with probability one when is large enough. Moreover, using Taylor expansion of with respect to yields
where is between and and thus be the element of for a large . Therefore, the desired uniform consistency can be derived by applying (A.2) and condition (C4) to the above display.
Lemma A.1. For any positive sequence satisfying ,
Proof. This lemma can be proved by using the results in Alexander (1984) and similar arguments from Theorem 1 of Lai and Ying (1988). Thus, the detailed derivation is omitted. It is noted that there exist and such that
and
This would be proved using the boundedness properties of and from conditions (C2) and (C3).
Theorem A.2. Under regularity conditions (C1)–(C4), weakly converges to a zero-mean Gaussian process for with covariance
Proof. From Fleming and Harrington (1991) and Gómez, Julià, and Utzet (1994), we obtain
and
Using similar empirical process arguments in the proof of Theorem 1, it can be easily seen that
where , and
where .
Now, we use for asymptotic equivalence uniformly in . It follows from standard asymptotic arguments that
where with , with and .
We claim that function classes , , and are Donsker. First, given the Lipschitz continuity of implied by condition (C3), we show that is Donsker by applying similar arguments of and using the fact that the permanence of Donsker property in Lipschitz transformation (Theorem 2.10.6 of van der Vaart and Wellner 1996). Note that and are Lipschitz in due to convexity in . The Donsker property of and then follows similarly. Therefore, from the Donsker theorem (Section 2.8.2 of van der Vaart and Wellner 1996), converges weakly to a zero-mean Gaussian process with covariance matrix .
Finally, we can write , where
From Lemma A.1, the uniform consistency of , and the fact that , we observe (I) . Note that for any , and by condition (C2). The above properties and the uniform consistencies of and to and , respectively, imply that is dominated by (I). Taylor expansion of around and the uniform consistency of for give that
where . Given that , this further implies that , where . It then follows that
(A.3)
Weak convergence of can be established, because , and are Donsker classes, and the Donsker property is preserved under addition and subtraction (Theorem 2.10.6 of van der Vaart and Wellner 1996). We have established the asymptotic results for DC data, and extending these results to PIC data is straightforward by simply adjusting the weighting scheme. By referring to (4) and (5), the transformation into in Theorems A.1 and A.2 can support the asymptotic results for PIC data.
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.
This article has earned an Open Data badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available in the Supporting Information section.
This article has earned an open data badge “Reproducible Research” for making publicly available the code necessary to reproduce the reported results. The results reported in this article could fully be reproduced.
References
Alexander, K. S. 1984. “Probability Inequalities for Empirical Processes and a Law of the Iterated Logarithm.” Annals of Probability 12, no. 4: 1041–1067.
Bogaerts, K., A. Komárek, and E. Lesaffre. 2017. Survival Analysis With Interval Censored Data: A Practical Approach With Examples in R, SAS, and BUGS. New York: Chapman and Hall/CRC.
Chiou, S. H., S. Kang, and J. Yan. 2015. “Rank-Based Estimating Equations With General Weight for Accelerated Failure Time Models: An Induced Smoothing Approach.” Statistics in Medicine 34, no. 9: 1495–1510.
Choi, S., T. Choi, H. Cho, and D. Bandyopadhyay. 2022. “Weighted Least-Squares Regression With Competing Risks Data.” Statistics in Medicine 41, no. 2: 227–241.
Choi, S., and X. Huang. 2021. “Efficient Inferences for Linear Transformation Models With Doubly Censored Data.” Communications in Statistics–Theory and Methods 50, no. 9: 2188–2200.
Choi, T., A. K. Kim, and S. Choi. 2021. “Semiparametric Least-Squares Regression With Doubly-Censored Data.” Computational Statistics & Data Analysis 164: 107306.
Choi, T., S. Park, H. Cho, and S. Choi. 2024. “Interval-Censored Linear Quantile Regression.” Journal of Computational and Graphical Statistics, in press. https://doi.org/10.1080/10618600.2024.2365740.
Cong, X. J., G. Yin, and Y. Shen. 2007. “Marginal Analysis of Correlated Failure Time Data With Informative Cluster Sizes.” Biometrics 63, no. 3: 663–672.
De Backer, M., A. E. Ghouch, and I. Van Keilegom. 2019. “An Adapted Loss Function for Censored Quantile Regression.” Journal of the American Statistical Association 114, no. 527: 1126–1137.
Gao, F., D. Zeng, and D.-Y. Lin. 2017. “Semiparametric Estimation of the Accelerated Failure Time Model With Partly Interval-Censored Data.” Biometrics 73, no. 4: 1161–1168.
Gómez, G., O. Julià, and F. Utzet. 1994. “Asymptotic Properties of the Left Kaplan-Meier Estimator.” Communications in Statistics–Theory and Methods 23, no. 1: 123–135.
Gorfine, M., Y. Goldberg, and Y. Ritov. 2017. “A Quantile Regression Model for Failure-Time Data With Time-Dependent Covariates.” Biostatistics 18, no. 1: 132–146.
Gu, M. G., and C.-H. Zhang. 1993. “Asymptotic Properties of Self-Consistent Estimators Based on Doubly Censored Data.” Annals of Statistics 21, no. 2: 611–624.
Kim, M. Y., V. G. De Gruttola, and S. W. Lagakos. 1993. “Analyzing Doubly Censored Data With Covariates, With Application to AIDS.” Biometrics 49, no. 1: 13–22.
Komárek, A., and E. Lesaffre. 2008. “Bayesian Accelerated Failure Time Model With Multivariate Doubly Interval-Censored Data and Flexible Distributional Assumptions.” Journal of the American Statistical Association 103, no. 482: 523–533.
Lai, T. L., and Z. Ying. 1988. “Stochastic Integrals of Empirical-Type Processes With Applications to Censored Regression.” Journal of Multivariate Analysis 27, no. 2: 334–358.
Li, S., T. Hu, P. Wang, and J. Sun. 2018. “A Class of Semiparametric Transformation Models for Doubly Censored Failure Time Data.” Scandinavian Journal of Statistics 45, no. 3: 682–698.
Pan, C., B. Cai, and L. Wang. 2020. “A Bayesian Approach for Analyzing Partly Interval-Censored Data Under the Proportional Hazards Model.” Statistical Methods in Medical Research 29, no. 11: 3192–3204.
Peeters, M., T. Price, A. Cervantes, et al. 2010. “Randomized Phase III Study of Panitumumab With Fluorouracil, Leucovorin, and Irinotecan (FOLFIRI) Compared With FOLFIRI Alone as Second-Line Treatment in Patients With Metastatic Colorectal Cancer.” Journal of Clinical Oncology 28, no. 31: 4706–4713.
Peng, L., and Y. Huang. 2008. “Survival Analysis With Quantile Regression Models.” Journal of the American Statistical Association 103, no. 482: 637–649.
Robins, J. M., and A. Rotnitzky. 1992. “Recovery of Information and Adjustment for Dependent Censoring Using Surrogate Markers.” In AIDS Epidemiology, 297–331. New York: Springer.
Son, M., T. Choi, S. J. Shin, Y. Jung, and S. Choi. 2022. “Regularized Linear Censored Quantile Regression.” Journal of the Korean Statistical Society 51: 589–607.
Turnbull, B. W. 1974. “Nonparametric Estimation of a Survivorship Function With Doubly Censored Data.” Journal of the American Statistical Association 69, no. 345: 169–173.
Varadhan, R., and P. Gilbert. 2010. “BB: An R Package for Solving a Large System of Nonlinear Equations and for Optimizing a High-Dimensional Nonlinear Objective Function.” Journal of Statistical Software 32: 1–26.
Wang, H. J., and L. Wang. 2009. “Locally Weighted Censored Quantile Regression.” Journal of the American Statistical Association 104, no. 487: 1117–1128.
Yang, X., N. N. Narisetty, and X. He. 2018. “A New Approach to Censored Quantile Regression Estimation.” Journal of Computational and Graphical Statistics 27, no. 2: 417–425.
Zhang, J., and D. F. Heitjan. 2006. “A Simple Local Sensitivity Analysis Tool for Nonignorable Coarsening: Application to Dependent Censoring.” Biometrics 62, no. 4: 1260–1268.
Zhou, X., Y. Feng, and X. Du. 2017. “Quantile Regression for Interval Censored Data.” Communications in Statistics–Theory and Methods 46, no. 8: 3848–3863.