Volume 66, Issue 8 e70001
RESEARCH ARTICLE
Open Access
Open Data

Inverse-Weighted Quantile Regression With Partially Interval-Censored Data

Yeji Kim

Corresponding Author

Division of Biostatistics, Department of Population Health, New York University School of Medicine, New York, New York, USA

Correspondence: Yeji Kim ([email protected]) | Sangbum Choi ([email protected])

Taehwa Choi

School of Mathematics, Statistics and Data Science, Sungshin Women's University, Seoul, South Korea

Data Science Center, Sungshin Women's University, Seoul, South Korea

Seohyeon Park

Department of Statistics, Korea University, Seoul, South Korea

Sangbum Choi

Corresponding Author

Department of Statistics, Korea University, Seoul, South Korea

Dipankar Bandyopadhyay

Department of Biostatistics, School of Public Health, Virginia Commonwealth University, Richmond, Virginia, USA

First published: 14 November 2024
Funding: This study was supported by National Research Foundation (NRF) of Korea, Korea University, National Research Foundation (NRF) of Korea, and the US National Institutes of Health (Grants RS-2024-00340298, K2201231, 2022M3J6A1063595, 2022R1A2C1008514, and R21DE031879, R01DE031134).

ABSTRACT

This paper introduces a novel approach to estimating censored quantile regression using inverse probability of censoring weighted (IPCW) methodology, specifically tailored for data sets featuring partially interval-censored data. Such data sets, often encountered in HIV/AIDS and cancer biomedical research, may include doubly censored (DC) and partly interval-censored (PIC) endpoints. DC responses involve either left-censoring or right-censoring alongside some exact failure time observations, while PIC responses are subject to interval-censoring. Despite the existence of complex estimating techniques for interval-censored quantile regression, we propose a simple and intuitive IPCW-based method, easily implementable by assigning suitable inverse-probability weights to subjects with exact failure time observations. The resulting estimator exhibits asymptotic properties, such as uniform consistency and weak convergence, and we explore an augmented-IPCW (AIPCW) approach to enhance efficiency. In addition, our method can be adapted for multivariate partially interval-censored data. Simulation studies demonstrate the new procedure's strong finite-sample performance. We illustrate the practical application of our approach through an analysis of progression-free survival endpoints in a phase III clinical trial focusing on metastatic colorectal cancer.

1 Introduction

Partially interval-censored data arise in a variety of medical registries and biomedical studies, including HIV/AIDS and cancer trials, where failure times are precisely observed for some patients but interval-censored for others (Sun 2007; Bogaerts, Komárek, and Lesaffre 2017). Of particular interest in this paper, within the context of the interval-censored setup, are doubly censored (DC) data and partly interval-censored (PIC) data. In addition to a specific number of exact observations (failures/events), DC data are characterized by either left-censoring or right-censoring, while PIC data entail additional interval-censoring. In cases with DC endpoints, the exact event status can only be determined when measurements fall within a specific range. For instance, in HIV/AIDS treatment trials, the efficacy of antiretroviral therapy is often assessed using HIV-1 RNA levels, which are deemed reliable only within a certain measurement limit. Measurements outside of this range are treated as censored, resulting in a DC structure, where the censoring mechanism is administrative or possibly random. Conversely, PIC endpoints may arise when the onset of AIDS or the time to progression-free survival (PFS) in cancer studies can be precisely observed in some patients, while others experience interval-censoring due to periodic hospital visits (Gao, Zeng, and Lin 2017; Pan, Cai, and Wang 2020). In the absence of exact failure/event times, DC and PIC data are reduced to “case-1” and “case-2” interval-censored data, respectively, which have been extensively studied in the survival analysis literature. It is worth noting that the DC sampling scheme under discussion here differs from doubly-interval-censoring (DIC, Kim, De Gruttola, and Lagakos 1993), where both the originating time and the failure time are subject to interval-censoring.

As an example, we analyze a data set derived from a phase III clinical trial focusing on metastatic colorectal cancer (mCRC, Peeters et al. 2010), which was conducted between June 2006 and March 2008 and involved a total of 1186 patients. Patients with mCRC underwent initial treatment based on their KRAS status and subsequently received either panitumumab plus fluorouracil, leucovorin, and irinotecan (FOLFIRI) or FOLFIRI alone as a secondary treatment, administered biweekly. Patients demonstrating disease progression at the first evaluation were considered left-censored. Subsequent instances of disease progression at later evaluations were classified as interval-censored. Patients who survived without disease progression at the end of the study were classified as right-censored. Deaths occurring during the study provided exact observation points. As a result, this data set exhibits a blend of DC and PIC data owing to the periodic administration of treatment. See Section 6 for a more detailed description and analysis of this data set.

A variety of statistical models and methods exist to conduct precise inference for partially interval-censored data. Under a single-sample scenario, a nonparametric distribution estimation was rigorously studied based on the self-consistent equation (Turnbull 1974; Gu and Zhang 1993). Hypothesis testing procedures have been developed to compare survival functions with partially or fully interval-censored data in two-sample cases (Pan 2000; Yuen, Shi, and Zhu 2006). For regression analysis, several authors considered a class of semiparametric transformation models using expectation-maximization (EM), or direct maximum likelihood (ML) estimation (Cai and Cheng 2004; Li et al. 2018; Choi and Huang 2021). This class, which includes proportional hazards (PH) and proportional odds (PO) models as special cases, enjoys a statistically efficient likelihood-based inferential framework that yields hazard-based probabilistic interpretation. Although asymptotically efficient, these ML estimation approaches are generally difficult to implement as they may require simultaneous estimation of regression parameters and the nonparametric hazard function. Another modeling approach to this problem is to use an accelerated failure time (AFT) model, which provides a direct evaluation of the association between the event time and covariates. In the context of partial interval-censoring, various methods have been proposed for statistical inference within the AFT modeling framework. These include the Buckley–James method (Choi, Kim, and Choi 2021; Gao, Zeng, and Lin 2017), M-estimation (Zhang and Li 1996; Ren and Gu 1997), kernel-based nonparametric ML estimation (Groeneboom and Hendrickx 2018), and Bayesian methods (Komárek and Lesaffre 2008).

This paper proposes a linear censored quantile regression (CQR) framework (Koenker 2005; Peng and Huang 2008; Wang and Wang 2009; Son et al. 2022) for partially interval-censored data. This approach continues to enjoy popularity as an alternative to classical mean-based models in both theoretical and applied statistics. While mean-based models can characterize only the central behavior of the data, CQR allows the analyst to investigate how the entire conditional distribution of the survival time depends on a set of covariates. In addition, this model can be more robust to heterogeneity, outliers, or extreme values by focusing on a few informative quantile levels. These attractive features have stimulated many investigators to study various right-CQR methods. Under interval-censoring, a weighted estimating equation approach can be used to fit quantile regression (QR) models (Frumento 2022; Choi et al. 2024). Several authors have proposed quantile estimation procedures for PIC data, drawing on the recursive weighting method (Cai and Cheng 2004; Lin, He, and Portnoy 2012) and martingale processes (Ji et al. 2012). However, these methods typically assume that censoring times, along with failure times, are known, a condition that is often impractical outside of administrative censoring scenarios. For instance, Lin, He, and Portnoy (2012) assumed knowledge of both failure time and censoring times $(T, L, R)$, while Ji et al. (2012) required at least $L$ to be known. In contrast, our method is applicable even when only $\min (\max (T, L), R)$ is known, and is hence more general.
Moreover, one can employ an adaptive quantile loss function for the analysis of case-2 interval-censored data (Zhou, Feng, and Du 2017); however, this approach may experience a significant loss of efficiency because its implementation only utilizes partial information from the quantile order-deterministic cases. A more recent work (Yang, Narisetty, and He 2018) involves parallel estimation algorithms that use data augmentation methods from imputed latent event times to fit multiple quantile estimators.

The simplest and most popular weighting scheme to adjust for right-censoring in survival analysis is the so-called inverse probability of censoring weighting (IPCW) method, which was also adapted to CQR through different versions (Bang and Tsiatis 2002; Peng and Fine 2009). In a survival or incomplete data analysis, the inverse-probability weighting method has been widely used as a simple quasi-experimental statistical approach to obtain unbiased results under observational studies. Its simplicity of use and ease of interpretation have engendered considerable research in many areas, such as competing risks (Choi, Kang, and Huang 2018; Choi et al. 2022). Our strategy involves reweighting the complete-case data based on the respective probability estimates of their occurrence when interval-censoring is present. To address DC and PIC structures, our IPCW procedure entails estimating nonparametric left-censored survival functions, employing the “backward” Kaplan–Meier (KM) estimator (Gómez, Julià, and Utzet 1994). Then, we propose a weighted quantile loss function for parameter estimation, whose estimator is shown to satisfy uniform consistency and asymptotic normality. For variance estimation, we use the induced smoothing technique (Chiou, Kang, and Yan 2015) that approximates the nonsmooth estimating equation with an asymptotically equivalent smooth estimating function. We further discuss an augmented-IPCW (AIPCW) estimation approach to gain more efficiency, and show that the proposed method can also be readily adapted to handle multivariate interval-censored data. We perform comprehensive simulation studies to assess the finite-sample performance of our approach. Furthermore, we demonstrate its practical utility by applying it to data obtained from a phase III clinical trial involving mCRC.

The rest of the paper is organized as follows. Section 2 introduces the statistical model and the proposed IPCW estimation procedure, along with asymptotic results for the proposed estimator and the induced smoothing procedure for variance estimation. Section 3 presents an augmentation-based estimator for efficiency gain, while Section 4 extends our framework to multivariate clustered partially interval-censored data. Sections 5 and 6 summarize our simulation study findings and the illustration using the phase III mCRC data set, respectively. Finally, discussion and concluding remarks are presented in Section 7. All technical details are relegated to the Web appendix. R code to implement our method is available at https://github.com/yejikim1202/ipcwqrPIC.

2 Model and Estimation

2.1 Statistical Model for DC and PIC Data

Suppose that there are $n$ random subjects. For the $i$th subject $(i=1,\ldots,n)$, let $T_i$ be a dependent variable of interest, such as log-transformed survival time, and ${\bf x}_i$ be a $p$-vector of covariates. The first element of ${\bf x}_i$ is set to 1 to include the intercept term. Our main objective is to estimate the $p$-dimensional quantile coefficient vector $\bm{\beta }_0(\tau)$ for some $\tau \in [\tau _L,\tau _R]\subset (0, 1)$ in the following linear model:
$$\begin{align} T_i = {\bf x}_i^T \bm{\beta }_0(\tau) + e_i(\tau),\quad i=1, \ldots,n, \tag{1} \end{align}$$
where $e_i(\tau)$ is the random error whose $\tau$th quantile conditional on ${\bf x}_i$ equals 0. If the quantile assumption on $e_i(\tau)$ is replaced by $E[e_i(\tau)]=0$ and log-transformed survival time is used, model (1) corresponds to the familiar AFT model (Chiou, Kang, and Yan 2015). The $\tau$th conditional quantile function of $T_i$ given ${\bf x}_i$ is defined as $Q_{T}(\tau |{\bf x}_i) = \inf \lbrace t: F(t|{\bf x}_i) \ge \tau \rbrace$, where $F(\cdot |{\bf x}_i)$ is the cumulative distribution function of $T_i$ conditional on ${\bf x}_i$. Correspondingly, model (1) amounts to assuming
$$\begin{align} Q_{T}(\tau | {\bf x}_i) = {\bf x}_i^T \bm{\beta }_0(\tau), \tag{2} \end{align}$$
which suggests a new estimation strategy that differs from conventional mean-based approaches to analyzing survival data. Unlike the traditional Cox PH and AFT models, the CQR model (1) relaxes the proportionality constraint on the hazard, and allows for modeling data heterogeneity by evaluating the covariate effects at any level of $\tau$. In this paper, we are primarily interested in the CQR modeling of (i) DC, and (ii) PIC data, which can be formulated as follows.

DC Data:

DC data arise when random censoring can occur from either the left or right side, alongside exact observations. Let $(T_i, L_i, R_i)$ denote a tuple of exact failure time, left-censoring, and right-censoring variables, respectively, with $P(L_i \le R_i)=1$. Under the DC structure, we can only observe $\lbrace (\tilde{T}_i, \delta _i, {\bf x}_i), i=1, \ldots,n\rbrace$, where $\tilde{T}_i = (T_i \wedge R_i) \vee L_i$ is the observed failure time and $\delta _i = (\delta _{1i}, \delta _{2i}, \delta _{3i})$ is the censoring indicator with $\delta _{1i}=I(L_i\le T_i\le R_i)$, $\delta _{2i}=I(T_i> R_i)$, and $\delta _{3i}=1-\delta _{1i}-\delta _{2i}$. Here, we use $a \wedge b = \min (a,b)$ and $a \vee b = \max (a,b)$. Notice that $T_i$ is observable only when $T_i\in (L_i,R_i)$, that is, $\delta _{1i}=1$; otherwise, it is right-censored $(\delta _{2i}=1)$ or left-censored $(\delta _{3i}=1)$. Because $\delta _{1i} + \delta _{2i} + \delta _{3i}=1$, the three censoring statuses are mutually exclusive. If $\delta _{1i}\equiv 0$ for all subjects (i.e., without any exact observations), DC data reduce to so-called “current status” or “case-1” interval-censored data (Groeneboom and Hendrickx 2018), in which all subjects are either left- or right-censored.
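
As a minimal sketch of the DC observation scheme just described, the following Python snippet maps a latent triple $(T_i, L_i, R_i)$ to the observed pair $(\tilde{T}_i, \delta _i)$. The function name `observe_dc` and the three example subjects are hypothetical and serve only as an illustration; they are not part of the authors' software.

```python
def observe_dc(t, l, r):
    """Map a latent (T, L, R) with L <= R to the observed DC record.

    Returns (t_obs, (d1, d2, d3)), where t_obs = max(min(T, R), L) and
    d1, d2, d3 flag an exact, right-censored, or left-censored observation.
    """
    if l > r:
        raise ValueError("DC data require L <= R")
    t_obs = max(min(t, r), l)
    d1 = int(l <= t <= r)   # exact failure time observed
    d2 = int(t > r)         # right-censored at R
    d3 = 1 - d1 - d2        # left-censored at L
    return t_obs, (d1, d2, d3)


# Three hypothetical subjects: exact, right-censored, and left-censored.
records = [
    observe_dc(2.0, 1.0, 3.0),
    observe_dc(4.0, 1.0, 3.0),
    observe_dc(0.5, 1.0, 3.0),
]
```

Exactly one of the three indicators equals 1 for every record, which mirrors the identity $\delta _{1i} + \delta _{2i} + \delta _{3i}=1$.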

PIC Data:

Unlike DC data, PIC data are a mixture of exact and interval-censored failure times, where $T_i$ can be observed only when it is not interval-censored. Suppose that $(U_i,V_i)$ is the tightest interval that might contain $T_i$, that is, $T_i\in (U_i,V_i)$ if it is interval-censored. Let $\Delta _i$ be the censoring indicator that takes 1 when $T_i$ is observed, and 0 otherwise. The PIC data can be represented as $\lbrace (\Delta _i, \Delta _i T_i, (1-\Delta _i)U_i, (1-\Delta _i)V_i, {\bf x}_i),\ i=1, \ldots,n\rbrace$. It can also be summarized as $\lbrace (\tilde{U}_i, \tilde{V}_i, \Delta _i, {\bf x}_i),\ i=1, \ldots,n\rbrace$, where $\tilde{U}_{i} = T_i \wedge U_i = \Delta _i T_i + (1-\Delta _i) U_i$ and $\tilde{V}_i = T_i \vee V_i = \Delta _i T_i + (1-\Delta _i) V_i$. Hence, $T_i$ can be right-censored at $U_i$, and left-censored at $V_i$. When $\Delta _{i}\equiv 0$ for all subjects, PIC data reduce to the conventional case-2 interval-censored data.
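
The summarized PIC representation $(\tilde{U}_i, \tilde{V}_i, \Delta _i)$ can be sketched in a few lines of Python. The helper `summarize_pic` is a hypothetical name introduced here for illustration; it branches on $\Delta _i$ rather than evaluating $\Delta _i T_i + (1-\Delta _i) U_i$ literally, so that unobserved quantities never need a placeholder value.

```python
def summarize_pic(delta, t=None, u=None, v=None):
    """Return (u_tilde, v_tilde, delta) for one PIC record.

    delta = 1: T is observed exactly, so both endpoints equal T.
    delta = 0: T is only known to lie in the interval (u, v).
    """
    if delta == 1:
        return t, t, 1
    return u, v, 0


# Hypothetical records: one exact time and one interval-censored time.
exact_record = summarize_pic(1, t=3.2)
interval_record = summarize_pic(0, u=1.0, v=4.0)
```

For an exact observation the two endpoints coincide, which is the property used later when the weights are evaluated only for subjects with $\Delta _i=1$.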

Remark 1. Note that DC data can be translated to PIC data and vice versa. For example, when $T$ is left-censored at $L$ or right-censored at $R$, it implies $T \in (U, V) \equiv (-\infty, L)$, or $T \in (U, V) \equiv (R, \infty)$, respectively. Therefore, left- and right-censoring can also be seen as interval-censoring, if we allow $U=-\infty$ and $V=\infty$. Conversely, PIC data can be taken as DC because $U$ and $V$ can be right- and left-censored by $T$. In fact, DC complements PIC in the sense that, under DC, $T$ is observable only when $T$ falls in some interval, while, under PIC, $T$ is observable only when $T$ lies outside some interval. Due to this similarity, a unified estimation approach is applicable for analyzing both DC and PIC data.

Remark 2. Throughout the paper, it is assumed that the visit process $\lbrace W_k\rbrace$ that generates the censoring structure is independent of $T$ given ${\bf x}$. To be specific, let us denote a sequence of examination times by $0 < W_{1} < \ldots < W_{K} < \infty$ that gives rise to the interval $(U,V)$ for PIC data, where $U = \max _k\lbrace W_{k}: W_{k} \le T \rbrace$ and $V = \min _k\lbrace W_{k}: W_{k} \ge T\rbrace$. Therefore, the choice of $(U,V)$ depends on $T$, although the joint distribution of $(W_{1},\ldots,W_{K})$ is independent of $T$. Conversely, it follows that $L=\min _k\lbrace W_k:W_k\ge T\rbrace$ and $R=\max _k\lbrace W_k:W_k\le T\rbrace$ for DC data. We assume that the proportion of exact observations is not negligible, and the joint distribution of $(W_{1},\ldots,W_{K})$ is independent of $T$ given ${\bf x}$ for censored subjects. We shall express this independence as $(L,R)\perp \!\!\!\perp T|{\bf x}$ for DC data and $(U,V)\perp \!\!\!\perp T|{\bf x}$ for PIC data. This implies that the paired censoring variables do not provide any additional information regarding the distribution of $T$, other than the fact that it is bracketed (Zhang and Heitjan 2006).
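
To make the bracketing in Remark 2 concrete, the sketch below computes $U = \max _k\lbrace W_{k}: W_{k} \le T \rbrace$ and $V = \min _k\lbrace W_{k}: W_{k} \ge T\rbrace$ from a sorted list of examination times. The function name `bracket` and the visit schedule are hypothetical; setting a missing endpoint to $\pm \infty$ follows the convention of Remark 1 and is an assumption of this illustration.

```python
import bisect
import math


def bracket(t, visits):
    """Return (U, V) with U = max{W_k : W_k <= t} and V = min{W_k : W_k >= t}.

    `visits` must be sorted in increasing order. If no visit lies on one
    side of t, that endpoint is set to -inf or +inf.
    """
    hi = bisect.bisect_left(visits, t)    # index of the first W_k >= t
    lo = bisect.bisect_right(visits, t)   # number of visits with W_k <= t
    u = visits[lo - 1] if lo > 0 else -math.inf
    v = visits[hi] if hi < len(visits) else math.inf
    return u, v


# Hypothetical examination times W_1 < ... < W_K.
visits = [1.0, 2.0, 3.0, 4.0]
inner = bracket(2.5, visits)
before_first = bracket(0.3, visits)
after_last = bracket(5.0, visits)
```

The interval is determined by $T$ together with the visit times, even though the visit times themselves are generated independently of $T$.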

2.2 Estimation

Without censoring, one may directly apply the standard estimating technique for QR, which locates $\bm{\beta }_0(\tau)$ as the minimizer of $n^{-1} \sum _{i=1}^n \rho _{\tau } (T_{i} - {\bf x}_i^T \bm{\beta })$, where $\rho _{\tau }(u) = u \lbrace \tau - I(u \le 0) \rbrace$ is the check loss function, or equivalently the solution to the estimating equation
$$\begin{align} n^{-1/2} \sum _{i=1}^n {\bf x}_i \lbrace I(T_i -{\bf x}_i^T \bm{\beta }\le 0) - \tau \rbrace \approx 0. \tag{3} \end{align}$$
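
As a small illustration of the uncensored case, the following sketch minimizes the average check loss in the intercept-only model, where ${\bf x}_i \equiv 1$ and the minimizer is a sample quantile. The names `check_loss` and `sample_quantile` and the toy responses are hypothetical; a full regression fit would instead use dedicated QR software.

```python
def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u <= 0))."""
    return u * (tau - (1.0 if u <= 0 else 0.0))


def sample_quantile(values, tau):
    """Minimize the average check loss over the observed values.

    In the intercept-only model the objective is piecewise linear and convex
    in the intercept, so a minimizer is always attained at a data point.
    """
    def objective(b):
        return sum(check_loss(t - b, tau) for t in values) / len(values)

    return min(sorted(values), key=objective)


# Hypothetical uncensored responses.
times = [1.0, 2.0, 3.0, 4.0, 5.0]
median = sample_quantile(times, 0.5)
lower_quartile = sample_quantile(times, 0.25)
```

Searching over data points is enough here only because there is a single parameter; with covariates the same objective is typically minimized by linear programming.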
To handle the complex interval-censoring problem, we propose to modify the estimating Equation (3) by using an IPCW technique. For the DC data type, let $S_R(t|{\bf x})=P(R\ge t|{\bf x})$ and $S_L(t|{\bf x})=P(L\ge t|{\bf x})$ be the survival functions of the right- and left-censoring variables, $R$ and $L$, respectively, given ${\bf x}$, and let $\hat{S}_R(t|{\bf x})$ and $\hat{S}_L(t|{\bf x})$ denote their consistent estimates. Similarly, we can define $S_U(t|{\bf x})$, $S_V(t|{\bf x})$, $F_V(t|{\bf x}) = 1 - S_V(t|{\bf x})$, and their estimates for PIC data. Under DC or PIC, we propose to solve the following IPCW estimating function:
$$\begin{align} {\bf U}_{n}(\bm{\beta },\tau) = n^{-1/2} \sum _{i=1}^n {\bf x}_i \lbrace \hat{w}_i I(\tilde{T}_i - {\bf x}_i^T \bm{\beta }\le 0) - \tau \rbrace \approx 0, \tag{4} \end{align}$$
where
$$\begin{align} \hat{w}_i = \begin{cases} \dfrac{ \delta _{1i}}{\hat{S}_{R}(\tilde{T}_i|{\bf x}_i) - \hat{S}_{L}(\tilde{T}_i|{\bf x}_i)}, & \mbox{for DC data,} \\[6pt] \dfrac{\Delta _{i}}{\hat{F}_{V}(\tilde{T}_i|{\bf x}_i) + \hat{S}_{U}(\tilde{T}_i|{\bf x}_i)}, & \mbox{for PIC data.} \end{cases} \tag{5} \end{align}$$
For PIC data, we may define $\tilde{T}_i=\tilde{U}_i$ or $\tilde{T}_i=\tilde{V}_i$, since calculation of $\hat{w}_i$ is needed only when $\Delta _i=1$, for which $\tilde{U}_i=\tilde{V}_i=T_i$.
Unbiasedness of the weighting schemes in (5) follows easily using a conditioning argument. For DC data, we have
$$\begin{align*} E {\left[ \dfrac{I(\tilde{T}\le t,\delta _1=1)}{S_R(\tilde{T}|{\bf x})-S_L(\tilde{T}|{\bf x})} \bigg | {\bf x}\right]} &= E {\left[ E{\left\lbrace \dfrac{ I(T\le t, L< T<R)}{S_R(T|{\bf x})-S_L(T|{\bf x})} \bigg |T,{\bf x}\right\rbrace} \bigg |{\bf x}\right]} \\ &= E {\left[ \dfrac{ I(T\le t)\lbrace S_R(T|{\bf x})-S_L(T|{\bf x})\rbrace }{S_R(T|{\bf x})-S_L(T|{\bf x})} \bigg |{\bf x}\right]} = P(T \le t |{\bf x}) \end{align*}$$
under the independence assumption between $T_i$ and $(L_i,R_i)$ given ${\bf x}$. Similarly, for PIC data, it can be seen that
$$\begin{align*} E {\left[ \dfrac{I(\tilde{T}\le t,\Delta =1)}{F_{V}(\tilde{T}|{\bf x})+ S_{U}(\tilde{T}|{\bf x}) } \bigg | {\bf x}\right]} &= E{\left[ E {\left\lbrace \dfrac{I(T\le t)\lbrace I(T\le U)+I(T>V)\rbrace }{1- S_{V}(T|{\bf x})+ S_{U}(T|{\bf x}) } \bigg |T,{\bf x}\right\rbrace} \bigg | {\bf x}\right]}\\ &= E{\left[ \dfrac{ I(T\le t)\lbrace 1- S_{V}(T|{\bf x})+ S_{U}(T|{\bf x})\rbrace }{1- S_{V}(T|{\bf x})+ S_{U}(T|{\bf x}) } \bigg | {\bf x}\right]} = P(T \le t | {\bf x}). \end{align*}$$
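
The DC identity above can be checked numerically with a small Monte Carlo sketch. Everything in it is an assumption made for illustration: there are no covariates, $T \sim \mathrm{Uniform}(0.2, 0.8)$, $L \sim \mathrm{Uniform}(0, 0.4)$, and $R \sim \mathrm{Uniform}(0.6, 1)$ are independent (so $L \le R$ always holds), and the true censoring survival functions are used in place of estimates. Under this design the weights are bounded by 2.

```python
import random


def s_left(t):
    """S_L(t) = P(L >= t) for L ~ Uniform(0, 0.4)."""
    return min(1.0, max(0.0, (0.4 - t) / 0.4))


def s_right(t):
    """S_R(t) = P(R >= t) for R ~ Uniform(0.6, 1.0)."""
    return min(1.0, max(0.0, (1.0 - t) / 0.4))


def ipcw_cdf(t0, n=20000, seed=1):
    """IPCW estimate of P(T <= t0) from simulated DC data with known weights."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        t = rng.uniform(0.2, 0.8)   # latent failure time
        l = rng.uniform(0.0, 0.4)   # left-censoring time
        r = rng.uniform(0.6, 1.0)   # right-censoring time
        if l < t < r and t <= t0:   # exact observation with T <= t0
            total += 1.0 / (s_right(t) - s_left(t))
    return total / n


estimate_mid = ipcw_cdf(0.5)   # target value P(T <= 0.5) = 0.5
estimate_all = ipcw_cdf(0.8)   # target value P(T <= 0.8) = 1.0
```

Only subjects with exact observations contribute, yet the weighted average targets the distribution function of the latent failure time, which is the property the conditioning argument establishes.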
Although Equation (4) is monotone, the exact zero-crossing of ${\bf U}_n(\bm{\beta },\tau)$ usually does not exist. Instead, it is equivalent to the gradient of the $l_1$-type convex function (Peng and Fine 2009)
$$\begin{align} Q_{n}(\bm{\beta },\tau) &= \sum _{i=1}^n \hat{w}_i \big\vert \tilde{T}_i- {\bf x}_i^T \bm{\beta }\big\vert + \Bigg\vert M^* - \sum _{j=1}^n (-\hat{w}_j{\bf x}_j)^T \bm{\beta }\Bigg\vert \nonumber\\ &\quad +\, \Bigg\vert M^*-(2\tau) \sum _{k=1}^n {\bf x}_k^T \bm{\beta }\Bigg\vert , \tag{6} \end{align}$$
where $M^*>0$ is a sufficiently large value that bounds both $|\sum _{j=1}^n (-\hat{w}_j{\bf x}_j)^T\bm{\beta }|$ and $|(2\tau)\sum _{k=1}^n {\bf x}_k^T\bm{\beta }|$ from above for any $\bm{\beta }$ in the compact parameter space $\mathbb {B}$ for $\bm{\beta }_0(\tau)$. Minimization of (6) can be easily implemented using standard software for $l_1$-type optimization, or the rq() function in the R package quantreg (Koenker 2005). Therefore, we define the proposed IPCW estimator as $\hat{\bm{\beta }}(\tau) = \arg \min _{\bm{\beta }\in \mathbb {B}} Q_{n}(\bm{\beta },\tau)$.
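
In the intercept-only special case (${\bf x}_i \equiv 1$), the weighted estimating function (4) changes sign at the smallest $b$ for which $\sum _{i} \hat{w}_i I(\tilde{T}_i \le b) \ge n\tau$, so the estimate can be found by sorting. The sketch below illustrates only this special case; it is not the $l_1$ minimization of (6), and the function name `ipcw_quantile`, the toy times, and the toy weights are hypothetical.

```python
def ipcw_quantile(times, weights, tau, n):
    """Smallest b with sum_i w_i * I(T_i <= b) >= n * tau (intercept-only model).

    `times` and `weights` describe the subjects with exact observations
    (censored subjects carry weight 0 and can be omitted); `n` is the full
    sample size, including censored subjects.
    """
    cumulative = 0.0
    for t, w in sorted(zip(times, weights)):
        cumulative += w
        if cumulative >= n * tau:
            return t
    return None   # the weighted sum never reaches n * tau


# Hypothetical exact observations and IPCW weights from a sample of size 5.
exact_times = [1.0, 3.0, 5.0]
weights = [2.0, 1.0, 2.0]
weighted_median = ipcw_quantile(exact_times, weights, 0.5, 5)
```

A return value of `None` signals that the weighted mass of exact observations is too small to identify the requested quantile level in this toy setting.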
In the rest of the paper, we will focus on DC data for ease of presentation, since very similar techniques can be employed to analyze PIC endpoints. To solve Equation (4) with DC data, we need $\hat{S}_R(t|{\bf x})$ and $\hat{S}_L(t|{\bf x})$, some reasonable estimates of $S_R(t|{\bf x})$ and $S_L(t|{\bf x})$, which can be obtained via various methods. For example, if the censoring mechanism depends on a set of discrete covariates, they can be estimated nonparametrically within each data stratum defined by the values of these discrete covariates. In the case that the underlying censoring mechanism involves continuous covariates, we might assume some parametric or semiparametric methods, such as Cox models. See Remark 3 in the following for available nonparametric approaches. In the sequel, we assume (for simplicity) unconditional independence between $T_i$ and $(L_i,R_i)$, such that $\hat{S}_R(t|{\bf x})$ and $\hat{S}_L(t|{\bf x})$ may be replaced by simple KM-type estimators, $\hat{S}_R(t)$ and $\hat{S}_L(t)$, respectively. Then, the IPCW estimating Equation (4) for DC data is given by
$$\begin{align} {\bf U}_{n}(\bm{\beta },\tau) = n^{-1/2} \sum _{i=1}^n {\bf x}_i {\left\lbrace \dfrac{ \delta _{1i}}{\hat{S}_{R}(\tilde{T}_i)-\hat{S}_{L}(\tilde{T}_i)} I(\tilde{T}_i -{\bf x}_i^T\bm{\beta }\le 0)-\tau \right\rbrace} \approx 0. \tag{7} \end{align}$$

The calculation of the KM estimator for the right-censored survival function is straightforward, that is, $\hat{S}_R(t) = \prod _{u<t} \lbrace 1 - dN^R(u)/Y(u)\rbrace$, where $N^R(u) = \sum _{i=1}^n N_i^R(u) = \sum _{i=1}^n I(\tilde{T}_i\le u,\delta _{2i}=1)$ and $Y(u) = \sum _{i=1}^n Y_i(u) = \sum _{i=1}^n I(\tilde{T}_i\ge u)$. However, the nonparametric estimation of the left-censored survivor function is not so simple, and a number of approaches have been proposed (Gómez, Julià, and Utzet 1994). The most cited and intuitive approach is to use the “backward” KM estimator, that is, to transform left-censored data into right-censored data by multiplying each datum by $-1$, and then to apply the KM method. On the original scale, the estimator of $S_L(t)$ is then given by $\hat{S}_L(t)=1-\hat{S}_{\it KM}(-t)$, where $\hat{S}_{\it KM}(-t)$ denotes a KM estimate based on the left-censored data multiplied by $-1$. More specifically, $\hat{S}_L(t) = 1-\prod _{u>t}\lbrace 1-dN^L(u)/(n+1-Y(u))\rbrace$, where $N^L(u) = \sum _{i=1}^n N_i^L(u) = \sum _{i=1}^n I(\tilde{T}_i\ge u,\delta _{3i}=1)$. Notice that the conventional KM method can be used to estimate $S_V(t)$ with PIC data, whereas the backward KM method should be applied for $S_U(t)$.
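
The two product formulas and the resulting DC weights can be sketched directly. The code below is a literal, unoptimized transcription that assumes no tied observation times (so each censoring event contributes a jump of size 1); the function names and the five-subject toy data set are hypothetical and are not taken from the authors' R code.

```python
def km_right(t, times, d2):
    """S_R(t) = prod_{u < t} {1 - dN^R(u) / Y(u)} with Y(u) = #{T_i >= u}."""
    surv = 1.0
    for u, is_right in zip(times, d2):
        if is_right and u < t:
            at_risk = sum(1 for s in times if s >= u)
            surv *= 1.0 - 1.0 / at_risk
    return surv


def km_left(t, times, d3):
    """S_L(t) = 1 - prod_{u > t} {1 - dN^L(u) / (n + 1 - Y(u))}."""
    n = len(times)
    prod = 1.0
    for u, is_left in zip(times, d3):
        if is_left and u > t:
            at_risk = n + 1 - sum(1 for s in times if s >= u)
            prod *= 1.0 - 1.0 / at_risk
    return 1.0 - prod


def dc_weights(times, d1, d2, d3):
    """IPCW weights delta_1i / {S_R(T_i) - S_L(T_i)}; censored subjects get 0."""
    weights = []
    for t, exact in zip(times, d1):
        if exact:
            weights.append(1.0 / (km_right(t, times, d2) - km_left(t, times, d3)))
        else:
            weights.append(0.0)
    return weights


# Hypothetical DC sample: subject 2 is left-censored, subject 4 right-censored.
toy_times = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_d1 = [1, 0, 1, 0, 1]
toy_d2 = [0, 0, 0, 1, 0]
toy_d3 = [0, 1, 0, 0, 0]
toy_weights = dc_weights(toy_times, toy_d1, toy_d2, toy_d3)
```

In this toy sample the weights of the three exact observations add up to the full sample size, which shows how exact observations are upweighted to stand in for censored ones. A production implementation would also need to guard against a zero denominator.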

Remark 3. In the case that the visit process generating $(L_i,R_i)$ is independent of $T_i$ given ${\bf x}_i$, one might use Beran's local KM estimator (Beran 1981), that is,

$$\begin{align} \hat{S}_R(t|{\bf x}) = \prod _{j=1}^n {\left\lbrace 1-\dfrac{B_{nj}({\bf x})}{\sum _{k=1}^n I(\tilde{T}_k \ge \tilde{T}_j)B_{nk}({\bf x})} \right\rbrace} ^{I(\tilde{T}_j \le t,\delta _{2j}=1)} \tag{8} \end{align}$$
and
$$\begin{align*} \hat{S}_L(t|{\bf x}) = 1-\prod _{j=1}^n {\left\lbrace 1-\dfrac{B_{nj}({\bf x})}{\sum _{k=1}^n I(\tilde{T}_k \le \tilde{T}_j)B_{nk}({\bf x})} \right\rbrace} ^{I(\tilde{T}_j\ge t,\delta _{3j}=1)}, \end{align*}$$
where $B_{nj}({\bf x})$ is a sequence of nonnegative weights adding up to 1. For example, we can employ the commonly used Nadaraya–Watson-type weight, that is, $B_{nj}({\bf x}) = K \left(\frac{{\bf x}-{\bf x}_j}{h_n} \right) /\sum _{k=1}^n K \left(\frac{ {\bf x}- {\bf x}_k}{h_n} \right)$, where $K(\cdot)$ is a kernel density function and $h_n \in \mathbb {R}^+$ is the bandwidth converging to zero as $n \rightarrow \infty$. By plugging these local KM estimators into the estimating function (7), we can obtain a nonparametric covariate-adjusted IPCW estimator. Another viable alternative is to employ random forest approaches for nonparametric survival prediction (Ishwaran et al. 2008). This recursive partitioning method is effective, computationally feasible, and accommodates the dependence of censoring on covariates, even in higher dimensions.
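
A minimal sketch of the local KM estimator (8) with Nadaraya–Watson weights follows. It assumes a single scalar covariate and a Gaussian kernel, both of which are illustrative choices rather than requirements of the method; the function names, the toy data, and the bandwidths are hypothetical.

```python
import math


def nw_weights(x, xs, h):
    """Nadaraya-Watson weights B_nj(x) with a Gaussian kernel and bandwidth h."""
    k = [math.exp(-0.5 * ((x - xj) / h) ** 2) for xj in xs]
    total = sum(k)
    return [kj / total for kj in k]


def beran_right(t, x, times, d2, xs, h):
    """Local KM estimate of S_R(t | x) following the product form in (8)."""
    b = nw_weights(x, xs, h)
    surv = 1.0
    for tj, is_right, bj in zip(times, d2, b):
        if is_right and tj <= t:
            # Kernel-weighted size of the risk set {k : T_k >= T_j}.
            denom = sum(bk for tk, bk in zip(times, b) if tk >= tj)
            surv *= 1.0 - bj / denom
    return surv


# Hypothetical sample with one right-censored subject (the fourth).
obs_times = [1.0, 2.0, 3.0, 4.0, 5.0]
obs_d2 = [0, 0, 0, 1, 0]
covs = [0.1, 0.4, 0.5, 0.7, 0.9]

global_fit = beran_right(4.5, 0.5, obs_times, obs_d2, covs, h=1e6)
local_fit = beran_right(4.5, 0.7, obs_times, obs_d2, covs, h=0.2)
```

With a very large bandwidth all weights are nearly equal and the estimate reduces to the unconditional product-limit form, whereas a small bandwidth concentrates the risk set on subjects whose covariate values are close to the target ${\bf x}$.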

2.3 Asymptotic Results

This section provides asymptotic results of the proposed IPCW estimator for DC endpoints. Denote the Euclidean norm by $\Vert \cdot \Vert$, and let ${\bf a}^{\otimes 2} = {\bf aa}^T$ for a vector ${\bf a}$. We first impose the following regularity conditions:
  • (C1) The joint distribution function of $(L, R)$ is continuous. There exists $u\in (0,\infty)$ such that $P(R-L > u |{\bf x}) =1$. There also exist $-\infty < v_1 \le v_2 \le v<\infty$ such that $P(v_1 < L\le v_2|{\bf x}) = 1$ and $P(R\le v | {\bf x}) = 1$.
  • (C2) The covariate ${\bf x}$ is uniformly bounded, that is, $\sup _i \Vert {\bf x}_i \Vert <\infty$.
  • (C3) (i) The quantile coefficient $\bm{\beta }_0(\tau)$ is Lipschitz continuous for $\tau \in [\tau _L, \tau _R] \subset (0,1)$; (ii) $f(t|{\bf x})$ is bounded above uniformly in $t$ and ${\bf x}$, where $f(t|{\bf x}) = dF(t|{\bf x})/dt$.
  • (C4) For some $\rho _0>0$ and $c_0 > 0$, $\inf _{\bm{\beta }\in \mathbb {B}(\rho _0)}\text{eigmin} \, {\bf A}\lbrace \bm{\beta }(\tau)\rbrace \ge c_0$, where $\mathbb {B}(\rho) = \lbrace \bm{\beta }\in \mathbb {R}^{p}:\inf _{\tau \in [\tau _L,\tau _R]} \Vert \bm{\beta }(\tau)-\bm{\beta }_0(\tau)\Vert \le \rho \rbrace$ and ${\bf A}\lbrace \bm{\beta }(\tau)\rbrace = E[{\bf x}^{\otimes 2} f({\bf x}^T\bm{\beta }|{\bf x})]$. Here, $\text{eigmin}(\cdot)$ denotes the minimum eigenvalue of a matrix.

Note that condition (C1) simplifies theoretical arguments and is satisfied in many clinical settings with administrative censoring. Conditions (C2) and (C3) are typical assumptions in many QR methods for the boundedness of covariates, the smoothness of coefficient processes, and the uniform boundedness of the density function $f(\cdot)$. Condition (C4) is imposed so that the asymptotic limit of $Q_{n}(\bm{\beta },\tau)$ is strictly convex in a neighborhood of $\bm{\beta }_0(\tau)$ for $\tau \in [\tau _L,\tau _R]$. This condition implies that ${\bf U}_{n}(\bm{\beta },\tau)$ at any $\bm{\beta }(\tau)$ other than $\bm{\beta }_0(\tau)$ is far from its minimum as $n$ goes to infinity. Thus, it ensures not only the identifiability of $\bm{\beta }_0(\tau)$ but also the consistency of $\hat{\bm{\beta }}(\tau)$. In addition, condition (C4) holds when $E({\bf x}^{\otimes 2})$ is positive-definite and $\inf _{\bm{\beta }\in \mathbb {B}(\rho _0),{\bf x}}f({\bf x}^T\bm{\beta }|{\bf x})$ is bounded below by a positive constant. We then claim the consistency of $\hat{\bm{\beta }}(\tau)$ in Theorem 1.

Theorem 1. Under regularity conditions (C1)–(C4), $\lim _{n\rightarrow \infty } \sup _{\tau \in [\tau _L,\tau _R]} \Vert \hat{\bm{\beta }}(\tau) - \bm{\beta }_0(\tau)\Vert \rightarrow _p 0$, assuming model (2) holds for $\tau \in [\tau _L,\tau _R]$.

To study the asymptotic normality of the proposed estimators, we use the counting process and associated martingale theory (Fleming and Harrington 1991). Based on the natural filtration $\mathcal {F}^R_t=\sigma \lbrace N^R_i(u),Y_i(u); u\le t, i=1, \ldots,n\rbrace $, we define $M_i^R(t)= N_i^R(t)- \int _{-\infty }^t Y_i(u)\lambda ^R(u)du$, where $\lambda ^R(t)=\lim _{h\rightarrow 0}P(t\le R< t+h|R\ge t)/h$. Likewise, we define the reversed filtration $\mathcal {F}^L_t=\sigma \lbrace N^L_i(u),Y_i(u); u\ge t, i=1, \ldots,n\rbrace $ and the martingale process $M_i^L(t)= N_i^L(t)- \int _t^\infty (1-Y_i(u)) \lambda ^L(u)du$, where $\lambda ^L(t)=\lim _{h\rightarrow 0}P(t-h<L\le t|L\le t)/h$. The definitions of $\mathcal {F}^L_t$ and $\lambda ^L(t)$ are somewhat hypothetical due to their dependence on future information, but standard martingale theory may still be used by reading the data backward in time. The following theorem states the asymptotic normality of $\hat{\bm{\beta }}(\tau)$.

Theorem 2. Under regularity conditions (C1)–(C4), $n^{1/2} \lbrace \hat{\bm{\beta }}(\tau) - \bm{\beta }_0(\tau)\rbrace $ converges weakly to a zero-mean Gaussian process for $\tau,\tau ^{\prime }\in [\tau _L,\tau _R]$, with covariance

$$\begin{equation*} \Psi (\tau,\tau ^{\prime }) = {\bf A}\lbrace \bm{\beta }_0(\tau)\rbrace ^{-1} E\lbrace \bm{\xi }_1(\tau) \bm{\xi }_1(\tau ^{\prime })^T \rbrace ({\bf A}\lbrace \bm{\beta }_0(\tau ^{\prime })\rbrace ^{-1})^T, \end{equation*}$$
where the expression for $\bm{\xi }_i(\tau)$ is given in the Appendix.

Detailed proofs of Theorems 1 and 2 and associated lemma are relegated to the Appendix.

2.4 Variance Estimation via Induced Smoothing

For variance estimation, we may use an induced smoothing approach (Chiou, Kang, and Yan 2015; Choi, Kang, and Huang 2018) by approximating the nonsmooth estimating equation in (7) with an asymptotically equivalent smoothed function. Let ${\bf z}$ be an $N(0,I_p)$ random vector independent of the data, where $I_p$ denotes the $p\times p$ identity matrix, and let $\Sigma$ be a $p\times p$ symmetric, positive-definite matrix with $\Vert \Sigma \Vert =O(n^{-1})$. Let $\Phi (\cdot)$ and $\phi (\cdot)$ be the cumulative distribution and density functions of the standard normal distribution. Since $\hat{\bm{\beta }}\approx \bm{\beta }_0+\Sigma ^{1/2} {\bf z}$ with $\Sigma =n^{-1}\Psi$, which is implied by Theorem 2, we can approximate ${\bf U}_n(\bm{\beta },\tau)$ by $\tilde{{\bf U}}_n(\bm{\beta },\Sigma,\tau)=E_Z[{\bf U}_n(\bm{\beta }+\Sigma ^{1/2}{\bf z},\tau)]$, which gives
$$\begin{equation} \tilde{{\bf U}}_{n}(\bm{\beta },\Sigma,\tau) = n^{-1} \sum _{i=1}^{n} {\bf x}_i \Bigg \lbrace \hat{w}_i \Phi {\left(-\dfrac{\tilde{T}_i - {\bf x}_i^T\bm{\beta }}{\sqrt {{\bf x}_i^T\Sigma {\bf x}_i}} \right)} - \tau \Bigg \rbrace . \end{equation}$$ (9)
The partial derivative of this formulation can be explicitly expressed as
$$\begin{equation*} \tilde{A}_n(\bm{\beta },\Sigma,\tau) = \dfrac{\partial \tilde{{\bf U}}_n(\bm{\beta },\Sigma,\tau)}{\partial \bm{\beta }} \approx n^{-1} \sum _{i=1}^n \phi {\left(-\dfrac{\tilde{T}_i - {\bf x}_i^T\bm{\beta }}{\sqrt {{\bf x}_i^T\Sigma {\bf x}_i}} \right)} \dfrac{{\bf x}_i{\bf x}_i^T}{\sqrt {{\bf x}_i^T\Sigma {\bf x}_i}}. \end{equation*}$$
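To make the smoothed estimating function (9) and its derivative concrete, here is a small Python sketch that evaluates both for given data. It assumes the IPCW weights $\hat{w}_i$ have already been computed and are passed in as `w`; the function names, the list-based data layout, and the toy inputs are hypothetical and not part of the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # supplies Phi (cdf) and phi (pdf)


def _scale(xi, Sigma):
    """sqrt(x_i^T Sigma x_i) for a covariate vector and a p x p matrix."""
    p = len(xi)
    return math.sqrt(sum(xi[a] * Sigma[a][b] * xi[b]
                         for a in range(p) for b in range(p)))


def _z(Ti, xi, beta, Sigma):
    """Standardized argument -(T_i - x_i^T beta) / sqrt(x_i^T Sigma x_i)."""
    fit = sum(b * x for b, x in zip(beta, xi))
    return -(Ti - fit) / _scale(xi, Sigma)


def smoothed_score(beta, Sigma, tau, T, X, w):
    """Smoothed estimating function of Equation (9), returned as a p-vector."""
    n, p = len(T), len(beta)
    U = [0.0] * p
    for Ti, xi, wi in zip(T, X, w):
        resid = wi * STD_NORMAL.cdf(_z(Ti, xi, beta, Sigma)) - tau
        for a in range(p):
            U[a] += xi[a] * resid / n
    return U


def smoothed_slope(beta, Sigma, T, X):
    """Approximate derivative of the smoothed score, a p x p matrix."""
    n, p = len(T), len(beta)
    A = [[0.0] * p for _ in range(p)]
    for Ti, xi in zip(T, X):
        k = STD_NORMAL.pdf(_z(Ti, xi, beta, Sigma)) / _scale(xi, Sigma)
        for a in range(p):
            for b in range(p):
                A[a][b] += k * xi[a] * xi[b] / n
    return A


# Toy, uncensored example (all weights equal to 1, intercept only).
T = [1.0, 2.0, 3.0, 4.0]
X = [[1.0]] * 4
w = [1.0] * 4
U_small = smoothed_score([2.5], [[1e-8]], 0.5, T, X, w)
A_toy = smoothed_slope([2.5], [[0.25]], T, X)
```

With a very small $\Sigma$, $\Phi$ behaves like the indicator $I(\tilde{T}_i \le {\bf x}_i^T\bm{\beta })$, so the smoothed score essentially reproduces the nonsmooth one; a larger $\Sigma$ yields a differentiable function with a positive slope.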
Moreover, $\Gamma (\tau)\equiv \lim _{n\rightarrow \infty }\text{var}\lbrace n^{1/2}{\bf U}_n(\bm{\beta }_0,\tau)\rbrace$ can be approximated by
$$\begin{align*} \hat{\Gamma }_n(\hat{\bm{\beta }},\tau) & = n^{-1} \sum _{i=1}^n {\bf x}_i^{\otimes 2} {\left[\hat{w}_i I(\tilde{T}_i\le {\bf x}_i^T\hat{\bm{\beta }}) -\tau \right]}^2 \\ &\quad -\,n^{-1}\int _{-\infty }^\infty \frac{dN^R(u)}{Y^2(u)} \lbrace \hat{B}^R(\hat{\bm{\beta }},u)\rbrace ^{\otimes 2}\\ &\quad +\, n^{-1}\int _{-\infty }^\infty \frac{dN^L(u)}{(n+1-Y(u))^2} \lbrace \hat{B}^L(\hat{\bm{\beta }},u)\rbrace ^{\otimes 2}, \end{align*}$$
where $\hat{B}^R(\bm{\beta },u) = \sum _{i=1}^n \hat{w}_i Y_i(u) {\bf x}_i I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta })$ and $\hat{B}^L(\bm{\beta },u) = \sum _{i=1}^n \hat{w}_{i} \lbrace 1-Y_i(u)\rbrace {\bf x}_i I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta })$. The inferential procedure with induced smoothing proceeds iteratively as follows:
  • Step 1. Let $\tilde{\bm{\beta }}_{(0)}=\hat{\bm{\beta }}$ and $\tilde{\Sigma }_{(0)} = n^{-1} I_p$.
  • Step 2. Given $\tilde{\bm{\beta }}_{(k-1)}$ and $\tilde{\Sigma }_{(k-1)}$ from the $(k-1)$th step, update $\tilde{\bm{\beta }}_{(k)}$ and $\tilde{\Sigma }_{(k)}$ as
    $$\begin{align*} \tilde{\bm{\beta }}_{(k)} &\leftarrow \tilde{\bm{\beta }}_{(k-1)} - \lbrace \tilde{A}_n (\tilde{\bm{\beta }}_{(k-1)},\tilde{\Sigma }_{(k-1)},\tau)\rbrace ^{-1} \tilde{{\bf U}}_n (\tilde{\bm{\beta }}_{(k-1)},\tilde{\Sigma }_{(k-1)},\tau), \\ \tilde{\Sigma }_{(k)} &\leftarrow n^{-1}\lbrace \tilde{A}_n(\tilde{\bm{\beta }}_{(k-1)},\tilde{\Sigma }_{(k-1)},\tau)\rbrace ^{-1} \hat{\Gamma }_n (\tilde{\bm{\beta }}_{(k-1)},\tau) \lbrace \tilde{A}_n(\tilde{\bm{\beta }}_{(k-1)},\tilde{\Sigma }_{(k-1)},\tau)\rbrace ^{-1}. \end{align*}$$
  • Step 3. Set $k\leftarrow k+1$ and repeat Step 2 until convergence.

Let $\tilde{\bm{\beta }}$ and $\tilde{\Sigma }$ denote the smoothed estimators at convergence. Note that the variance estimator $\tilde{\Psi }=n\tilde{\Sigma }$ is obtained as a byproduct of this iterative procedure. Since $\tilde{\Psi }$ is consistent for $\Psi$ and $\Vert \hat{\bm{\beta }}-\tilde{\bm{\beta }}\Vert =O_p(n^{-1/2})$, $\tilde{\Psi }$ may be used as a variance estimator for $\hat{\bm{\beta }}$. In practice, the induced smoothing procedure converges very quickly, and our simulation results confirm that variance estimates are fairly accurate and stable.
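The iterative scheme in Steps 1 to 3 can be sketched in Python for the simplest case of a scalar parameter. This is a simplified illustration and not the authors' implementation: it assumes a single nonzero covariate $x_i$ (for example, an intercept), takes the weights $\hat{w}_i$ as given, and, as a loudly stated simplification, keeps only the first term of $\hat{\Gamma }_n$, omitting the two censoring-correction integrals. The function name `induced_smoothing` and the toy data are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def induced_smoothing(T, x, w, tau, beta_hat, max_iter=100, tol=1e-10):
    """Scalar-parameter sketch of Steps 1-3 (simplified Gamma, see lead-in)."""
    n = len(T)
    beta, Sigma = beta_hat, 1.0 / n            # Step 1: initial values
    for _ in range(max_iter):                  # Steps 2-3: iterate updates
        U = A = Gamma = 0.0
        for Ti, xi, wi in zip(T, x, w):
            s = abs(xi) * math.sqrt(Sigma)     # sqrt(x_i^T Sigma x_i), p = 1
            z = -(Ti - xi * beta) / s
            U += xi * (wi * STD.cdf(z) - tau) / n
            A += STD.pdf(z) * xi * xi / s / n
            # Simplification: first term of Gamma-hat only (no censoring terms).
            Gamma += xi * xi * (wi * (Ti <= xi * beta) - tau) ** 2 / n
        new_beta = beta - U / A                # Newton-type update of beta
        new_Sigma = Gamma / (A * A) / n        # sandwich update of Sigma
        converged = abs(new_beta - beta) + abs(new_Sigma - Sigma) < tol
        beta, Sigma = new_beta, new_Sigma
        if converged:
            break
    return beta, Sigma


# Toy uncensored data: standard normal quantiles on a regular grid.
n = 200
T = [STD.inv_cdf((i + 0.5) / n) for i in range(n)]
beta_s, Sigma_s = induced_smoothing(T, [1.0] * n, [1.0] * n,
                                    tau=0.5, beta_hat=0.05)
Psi_s = n * Sigma_s  # variance estimate obtained as a byproduct
```

In this uncensored toy setting the target is the median of a standard normal sample, so `Psi_s` should land near the classical value $\tau (1-\tau)/f(0)^2=\pi /2$, which offers a quick sanity check on the sandwich update.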

3 Augmentation-Based Estimation

The proposed IPCW estimator is generally statistically inefficient because Equation (7) involves only expressions with noncensored data. The only information obtained from the censored observations is in estimating $S_L$ and $S_R$. This section suggests an AIPCW estimator based on both censored and uncensored data. To implement the AIPCW approach, we need two procedures: (i) positing a working statistical model for $(T,{\bf x})$ and (ii) defining two expectations, $Q_1(t, \bm{\beta }, H_i) = E [ {\bf x}_i \lbrace I (T_i -{\bf x}_i^T\bm{\beta }\le 0)-\tau \rbrace \nobreakspace | \nobreakspace T_i \ge t, H_i]$ and $Q_2(t, \bm{\beta }, H_i) = E [ {\bf x}_i \lbrace I (T_i -{\bf x}_i^T\bm{\beta }\le 0)-\tau \rbrace \nobreakspace | \nobreakspace T_i \le t, H_i]$, where $H_i=(\tilde{T}_i, \delta _i, {\bf x}_i)$ denotes the observed data. Let $\lbrace p(h;\psi); \psi \in \mathbb {R}^q\rbrace$ be a posited model for the distribution of $H_i$, and let $\hat{\psi }$ be the ML estimator for this model, with $\psi ^*$ as its limit, satisfying $n^{1/2}(\hat{\psi }-\psi ^*)=O_p(1)$. Then, following Robins and Rotnitzky (1992), we define the AIPCW estimator $\hat{\bm{\beta }}^*(\tau)$ as the solution to the augmented estimating equation
$$\begin{equation} \begin{split} {\bf U}^{*}_n(\bm{\beta },\tau) =\, & n^{-1} \sum _{i=1}^n {\left[ {\bf x}_i {\left\lbrace \dfrac{ \delta _{1i}}{\hat{S}_{R}(\tilde{T}_i)-\hat{S}_{L}(\tilde{T}_i)} I (\tilde{T}_i -{\bf x}_i^T\bm{\beta }\le 0)-\tau \right\rbrace} \right.} \\ & + \int _{-\infty }^{\infty } \hat{Q}_1 (t, \hat{\bm{\beta }}, \hat{\psi }, H_i) \dfrac{d\hat{M}^R_{i}(t)}{ \hat{S}_{R}(t)} \\ &{\left. + \int _{-\infty }^{\infty } \hat{Q}_2 (t, \hat{\bm{\beta }}, \hat{\psi }, H_i) \dfrac{d\hat{M}^L_{i}(t)}{ \hat{S}_L(t)} \right]}, \end{split} \end{equation}$$ (10)
where $\hat{M}_i^R(t)= N_i^R(t)- \int _{-\infty }^t Y_i(u)\hat{\lambda }^R(u)du$ and $\hat{M}_i^L(t)= N_i^L(t)- \int _t^\infty (1-Y_i(u)) \hat{\lambda }^L(u)du$, with $\hat{\lambda }^R(t)=dN^R(t)/Y(t)$ and $\hat{\lambda }^L(t)=dN^L(t)/(n+1-Y(t))$. Here, $\hat{Q}_k(t,\bm{\beta },\psi,H_i)$ denotes a working model for $Q_k(t,\bm{\beta },H_i)$, $k=1,2$.

The advantages of the AIPCW estimator are generally twofold. First, the estimator is consistent (see the proof of Theorem 3, presented in Section B of the Supporting Information) when either the censoring distribution does not depend on the covariates or the posited model for $(T,{\bf x})$ is correct. For this reason, this estimator is often referred to as a doubly robust (DR) estimator. Second, when the aforementioned conditions for double robustness are met, the AIPCW estimator $\hat{\bm{\beta }}^*(\tau)$ can have a smaller asymptotic variance than $\hat{\bm{\beta }}(\tau)$. To solve the above augmented estimating equation, we may use the dfsane() function in the R package BB (Varadhan and Gilbert 2010), which is a derivative-free spectral solver for nonlinear systems of equations. Bootstrapping would be a natural choice for estimating the standard errors, but it is computationally expensive for the AIPCW estimator. As before, we therefore employ the induced smoothing method for statistical inference.

4 Extension to Multivariate DC Data

We further extend our CQR method to multivariate clustered DC data. Suppose that there are $n$ clusters with the $i$th cluster having $c_i$ members, and that the $k$th subject of the $i$th cluster $(k=1,2,\ldots,c_i;\ i=1,2,\ldots,n)$ can distinctly experience an event of interest that is subject to double-censoring. It is assumed that $c_i$ is relatively small compared to $n$. For the $k$th member of the $i$th cluster, let $(T_{ik},L_{ik},R_{ik})$ be the failure, left-censoring, and right-censoring time variables, in that order, and let ${\bf x}_{ik}$ be the corresponding $p$-vector of covariates. As before, it is assumed that the visit process that generates $(L_{ik},R_{ik})$ is independent of $T_{ik}$ and ${\bf x}_{ik}$. The observed data consist of $\lbrace (\tilde{T}_{ik},\delta _{ik},{\bf x}_{ik}),k=1,\ldots,c_i;i=1,\ldots,n\rbrace$, where $\tilde{T}_{ik}=(T_{ik}\wedge R_{ik})\vee L_{ik}$ and $\delta _{ik}=(\delta _{1ik},\delta _{2ik},\delta _{3ik})$ is the censoring indicator with $\delta _{1ik}=I(L_{ik}\le T_{ik}\le R_{ik})$, $\delta _{2ik}=I(T_{ik}> R_{ik})$, and $\delta _{3ik}=1-\delta _{1ik}-\delta _{2ik}$.

Suppose that the marginal regression model satisfies
$$\begin{equation} T_{ik}={\bf x}_{ik}^T\bm{\beta }(\tau)+e_{ik}(\tau),\nobreakspace \nobreakspace k=1,\ldots,c_i, i=1,\ldots,n, \end{equation}$$ (11)
where $\bm{\beta }(\tau)$ is a $p$-vector of unknown regression parameters common to all $n$ clusters. Under the working independence assumption, we may obtain the estimator $\hat{\bm{\beta }}^\dagger (\tau)$ for $\bm{\beta }(\tau)$ by solving the following weighted estimating function:
$$\begin{align} {\bf U}_{n}^\dagger (\bm{\beta },\tau) = n^{-1} \sum _{i=1}^n \eta _i \sum _{k=1}^{c_i} {\bf x}_{ik} {\left\lbrace \hat{w}_{ik} I(\tilde{T}_{ik} -{\bf x}_{ik}^T\bm{\beta }\le 0)-\tau \right\rbrace}, \end{align}$$ (12)
where $\hat{w}_{ik}= \delta _{1ik}/ \lbrace \tilde{S}_{R}(\tilde{T}_{ik})-\tilde{S}_{L}(\tilde{T}_{ik})\rbrace$ and $\eta _i$ is a known weight to calibrate for the possible informativeness of cluster sizes (Cong, Yin, and Shen 2007; Wang and Zhao 2008).

For the marginal analysis of clustered survival data, we conventionally use $\eta _i=1$, which tends to overweight the large clusters because each individual observation contributes equally to the estimating equation. When cluster sizes are informative to the outcome of interest, we can incorporate the inverse of cluster sizes as a weight in the estimating function, letting, for example, $\eta _i=1/c_i^\alpha$ for some $0\le \alpha \le 1$, which is also known to improve the efficiency of the resulting estimator (Wang and Zhao 2008). By assuming common censoring distributions independent of covariates, we may pool data across clusters and use the KM method to estimate $S_R(t)=P(R_{ik}\ge t)$ and $S_L(t)=P(L_{ik}\ge t)$, because finite cluster sizes preclude consistent cluster-specific estimation of the censoring distributions. We estimate $S_R(t)$ with $\tilde{S}_R(t)=\prod _{u<t}\lbrace 1-d\tilde{N}^R(u)/\tilde{Y}(u)\rbrace$, where $\tilde{N}^R(u)=\sum _{i=1}^n\eta _i \sum _{k=1}^{c_i}\tilde{N}_{ik}^R(u)=\sum _{i=1}^n \sum _{k=1}^{c_i} \eta _iI(\tilde{T}_{ik}\le u,\delta _{2ik}=1)$ and $\tilde{Y}(u)=\sum _{i=1}^n \eta _i \sum _{k=1}^{c_i}\tilde{Y}_{ik}(u) =\sum _{i=1}^n \sum _{k=1}^{c_i} \eta _i I(\tilde{T}_{ik}\ge u)$, and $S_L(t)$ with $\tilde{S}_L(t)=1-\prod _{u>t}\lbrace 1-d\tilde{N}^L(u)/(N+1-\tilde{Y}(u))\rbrace$, where $\tilde{N}^L(u)=\sum _{i=1}^n\eta _i \sum _{k=1}^{c_i}\tilde{N}_{ik}^L(u)=\sum _{i=1}^n \sum _{k=1}^{c_i} \eta _iI(\tilde{T}_{ik}\ge u,\delta _{3ik}=1)$ and $N=\sum _{i=1}^n c_i$.
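The cluster-weighted Kaplan–Meier estimator $\tilde{S}_R(t)$ can be sketched as follows. This is a minimal illustration under stated assumptions: the data are flattened to one entry per cluster member, the helper names `member_weights` and `weighted_km_right` are hypothetical, and the analogous reverse-time estimator for $\tilde{S}_L$ is not shown.

```python
def member_weights(cluster_sizes, alpha=1.0):
    """Repeat eta_i = 1 / c_i**alpha once for each member of cluster i."""
    return [1.0 / c ** alpha for c in cluster_sizes for _ in range(c)]


def weighted_km_right(times, delta2, eta):
    """Weighted KM estimate of S_R(t) = P(R >= t) from pooled cluster data.

    times  : observed times, one per cluster member
    delta2 : 1 if the member is right-censored, else 0
    eta    : per-member weights (eta_i repeated within cluster i)
    Returns a function S with S(t) = prod_{u < t} {1 - dN^R(u) / Y(u)}.
    """
    jump_times = sorted({t for t, d in zip(times, delta2) if d == 1})
    factors = []
    for u in jump_times:
        # Weighted number of right-censoring "events" at time u.
        dN = sum(e for t, d, e in zip(times, delta2, eta) if d == 1 and t == u)
        # Weighted size of the risk set at time u.
        Y = sum(e for t, e in zip(times, eta) if t >= u)
        factors.append((u, 1.0 - dN / Y))

    def S(t):
        out = 1.0
        for u, f in factors:
            if u < t:
                out *= f
        return out

    return S


# Toy data: four members, the second and fourth are right-censored.
times = [1.0, 2.0, 3.0, 4.0]
delta2 = [0, 1, 0, 1]
S_unadjusted = weighted_km_right(times, delta2, [1.0] * 4)
# One cluster of size 3 and one of size 1, with eta_i = 1 / c_i.
S_adjusted = weighted_km_right(times, delta2, member_weights([3, 1]))
```

With $\eta _i=1$ the function reduces to the ordinary Kaplan–Meier estimator applied to the censoring times, while $\eta _i=1/c_i$ downweights members of the larger cluster in both the jump sizes and the risk set.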

For variance estimation, we again use the induced smoothing approach. Let $(\tilde{\bm{\beta }}^\dagger,\tilde{\Sigma }^\dagger)$ be the solution to
$$\begin{align*} \tilde{{\bf U}}_{n}^\dagger (\bm{\beta },\Sigma,\tau) = n^{-1} \sum _{i=1}^n \eta _i \sum _{k=1}^{c_i} {\bf x}_{ik} {\left\lbrace \hat{w}_{ik}\Phi {\left(-\dfrac{\tilde{T}_{ik} - {\bf x}_{ik}^T\bm{\beta }}{\sqrt {{\bf x}_{ik}^T\Sigma {\bf x}_{ik}}} \right)} - \tau \right\rbrace} \end{align*}$$
at convergence. Then, the variance–covariance matrix of the limiting normal distribution of $n^{1/2}(\tilde{\bm{\beta }}^\dagger -\bm{\beta }_0)$ can be approximated by $(\tilde{A}_{n}^\dagger)^{-1}\hat{\Gamma }_{n}^\dagger (\tilde{A}_{n}^\dagger)^{-1}$ evaluated at $(\tilde{\bm{\beta }}^\dagger,\tilde{\Sigma }^\dagger)$, where
$$\begin{align*} \tilde{A}_{n}^\dagger (\bm{\beta },\Sigma,\tau) = \dfrac{\partial \tilde{{\bf U}}_{n}^\dagger (\bm{\beta },\Sigma,\tau)}{\partial \bm{\beta }} \approx n^{-1} \sum _{i=1}^n \eta _i \sum _{k=1}^{c_i} \phi {\left(-\dfrac{\tilde{T}_{ik} - {\bf x}_{ik}^T\bm{\beta }}{\sqrt {{\bf x}_{ik}^T\Sigma {\bf x}_{ik}}} \right)} \dfrac{{\bf x}_{ik}{\bf x}_{ik}^T}{\sqrt {{\bf x}_{ik}^T\Sigma {\bf x}_{ik}}} \end{align*}$$
and
$$\begin{align*} \begin{split} \hat{\Gamma }_n^\dagger (\bm{\beta },\tau) =&\, n^{-1} \sum _{i=1}^n \eta _i \sum _{k=1}^{c_i} {\bf x}_{ik}^{\otimes 2} {\left[ \hat{w}_{ik}I(\tilde{T}_{ik}\le {\bf x}_{ik}^T\bm{\beta }) -\tau \right]}^2\\ &-n^{-1}\int _{-\infty }^\infty \frac{d\tilde{N}^R(u)}{\tilde{Y}^2(u)} \lbrace \tilde{B}^{R}(\bm{\beta },u)\rbrace ^{\otimes 2}\\ & +\, n^{-1}\int _{-\infty }^\infty \frac{d\tilde{N}^L(u)}{(N+1-\tilde{Y}(u))^2} \lbrace \tilde{B}^{L}(\bm{\beta },u)\rbrace ^{\otimes 2} \end{split} \end{align*}$$
with $\tilde{B}^{R}(\bm{\beta },u) = \sum _{i=1}^n \sum _{k=1}^{c_i} \eta _i \hat{w}_{ik} Y_{ik}(u) {\bf x}_{ik} I(\tilde{T}_{ik}\le {\bf x}_{ik}^T\bm{\beta })$ and $\tilde{B}^{L}(\bm{\beta },u) = \sum _{i=1}^n\sum _{k=1}^{c_i} \eta _i \hat{w}_{ik} \lbrace 1-Y_{ik}(u)\rbrace {\bf x}_{ik} I(\tilde{T}_{ik}\le {\bf x}_{ik}^T\bm{\beta })$. The iterative procedure described in Section 2.4 can also be used to approximate the variance of $\hat{\bm{\beta }}^\dagger$.

5 Simulation Results

5.1 Univariate Partially Interval-Censored Data

This section presents extensive simulation results under various partial interval-censoring scenarios to evaluate the finite-sample properties of the proposed IPCW and AIPCW estimators. All simulations here involve two covariates, ${\bf x}= (x_1,x_2)$, where $x_1 \sim U(-0.7, 1.5)$ and $x_2 \sim \text{Bernoulli}(0.5)$. The data-generating model is
$$\begin{equation*} T = 10 + \beta _1(\tau) x_1 + \beta _2(\tau)x_2 + \sigma ({\bf x})(e(\tau)-q(\tau)), \end{equation*}$$
where $\bm{\beta }_0(\tau)=(\beta _{10}(\tau),\beta _{20}(\tau))^T=(1,1)^T$ and $\sigma ({\bf x})=0.8-0.1x_2$. The error term $e(\tau)$ follows a standard normal, N(0,1), or extreme-value, EV(0,1), distribution and is centered by its quantile $q(\tau)$ at level $\tau =0.3$ or 0.5, satisfying $P(e(\tau) < q(\tau))=\tau$. To create DC data, the left- and right-censoring variables are generated as $L \sim 10+U(-4.2, c_L)$ and $R \sim L + U(4.1, c_R)$, respectively, where the two constants $(c_L, c_R)$ are varied to yield the desired rates of exact, left-censored, and right-censored observations, approximately $(75\%, 12.5\%, 12.5\%)$ and $(65\%, 17.5\%, 17.5\%)$. To generate PIC data, the censoring time $C$ is first simulated from $e^{C}\sim \text{Uniform}(30,50)$. For each subject, a sequence of $K$ examination times $(W_{1},\ldots,W_{K})$ is generated as $e^{W_{k}} = e^{W_{k-1}}+\text{Exp}(1)$, where $K>0$ is the largest integer that satisfies $-\infty \equiv W_0<W_1<\cdots <W_K\le C$. The interval $(U,V)$ that contains $T$ is defined by $U=\max _k\lbrace W_{k}: W_{k} \le T\rbrace$ and $V=\min _k\lbrace W_{k}: W_{k} \ge T\rbrace$. To mix exact and interval-censored data, we generate $\Delta \in \lbrace 0,1\rbrace$ from $P(\Delta =1|{\bf x})=p_0-0.1I(x_{1}<0.8)$, where $p_0\in (0.1,1)$ is set to yield approximately 65% or 75% exact observations of the failure time data, as before.
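The DC data-generating mechanism described above can be sketched in Python as follows. This is an illustrative sketch only: it covers the normal-error case, uses the rule $\tilde{T}=(T\wedge R)\vee L$ for the observed time, and the values passed for `c_L` and `c_R` are hypothetical placeholders, since the constants used in the paper to reach the target censoring rates are not reported in this section.

```python
import random
from statistics import NormalDist


def simulate_dc(n, tau, c_L, c_R, seed=0):
    """Generate doubly censored data with N(0,1) errors.

    Returns a list of (T_obs, (d1, d2, d3), (x1, x2)) tuples, where
    d1 = exact, d2 = right-censored, d3 = left-censored.
    c_L and c_R are tuning constants (hypothetical values in the example).
    """
    rng = random.Random(seed)
    q = NormalDist().inv_cdf(tau)           # tau-th quantile of the error
    data = []
    for _ in range(n):
        x1 = rng.uniform(-0.7, 1.5)
        x2 = 1 if rng.random() < 0.5 else 0
        sigma = 0.8 - 0.1 * x2
        # True coefficients are (1, 1); the error is centered at its quantile.
        T = 10.0 + x1 + x2 + sigma * (rng.gauss(0.0, 1.0) - q)
        L = 10.0 + rng.uniform(-4.2, c_L)   # left-censoring time
        R = L + rng.uniform(4.1, c_R)       # right-censoring time, R > L
        T_obs = max(min(T, R), L)
        d1 = int(L <= T <= R)
        d2 = int(T > R)
        d3 = 1 - d1 - d2
        data.append((T_obs, (d1, d2, d3), (x1, x2)))
    return data


# Hypothetical constants, chosen only to produce a mix of censoring types.
sample = simulate_dc(500, tau=0.3, c_L=-0.5, c_R=6.0, seed=1)
exact_rate = sum(d[0] for _, d, _ in sample) / len(sample)
```

In practice one would tune `c_L` and `c_R` until `exact_rate` and the two censoring rates match the targeted proportions.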
The predicted log survival time may be negative, in which case the corresponding estimated survival time is still strictly positive. Tables 1–4 demonstrate that the observed biases are predominantly negative, implying that our procedure tends to slightly underestimate the regression parameters. However, this tendency is not severe and becomes negligible as the sample size increases. In fact, this pattern is quite common with IPCW-based methods under nonparametric estimation of right-censored data.
TABLE 1. Simulation results summarizing the finite-sample properties of the proposed IPCW QR estimator $\hat{\bm{\beta }}(\tau)$ at $\tau =0.3$ and 0.5, under univariate DC and PIC data, with errors distributed as N(0,1) or EV(0,1), where the proportion of exact failure times observed is 65% or 75%. Here, Par = parameters, Bias = empirical bias, SSE = sampling standard error, ASE = average of standard errors, and CP = 95% coverage probability.

| Data | Error | Exact (%) | $\tau$ | Par | Bias ($n=200$) | SSE ($n=200$) | ASE ($n=200$) | CP ($n=200$) | Bias ($n=400$) | SSE ($n=400$) | ASE ($n=400$) | CP ($n=400$) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DC | N(0,1) | 75% | 0.3 | $\beta _1$ | −0.003 | 0.132 | 0.142 | 0.956 | −0.009 | 0.093 | 0.099 | 0.947 |
| | | | | $\beta _2$ | −0.001 | 0.173 | 0.181 | 0.953 | −0.002 | 0.115 | 0.127 | 0.959 |
| | | | 0.5 | $\beta _1$ | −0.003 | 0.117 | 0.145 | 0.967 | −0.006 | 0.085 | 0.101 | 0.965 |
| | | | | $\beta _2$ | −0.001 | 0.159 | 0.184 | 0.967 | 0.000 | 0.107 | 0.129 | 0.979 |
| | | 65% | 0.3 | $\beta _1$ | −0.005 | 0.140 | 0.153 | 0.959 | −0.013 | 0.101 | 0.108 | 0.948 |
| | | | | $\beta _2$ | −0.003 | 0.188 | 0.194 | 0.957 | −0.008 | 0.121 | 0.137 | 0.959 |
| | | | 0.5 | $\beta _1$ | −0.004 | 0.123 | 0.157 | 0.978 | −0.008 | 0.088 | 0.109 | 0.974 |
| | | | | $\beta _2$ | 0.000 | 0.163 | 0.198 | 0.974 | −0.004 | 0.112 | 0.138 | 0.985 |
| | EV(0,1) | 75% | 0.3 | $\beta _1$ | −0.003 | 0.132 | 0.137 | 0.938 | −0.002 | 0.087 | 0.098 | 0.956 |
| | | | | $\beta _2$ | −0.001 | 0.162 | 0.175 | 0.951 | −0.001 | 0.116 | 0.125 | 0.953 |
| | | | 0.5 | $\beta _1$ | −0.006 | 0.143 | 0.165 | 0.965 | −0.001 | 0.097 | 0.117 | 0.972 |
| | | | | $\beta _2$ | 0.004 | 0.173 | 0.210 | 0.977 | 0.000 | 0.128 | 0.149 | 0.967 |
| | | 65% | 0.3 | $\beta _1$ | −0.010 | 0.140 | 0.149 | 0.946 | −0.008 | 0.093 | 0.106 | 0.964 |
| | | | | $\beta _2$ | −0.006 | 0.171 | 0.189 | 0.957 | −0.007 | 0.127 | 0.135 | 0.955 |
| | | | 0.5 | $\beta _1$ | −0.011 | 0.155 | 0.182 | 0.972 | −0.005 | 0.106 | 0.130 | 0.972 |
| | | | | $\beta _2$ | −0.002 | 0.183 | 0.232 | 0.976 | −0.002 | 0.138 | 0.164 | 0.961 |
| PIC | N(0,1) | 75% | 0.3 | $\beta _1$ | 0.027 | 0.121 | 0.134 | 0.958 | 0.023 | 0.090 | 0.093 | 0.938 |
| | | | | $\beta _2$ | −0.040 | 0.158 | 0.166 | 0.945 | −0.036 | 0.113 | 0.117 | 0.940 |
| | | | 0.5 | $\beta _1$ | 0.057 | 0.127 | 0.142 | 0.948 | 0.050 | 0.092 | 0.099 | 0.936 |
| | | | | $\beta _2$ | −0.047 | 0.168 | 0.176 | 0.946 | −0.042 | 0.117 | 0.123 | 0.944 |
| | | 65% | 0.3 | $\beta _1$ | 0.027 | 0.133 | 0.143 | 0.948 | 0.020 | 0.092 | 0.100 | 0.935 |
| | | | | $\beta _2$ | −0.049 | 0.163 | 0.177 | 0.953 | −0.046 | 0.119 | 0.125 | 0.946 |
| | | | 0.5 | $\beta _1$ | 0.060 | 0.146 | 0.157 | 0.943 | 0.052 | 0.101 | 0.107 | 0.933 |
| | | | | $\beta _2$ | −0.054 | 0.178 | 0.192 | 0.945 | −0.050 | 0.128 | 0.132 | 0.937 |
| | EV(0,1) | 75% | 0.3 | $\beta _1$ | 0.024 | 0.123 | 0.131 | 0.948 | 0.027 | 0.086 | 0.092 | 0.941 |
| | | | | $\beta _2$ | −0.039 | 0.152 | 0.163 | 0.954 | −0.041 | 0.107 | 0.115 | 0.940 |
| | | | 0.5 | $\beta _1$ | 0.059 | 0.159 | 0.167 | 0.948 | 0.056 | 0.106 | 0.115 | 0.939 |
| | | | | $\beta _2$ | −0.055 | 0.190 | 0.208 | 0.961 | −0.057 | 0.140 | 0.144 | 0.934 |
| | | 65% | 0.3 | $\beta _1$ | 0.026 | 0.137 | 0.147 | 0.953 | 0.029 | 0.099 | 0.104 | 0.943 |
| | | | | $\beta _2$ | −0.046 | 0.169 | 0.183 | 0.948 | −0.055 | 0.121 | 0.129 | 0.930 |
| | | | 0.5 | $\beta _1$ | 0.068 | 0.180 | 0.198 | 0.958 | 0.068 | 0.125 | 0.133 | 0.931 |
| | | | | $\beta _2$ | −0.067 | 0.226 | 0.244 | 0.957 | −0.082 | 0.157 | 0.165 | 0.943 |
TABLE 2. Simulation results comparing the finite-sample properties of the IPCW estimator ($\hat{\bm{\beta }}$) to the augmented IPCW estimator ($\hat{\bm{\beta }}^*$) at $\tau =0.3$ and 0.5, under univariate DC data with errors distributed as N(0,1) or EV(0,1), where the proportion of exact failure times is 65% or 75%. Here, Par = parameters, Bias = empirical bias, SSE = sampling standard error, MSE = mean-squared error, and RE = relative efficiency.

| Error | Exact (%) | $\tau$ | Par | Bias (IPCW) | SSE (IPCW) | MSE (IPCW) | Bias (AIPCW) | SSE (AIPCW) | MSE (AIPCW) | RE |
|---|---|---|---|---|---|---|---|---|---|---|
| N(0,1) | 75% | 0.3 | $\beta _1$ | −0.003 | 0.132 | 0.017 | −0.003 | 0.130 | 0.017 | 0.970 |
| | | | $\beta _2$ | −0.001 | 0.173 | 0.030 | 0.000 | 0.171 | 0.029 | 0.977 |
| | | 0.5 | $\beta _1$ | −0.003 | 0.117 | 0.014 | −0.003 | 0.117 | 0.014 | 1.000 |
| | | | $\beta _2$ | −0.001 | 0.159 | 0.025 | −0.001 | 0.157 | 0.025 | 0.975 |
| | 65% | 0.3 | $\beta _1$ | −0.005 | 0.140 | 0.020 | −0.005 | 0.139 | 0.019 | 0.986 |
| | | | $\beta _2$ | −0.003 | 0.188 | 0.035 | −0.002 | 0.185 | 0.034 | 0.968 |
| | | 0.5 | $\beta _1$ | −0.004 | 0.123 | 0.015 | −0.005 | 0.122 | 0.015 | 0.984 |
| | | | $\beta _2$ | 0.000 | 0.163 | 0.027 | 0.000 | 0.162 | 0.026 | 0.988 |
| EV(0,1) | 75% | 0.3 | $\beta _1$ | −0.003 | 0.132 | 0.017 | −0.002 | 0.132 | 0.017 | 1.000 |
| | | | $\beta _2$ | −0.001 | 0.162 | 0.026 | −0.002 | 0.160 | 0.026 | 0.976 |
| | | 0.5 | $\beta _1$ | −0.006 | 0.143 | 0.020 | −0.005 | 0.141 | 0.020 | 0.972 |
| | | | $\beta _2$ | 0.004 | 0.173 | 0.030 | 0.002 | 0.173 | 0.030 | 1.000 |
| | 65% | 0.3 | $\beta _1$ | −0.010 | 0.140 | 0.020 | −0.010 | 0.139 | 0.019 | 0.986 |
| | | | $\beta _2$ | −0.006 | 0.171 | 0.029 | −0.008 | 0.170 | 0.029 | 0.989 |
| | | 0.5 | $\beta _1$ | −0.011 | 0.155 | 0.024 | −0.011 | 0.153 | 0.024 | 0.974 |
| | | | $\beta _2$ | −0.002 | 0.183 | 0.033 | −0.002 | 0.184 | 0.034 | 1.011 |
TABLE 3. Simulation results comparing the finite-sample properties of the proposed IPCW QR estimators for univariate DC data at $\tau =0.3$ and 0.5, where Beran's (1981) local Kaplan–Meier (“IPCW-KM”) and survival random forests (“IPCW-RF”) methods are used to approximate the left- and right-censoring distributions given covariates.

| $n$ | Error | Exact (%) | $\tau$ | Par | Bias (IPCW-KM) | SSE (IPCW-KM) | MSE (IPCW-KM) | Bias (IPCW-RF) | SSE (IPCW-RF) | MSE (IPCW-RF) | RE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 200 | N(0,1) | 75% | 0.3 | $\beta _1$ | 0.004 | 0.131 | 0.017 | −0.045 | 0.142 | 0.022 | 1.292 |
| | | | | $\beta _2$ | −0.060 | 0.170 | 0.033 | −0.030 | 0.181 | 0.034 | 1.036 |
| | | | 0.5 | $\beta _1$ | −0.004 | 0.120 | 0.014 | −0.049 | 0.124 | 0.018 | 1.233 |
| | | | | $\beta _2$ | −0.076 | 0.160 | 0.031 | −0.031 | 0.164 | 0.028 | 0.888 |
| | | 65% | 0.3 | $\beta _1$ | 0.014 | 0.142 | 0.020 | −0.039 | 0.154 | 0.025 | 1.240 |
| | | | | $\beta _2$ | −0.067 | 0.185 | 0.039 | −0.027 | 0.195 | 0.039 | 1.001 |
| | | | 0.5 | $\beta _1$ | 0.002 | 0.126 | 0.016 | −0.060 | 0.131 | 0.021 | 1.307 |
| | | | | $\beta _2$ | −0.073 | 0.170 | 0.034 | −0.038 | 0.172 | 0.031 | 0.906 |
| | EV(0,1) | 75% | 0.3 | $\beta _1$ | 0.000 | 0.126 | 0.016 | −0.035 | 0.131 | 0.018 | 1.158 |
| | | | | $\beta _2$ | −0.056 | 0.154 | 0.027 | −0.026 | 0.156 | 0.025 | 0.931 |
| | | | 0.5 | $\beta _1$ | −0.004 | 0.141 | 0.020 | −0.057 | 0.144 | 0.024 | 1.205 |
| | | | | $\beta _2$ | −0.082 | 0.172 | 0.036 | −0.039 | 0.167 | 0.029 | 0.810 |
| | | 65% | 0.3 | $\beta _1$ | 0.005 | 0.138 | 0.019 | −0.045 | 0.143 | 0.022 | 1.179 |
| | | | | $\beta _2$ | −0.073 | 0.167 | 0.033 | −0.034 | 0.169 | 0.030 | 0.895 |
| | | | 0.5 | $\beta _1$ | −0.001 | 0.151 | 0.023 | −0.064 | 0.156 | 0.028 | 1.247 |
| | | | | $\beta _2$ | −0.082 | 0.182 | 0.040 | −0.047 | 0.179 | 0.034 | 0.860 |
| 400 | N(0,1) | 75% | 0.3 | $\beta _1$ | 0.007 | 0.093 | 0.009 | −0.093 | 0.112 | 0.021 | 2.437 |
| | | | | $\beta _2$ | −0.053 | 0.113 | 0.016 | −0.055 | 0.124 | 0.018 | 1.181 |
| | | | 0.5 | $\beta _1$ | 0.002 | 0.086 | 0.007 | −0.082 | 0.096 | 0.016 | 2.154 |
| | | | | $\beta _2$ | −0.069 | 0.112 | 0.017 | −0.034 | 0.116 | 0.015 | 0.844 |
| | | 65% | 0.3 | $\beta _1$ | 0.014 | 0.102 | 0.011 | −0.083 | 0.115 | 0.020 | 1.898 |
| | | | | $\beta _2$ | −0.064 | 0.120 | 0.018 | −0.049 | 0.131 | 0.020 | 1.058 |
| | | | 0.5 | $\beta _1$ | 0.009 | 0.091 | 0.008 | −0.110 | 0.097 | 0.022 | 2.572 |
| | | | | $\beta _2$ | −0.067 | 0.117 | 0.018 | −0.071 | 0.118 | 0.019 | 1.043 |
| | EV(0,1) | 75% | 0.3 | $\beta _1$ | 0.008 | 0.084 | 0.007 | −0.062 | 0.095 | 0.013 | 1.807 |
| | | | | $\beta _2$ | −0.052 | 0.112 | 0.015 | −0.028 | 0.118 | 0.015 | 0.965 |
| | | | 0.5 | $\beta _1$ | 0.007 | 0.098 | 0.010 | −0.082 | 0.094 | 0.016 | 1.612 |
| | | | | $\beta _2$ | −0.074 | 0.130 | 0.022 | −0.039 | 0.123 | 0.017 | 0.744 |
| | | 65% | 0.3 | $\beta _1$ | 0.013 | 0.089 | 0.008 | −0.079 | 0.099 | 0.016 | 1.983 |
| | | | | $\beta _2$ | −0.066 | 0.121 | 0.019 | −0.046 | 0.122 | 0.017 | 0.895 |
| | | | 0.5 | $\beta _1$ | 0.014 | 0.101 | 0.010 | −0.099 | 0.100 | 0.020 | 1.904 |
| | | | | $\beta _2$ | −0.075 | 0.140 | 0.025 | −0.067 | 0.130 | 0.021 | 0.848 |
TABLE 4. Simulation results comparing the finite-sample properties of the IPCW estimator, corresponding to the unadjusted (weight $\eta _i = 1$) and adjusted (weight $\eta _i=1/c_i$) methods for multivariate DC data at $\tau =0.3$ and 0.5, where the numbers of clusters are 50 and 100.

| Cluster | Error | Exact (%) | $\tau$ | Par | Bias ($\eta _i=1$) | SSE ($\eta _i=1$) | ASE ($\eta _i=1$) | CP ($\eta _i=1$) | Bias ($\eta _i=1/c_i$) | SSE ($\eta _i=1/c_i$) | ASE ($\eta _i=1/c_i$) | CP ($\eta _i=1/c_i$) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $n = 50$ | N(0,1) | 75% | 0.3 | $\beta _1$ | −0.008 | 0.221 | 0.241 | 0.955 | −0.001 | 0.237 | 0.239 | 0.936 |
| | | | | $\beta _2$ | −0.034 | 0.271 | 0.301 | 0.962 | −0.025 | 0.294 | 0.297 | 0.938 |
| | | | 0.5 | $\beta _1$ | −0.011 | 0.261 | 0.282 | 0.960 | −0.002 | 0.287 | 0.279 | 0.939 |
| | | | | $\beta _2$ | −0.040 | 0.324 | 0.350 | 0.973 | −0.031 | 0.343 | 0.345 | 0.957 |
| | | 65% | 0.3 | $\beta _1$ | −0.014 | 0.246 | 0.261 | 0.945 | −0.006 | 0.262 | 0.260 | 0.940 |
| | | | | $\beta _2$ | −0.036 | 0.297 | 0.326 | 0.965 | −0.028 | 0.319 | 0.323 | 0.941 |
| | | | 0.5 | $\beta _1$ | −0.012 | 0.300 | 0.319 | 0.961 | 0.000 | 0.324 | 0.317 | 0.947 |
| | | | | $\beta _2$ | −0.046 | 0.368 | 0.396 | 0.967 | −0.034 | 0.392 | 0.392 | 0.955 |
| | EV(0,1) | 75% | 0.3 | $\beta _1$ | −0.023 | 0.232 | 0.255 | 0.955 | −0.016 | 0.246 | 0.252 | 0.945 |
| | | | | $\beta _2$ | −0.011 | 0.309 | 0.321 | 0.948 | −0.004 | 0.325 | 0.317 | 0.934 |
| | | | 0.5 | $\beta _1$ | −0.033 | 0.299 | 0.323 | 0.950 | −0.024 | 0.320 | 0.319 | 0.938 |
| | | | | $\beta _2$ | −0.013 | 0.380 | 0.406 | 0.965 | −0.003 | 0.399 | 0.397 | 0.946 |
| | | 65% | 0.3 | $\beta _1$ | −0.024 | 0.252 | 0.273 | 0.953 | −0.018 | 0.267 | 0.270 | 0.937 |
| | | | | $\beta _2$ | −0.014 | 0.335 | 0.344 | 0.946 | −0.003 | 0.356 | 0.339 | 0.932 |
| | | | 0.5 | $\beta _1$ | −0.032 | 0.341 | 0.374 | 0.943 | −0.021 | 0.361 | 0.371 | 0.942 |
| | | | | $\beta _2$ | −0.019 | 0.438 | 0.469 | 0.968 | −0.009 | 0.462 | 0.461 | 0.957 |
| $n = 100$ | N(0,1) | 75% | 0.3 | $\beta _1$ | −0.015 | 0.149 | 0.168 | 0.956 | −0.008 | 0.162 | 0.167 | 0.943 |
| | | | | $\beta _2$ | −0.020 | 0.200 | 0.211 | 0.954 | −0.016 | 0.210 | 0.210 | 0.940 |
| | | | 0.5 | $\beta _1$ | −0.015 | 0.173 | 0.194 | 0.977 | −0.005 | 0.185 | 0.193 | 0.952 |
| | | | | $\beta _2$ | −0.026 | 0.227 | 0.244 | 0.965 | −0.021 | 0.241 | 0.242 | 0.945 |
| | | 65% | 0.3 | $\beta _1$ | −0.021 | 0.163 | 0.182 | 0.964 | −0.014 | 0.179 | 0.181 | 0.946 |
| | | | | $\beta _2$ | −0.025 | 0.221 | 0.229 | 0.956 | −0.017 | 0.233 | 0.227 | 0.937 |
| | | | 0.5 | $\beta _1$ | −0.017 | 0.197 | 0.218 | 0.969 | −0.010 | 0.212 | 0.218 | 0.949 |
| | | | | $\beta _2$ | −0.029 | 0.264 | 0.275 | 0.965 | −0.020 | 0.275 | 0.273 | 0.956 |
| | EV(0,1) | 75% | 0.3 | $\beta _1$ | −0.019 | 0.164 | 0.178 | 0.959 | −0.016 | 0.179 | 0.178 | 0.946 |
| | | | | $\beta _2$ | −0.014 | 0.216 | 0.226 | 0.958 | −0.010 | 0.229 | 0.224 | 0.941 |
| | | | 0.5 | $\beta _1$ | −0.019 | 0.206 | 0.226 | 0.961 | −0.016 | 0.221 | 0.225 | 0.941 |
| | | | | $\beta _2$ | −0.022 | 0.267 | 0.283 | 0.954 | −0.016 | 0.280 | 0.280 | 0.941 |
| | | 65% | 0.3 | $\beta _1$ | −0.023 | 0.174 | 0.191 | 0.957 | −0.018 | 0.191 | 0.190 | 0.936 |
| | | | | $\beta _2$ | −0.018 | 0.230 | 0.242 | 0.961 | −0.015 | 0.246 | 0.239 | 0.930 |
| | | | 0.5 | $\beta _1$ | −0.021 | 0.238 | 0.259 | 0.961 | −0.015 | 0.257 | 0.257 | 0.947 |
| | | | | $\beta _2$ | −0.019 | 0.315 | 0.326 | 0.954 | −0.012 | 0.333 | 0.323 | 0.931 |

The simulation results for DC and PIC are summarized in Table 1, which includes empirical bias (Bias), sampling standard error (SSE), an average of standard error estimates (ASE), and coverage probabilities (CP) of the $95\%$ confidence intervals for $\hat{\bm{\beta }}$, based on 1000 random data sets with sample sizes $n = 200$ and 400. Overall, the proposed estimator is unbiased, and the standard error estimates from induced smoothing are close to their empirical estimates. The empirical CPs agree well with the nominal level approximated by the normal distribution. The estimated standard errors are slightly larger than the sampling errors, but their gaps appear to decrease as the sample size increases. Next, the performance of the IPCW and AIPCW estimators is compared for DC data. In addition to Bias and SSE, Table 2 presents the mean-squared error (MSE) of IPCW ($\hat{\bm{\beta }}$) and AIPCW ($\hat{\bm{\beta }}^*$), along with their relative efficiency (RE), defined as $\text{MSE}(\hat{\bm{\beta }}^*)/\text{MSE}(\hat{\bm{\beta }})$. We observe that $\hat{\bm{\beta }}^*$ is unbiased and significantly more efficient than $\hat{\bm{\beta }}$. The efficiency gain in this setting is meaningful, though modest, and could potentially increase with the availability of further time-dependent or longitudinal information (Gorfine, Goldberg, and Ritov 2017).
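
The summary measures reported in the tables can be computed from Monte Carlo replicates along the following lines. This is a minimal Python sketch rather than the authors' code: the function names `summarize` and `relative_efficiency` and their arguments are hypothetical, and the coverage check assumes Wald-type intervals based on the normal approximation.

```python
import statistics
from statistics import NormalDist


def summarize(estimates, std_errors, truth, level=0.95):
    """Summarize Monte Carlo replicates of a single coefficient.

    estimates  : point estimates, one per simulated data set
    std_errors : estimated standard errors, one per simulated data set
    truth      : true coefficient value used to generate the data
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    covered = [abs(b - truth) <= z * s for b, s in zip(estimates, std_errors)]
    return {
        "Bias": statistics.fmean(estimates) - truth,        # empirical bias
        "SSE": statistics.stdev(estimates),                  # sampling standard error
        "ASE": statistics.fmean(std_errors),                 # average estimated SE
        "CP": sum(covered) / len(covered),                   # coverage probability
        "MSE": statistics.fmean([(b - truth) ** 2 for b in estimates]),
    }


def relative_efficiency(mse_aipcw, mse_ipcw):
    """RE as defined in the text: MSE of AIPCW divided by MSE of IPCW."""
    return mse_aipcw / mse_ipcw
```

Under this definition, an RE below 1 indicates that the AIPCW estimator has the smaller MSE.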

Table 3 reports additional simulation results under univariate DC data when the censoring distributions also involve covariates. We let $L \sim 10 + U(-4.2, c_L)$ and $R \sim L + (1-0.2x_1-0.2x_2) \times U(4.1, c_R)$, while other simulation configurations remain the same as before. To account for this covariate-conditional censoring situation, we apply local KM (Beran 1981) and survival random forests (Ishwaran et al. 2008) methods, as mentioned in Remark 2, to approximate $S_L(\cdot |{\bf x})$ and $S_R(\cdot |{\bf x})$. The corresponding estimators are referred to as IPCW-KM and IPCW-RF, respectively. Overall, both estimators produce virtually unbiased results that are robust to the effect of covariates on censoring distributions. Table 3 also presents RE, defined as the ratio of the MSE of IPCW-RF to that of IPCW-KM. In the present setting, it seems that the IPCW-KM estimator is slightly more efficient than the IPCW-RF estimator. However, if the censoring distributions involve many covariates such that nonparametric kernel-smoothing is not feasible, the random forests method would be a more viable and reliable alternative.
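
To make the local KM idea concrete, the following Python sketch computes a kernel-weighted Kaplan–Meier estimate of a conditional survival function at a single covariate value. It is an illustration under stated assumptions, not the implementation used in the simulations: the function name `beran_survival`, the scalar covariate, the Gaussian kernel, and the fixed bandwidth are all assumptions made for this example.

```python
import math


def beran_survival(t, x0, times, events, xs, bandwidth):
    """Kernel-weighted (local) KM estimate of P(Y > t | x = x0).

    times  : observed values of Y or of its right-censoring time
    events : 1 if times[i] is an observed value of Y, 0 if right-censored
    xs     : scalar covariate values
    The Gaussian kernel and scalar covariate are assumptions of this sketch.
    """
    kern = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    total = sum(kern)
    w = [k / total for k in kern]  # normalized local weights
    surv = 1.0
    for i, (ti, ei) in enumerate(zip(times, events)):
        if ei and ti <= t:
            # Weighted size of the risk set at time ti.
            at_risk = sum(wj for tj, wj in zip(times, w) if tj >= ti)
            if at_risk > 0.0:
                surv *= max(0.0, 1.0 - w[i] / at_risk)
    return surv
```

When all covariate values coincide with `x0`, the weights are equal and the estimate reduces to the ordinary KM estimator (in the absence of ties).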

5.2 Multivariate Partially Interval-Censored Data

Next, we present the simulation results under clustered multivariate DC data. We set the number of clusters as either $n=50$ or $n=100$. The cluster size $c_i$ is determined by $c_i=(d/10)+3$ if $d=0,10,\ldots,90$ satisfies $l_d \le v_{i} < l_{d+10}$ for $v_i\sim N(0,1)$; otherwise we let $c_i=5$, where $l_d$ represents the $d$th percentile of $v_{i}$. In this setup, the cluster size $c_i$ ranges from 3 to 11 members and the total number of members is about 200 when $n = 50$, and 400 when $n = 100$. The data-generating model is given by $T_{ik} = 10 + \beta _1(\tau) x_{1ik} + \beta _2(\tau)x_{2ik} + v_i + e_{ik}(\tau)-q_{ik}(\tau)$ for subject $k=1,2,\ldots,c_i$ in cluster $i=1,2,\ldots,n$, where $(\beta _{01},\beta _{02})^T=(1,1)^T$, and $e_{ik}(\tau)$ follows either the N(0,1) or the EV(0,1) distribution, satisfying $P(e_{ik}(\tau) < q_{ik}(\tau))=\tau$. As in the first simulation, we let $L \sim 10 + U(-4.2, c_L)$ and $R \sim L + U(4.1, c_R)$. We consider $\eta _i=1$ (unadjusted) and $\eta _i=1/c_i$ (adjusted); the latter may account for possible informativeness of the cluster size for the event time. Table 4 shows that the cluster size adjustment with $\eta _i=1/c_i$ could lead to slightly lower biases but somewhat larger sampling standard errors. When the cluster size is adjusted and censoring rates are higher, the estimated standard errors are closer to the empirical standard errors, resulting in more stable CPs. When cluster sizes are highly informative for the time-to-event, letting $\eta _i=1/c_i^\alpha$ for some $0<\alpha \le 1$ would be beneficial for achieving more efficient and robust estimation (Wang and Zhao 2008).
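
The cluster-size weights can be attached to individual observations as in the short Python sketch below. The helper name `cluster_weights` is hypothetical, and allowing `alpha = 0` to represent the unadjusted case $\eta _i=1$ is a convenience of this example rather than part of the method.

```python
from collections import Counter


def cluster_weights(cluster_ids, alpha=1.0):
    """Return eta_i = 1 / c_i**alpha for every observation.

    cluster_ids : one cluster label per observation
    alpha       : 1 gives the adjusted weights 1/c_i; 0 gives the
                  unadjusted weights (all equal to 1)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is expected to lie in [0, 1]")
    sizes = Counter(cluster_ids)  # c_i = number of members in cluster i
    return [1.0 / sizes[c] ** alpha for c in cluster_ids]
```

For instance, with two clusters of sizes 2 and 4 and `alpha = 1`, members of the first cluster receive weight 0.5 and members of the second receive weight 0.25, so that each cluster contributes equally in total.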

6 Application: mCRC Data

In this section, we apply the proposed method to a data set from a multicenter, randomized, phase III mCRC clinical trial (Peeters et al. 2010). This study aimed to investigate the efficacy and safety of second-line panitumumab plus FOLFIRI versus FOLFIRI alone with respect to patients' survival after the failure of initial treatment for mCRC. Panitumumab is a fully human anti-epidermal growth factor receptor (EGFR) monoclonal antibody that improves PFS in chemotherapy-refractory mCRC. It was often prescribed with FOLFIRI because it does not provide clinical benefit on its own. From June 2006 to March 2008, 1186 patients who failed first-line treatment of mCRC were randomly assigned (1:1) to panitumumab 6.0 mg/kg plus FOLFIRI versus FOLFIRI alone every 2 weeks. The coprimary end points of PFS and overall survival (OS) were independently tested and prospectively analyzed by KRAS status.

Our analysis focused on PFS for the 855 patients for whom treatment and KRAS status were available: 428 (50.0%) and 427 (50.0%) patients received FOLFIRI (coded as 0) and panitumumab + FOLFIRI (coded as 1), respectively, while 474 (55.4%) had wild-type (WT) KRAS tumors (coded as 1) and 381 (44.5%) had mutant (MT) KRAS tumors (coded as 0). Eligible patients were aged 18 or older, had been diagnosed with adenocarcinoma of the colon or rectum, and had an Eastern Cooperative Oncology Group (ECOG) performance status of 0, 1, or 2. They had received only one prior chemotherapy regimen for mCRC, with radiographically confirmed disease progression occurring during or within 6 months of the prior first-line chemotherapy. Patients meeting these criteria underwent central analysis of EGFR and biomarkers with approval from an independent ethics committee before any study-related procedures were initiated. Patients in this study were followed for safety for at least 30 days after the last study drug administration and for survival every 3 months. Because of this follow-up schedule, the progression-free period of each patient was subject to various types of censoring: 168 (19.6%), 329 (38.5%), and 306 (35.8%) patients were left-censored, interval-censored, and right-censored, respectively. Exact disease progression times were known for only 52 (6.1%) patients.

Since this data set was collected from 185 clinic centers with a range of 1–23 patients in each center, it can be regarded as general multivariate PIC data. Figure 1, computed using a modified self-consistency approach for general interval-censored data (Choi, Kim, and Choi 2021), displays the nonparametric PFS curves for the panitumumab + FOLFIRI and FOLFIRI alone groups. We observe that panitumumab + FOLFIRI can achieve higher survival rates than FOLFIRI alone during the first year, but the two KM curves become almost identical about 1.25 years into treatment. Previously, a Bayesian evaluation using a standard univariate PH model (Pan, Cai, and Wang 2020), which ignored cluster effects, revealed that the treatment effect is statistically significant (Coef = –0.215; CI = –0.384, –0.046), while the KRAS status is not significant (Coef = 0.163; CI = –0.006, 0.332).

Nonparametric Kaplan–Meier curves (based on a self-consistency equation) estimating progression-free survival probabilities, for the “panitumumab+FOLFIRI” versus “FOLFIRI” groups in the mCRC data.
By considering the potential correlation within each clinical site, we alternatively fitted the following multivariate CQR model for the log-transformed PFS with two covariates:
$$\begin{align*} \text{log-PFS}_{ik} = \beta _0(\tau) + \beta _1(\tau) \times \text{TRT}_{ik} & +\beta _2(\tau)\times \text{KRAS}_{ik} + e_{ik}(\tau),\\ i & = 1,\ldots,185,\quad k=1,\ldots,c_i, \end{align*}$$
via the proposed IPCW approach, with the cluster-size adjustment weights $\eta _i=1$ (unadjusted) or $\eta _i=1/c_i$ (adjusted). As a preliminary analysis, we first applied Cox's PH model separately to the left endpoint ($U$) and right endpoint ($V$) of the observed time intervals to check whether their distributions depend on any covariates. We found that the effects of the two covariates on both $U$ and $V$ were significant at the 0.1 significance level. We therefore used the local KM estimates of Beran (1981) to compute the desired individual weights given the covariates. Standard errors in this case were computed via a cluster-wise bootstrap with 100 bootstrap samples.
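
A cluster-wise bootstrap resamples whole clinical sites rather than individual patients, so that within-site correlation is preserved in each bootstrap sample. The Python sketch below illustrates the idea for a scalar estimate. The function `cluster_bootstrap_se`, the dictionary layout of the data, and the user-supplied `fit` callable are assumptions of this example, not the authors' implementation.

```python
import random
import statistics


def cluster_bootstrap_se(clusters, fit, n_boot=100, seed=0):
    """Bootstrap standard error obtained by resampling clusters.

    clusters : dict mapping a cluster id to the list of its records
    fit      : callable taking a list of records and returning a float
               (for example, one quantile regression coefficient)
    n_boot   : number of bootstrap samples
    """
    rng = random.Random(seed)
    ids = list(clusters)
    estimates = []
    for _ in range(n_boot):
        # Draw as many clusters as observed, with replacement.
        drawn = rng.choices(ids, k=len(ids))
        sample = [rec for cid in drawn for rec in clusters[cid]]
        estimates.append(fit(sample))
    return statistics.stdev(estimates)
```

In practice, `fit` would rerun the full weighted estimation, including the re-estimation of the censoring weights, on each resampled data set.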

Figure 2 presents the point estimates and 95% Wald-type confidence intervals for two covariates at different quantile levels of $\tau \in [0.1,0.9]$, when the cluster size is adjusted or not. Overall, panitumumab + FOLFIRI does not improve PFS significantly at most quantile levels, and also the difference in KRAS status is not statistically significant, with or without the adjustment of the cluster effect. Panitumumab + FOLFIRI appears to be more effective than FOLFIRI alone in controlling disease progression only at low quantile levels. This observation can also be confirmed by the KM plot in Figure 1, which shows that panitumumab + FOLFIRI can improve PFS only for the first year after treatment. Note that the analysis results do not change significantly whether or not the cluster size is adjusted. However, comparisons of cluster size adjustments for treatment (a vs. b), and KRAS status (c vs. d) in Figure 2 reveal that the quantile coefficients are much more smoothly distributed when the cluster size is adjusted. This implies that some heterogeneity may exist across different clinical sites, and the cluster size adjustment would help achieve standardized results.
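
For reference, a Wald-type interval at a given quantile level is formed from a point estimate and its standard error as in the following Python sketch. The function name `wald_interval` is hypothetical, and the standard error is assumed to be supplied by the user, for instance from a bootstrap.

```python
from statistics import NormalDist


def wald_interval(estimate, std_error, level=0.95):
    """Wald-type confidence interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # normal critical value
    return estimate - z * std_error, estimate + z * std_error


def is_significant(estimate, std_error, level=0.95):
    """True when the Wald-type interval excludes zero."""
    lower, upper = wald_interval(estimate, std_error, level)
    return lower > 0.0 or upper < 0.0
```

A coefficient is declared significant at a given quantile level when its interval excludes zero, which is how the curves in Figure 2 are read.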

Estimated QR coefficients (black curves) of treatment and KRAS status, with corresponding 95% CI estimates (light gray dashed curves), when the cluster size is adjusted or unadjusted.

A drawback of the proposed IPCW estimator is that the estimation procedure uses only the observations with exact survival times; censored observations are used to compute the inverse weights but do not otherwise contribute to estimation or statistical efficiency. Since the proportion of individuals with exact PFS time is only 6.1% in this data set, the IPCW approach is expected to produce unbiased results but with low statistical precision. This may partly explain why our results for treatment are slightly different from those of the univariate Cox PH analysis. One might consider an augmentation-based estimation method, but with censoring variables that depend on baseline covariates (as in our case), its derivation is too complicated to be practically feasible. Nevertheless, our CQR procedure is computationally reliable and not very sensitive to the sample size.

7 Discussion

This paper proposes an IPCW-based estimation method for conducting QR on partially interval-censored data, primarily focusing on DC and PIC endpoints. We demonstrate that the nonparametric left-censored survivor function can be estimated with conventional KM approaches by reading survival data backward in time. Furthermore, we develop an augmentation-based estimation and extend the method to accommodate multivariate partially interval-censored data. The proposed methods can easily be implemented with existing computation packages for QR or $l_1$-type linear programming. Although we restrict our attention to interval-CQR at a single quantile level, the proposed weighting scheme can be readily applied to other settings with interval-censoring, such as medical costs (Bang and Tsiatis 2002), competing risks (Choi, Kang, and Huang 2018), time-dependent covariates (Gorfine, Goldberg, and Ritov 2017), AFT models (Komárek and Lesaffre 2008; Gao, Zeng, and Lin 2017; Choi, Kim, and Choi 2021), and composite QR (Zou and Yuan 2008).
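
The remark about reading survival data backward in time can be illustrated with a short Python sketch. Negating the observed values turns left-censoring into right-censoring, so a standard KM routine applies. The function names `km_survival` and `left_censored_cdf` are hypothetical, and the sketch treats a generic left-censored variable $Z$ without covariates.

```python
def km_survival(s, times, events):
    """Standard KM estimate of P(X > s) from right-censored data."""
    surv = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for u in event_times:
        if u > s:
            break
        at_risk = sum(1 for t in times if t >= u)
        deaths = sum(1 for t, e in zip(times, events) if e and t == u)
        surv *= 1.0 - deaths / at_risk
    return surv


def left_censored_cdf(t, ys, observed):
    """Estimate P(Z < t) when Z is subject to left-censoring.

    ys       : observed values (Z itself, or an upper bound for Z)
    observed : 1 if ys[i] equals Z, 0 if Z is only known to be <= ys[i]
    Negating the time axis converts left-censoring of Z into
    right-censoring of -Z, so the usual KM estimator can be reused.
    """
    return km_survival(-t, [-y for y in ys], observed)
```

For example, with `ys = [1, 2, 3]` and `observed = [1, 0, 1]`, the mass of the left-censored observation at 2 is redistributed to the smaller observed value, giving an estimate of 2/3 for $P(Z < 1.5)$.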

As pointed out by a reviewer, the quantile level ($\tau$) of interest is determined by the investigators. Depending on the nature of a study, one can choose a desired $\tau$, but quantiles at extreme levels, such as $\tau =0.01$ or 0.99, may not be well estimated unless the sample size is sufficiently large. Even though the data are subject to a certain type of censoring, the underlying distribution function can be well identified, and the quantile corresponding to each quantile level can be estimated, unless the censoring rate is too high or other complications, such as competing risks, are present.

One of the necessary requirements of the IPCW-based approach is a nonnegligible proportion of exact failure time observations, which is also crucial in establishing asymptotic results of the proposed estimator and constructing computational algorithms. This is because the IPCW approach typically compensates for censored subjects by giving more weight to subjects with similar characteristics who are not censored. Thus, the proposed method may not be applicable to fully interval-censored or current status data, for which no exact failure times are observed. In our experience, the QR procedure is not very sensitive to the censoring rate, but high censoring rates may lead to a loss of statistical precision. In this case, the augmentation-based estimator can perform better than the IPCW estimator in both estimation and statistical inference, although the improvement is modest. In our data example, the proportion of the “effective” data samples was only 6.1%, and as a result, the IPCW-based estimators are less statistically efficient than Cox PH estimators.
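
The role of the exact observations can be seen in the simplest case of an intercept-only model, where the weighted estimating function used in Appendix A, with weights $\delta _{1i}/\lbrace S_R(\tilde{T}_i)-S_L(\tilde{T}_i)\rbrace$, reduces to a weighted sample quantile. The Python sketch below is an illustration under that simplifying assumption: the function name `ipcw_quantile` is hypothetical, and `s_right` and `s_left` stand for user-supplied estimates of the censoring survival functions.

```python
def ipcw_quantile(times, exact, s_right, s_left, tau):
    """Intercept-only IPCW quantile estimate.

    Returns the smallest b such that
        (1/n) * sum_i w_i * I(T_i <= b) >= tau,
    where w_i = exact_i / (S_R(T_i) - S_L(T_i)).

    times   : observed times
    exact   : 1 if the failure time is observed exactly, 0 otherwise
    s_right : callable, estimated survival function of R
    s_left  : callable, estimated survival function of L
    """
    n = len(times)
    weighted = sorted(
        (t, 1.0 / (s_right(t) - s_left(t)))
        for t, d in zip(times, exact)
        if d  # only exact observations receive a nonzero weight
    )
    cum = 0.0
    for t, w in weighted:
        cum += w / n
        if cum >= tau:
            return t
    return None  # the weighted mass never reaches tau
```

Only subjects with exact failure times enter the sum, which is why a nonnegligible proportion of such subjects is needed; with no censoring, all weights equal 1 and the function returns the ordinary sample quantile.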

Recently, De Backer, Ghouch, and Van Keilegom (2019) proposed an alternative estimating approach for CQR with an adapted quantile loss function. For right-censored data, they argued that a consistent estimator of the QR coefficient $\bm{\beta }(\tau)$ could be obtained by minimizing the following objective function:
$$\begin{equation} \tilde{L}_n(\bm{\beta },\tau)=n^{-1/2} \sum _{i=1}^n{\left\lbrace \rho _\tau (\tilde{T}_i-{\bf x}_i^T\bm{\beta })-(1-\tau) \int _{-\infty }^{{\bf x}_i^T\bm{\beta }} \hat{F}_R(t|{\bf x}_i) dt \right\rbrace} . \end{equation}$$ (13)
Notice that formulation (13) allows us to extract the information of every observation at hand, even if confronted with incompleteness from right-censoring. Therefore, one could expect that the solution to (13) will be more efficient than the basic IPCW estimator, especially when the censoring proportion is large. In the same spirit, we may construct an alternative quantile loss function for DC data as
$$\begin{eqnarray} L_n(\bm{\beta },\tau) &=& n^{-1/2} \displaystyle \sum _{i=1}^n{\left[\rho _\tau (\tilde{T}_i-{\bf x}_i^T\bm{\beta })- (1-\tau)\displaystyle \int _{-\infty }^{{\bf x}_i^T\bm{\beta }}\hat{F}_R(t|{\bf x}_i)dt\right.}\nonumber\\ &&{\left. +\, \tau \displaystyle \int _{-\infty }^{{\bf x}_i^T\bm{\beta }} \hat{S}_L(t|{\bf x}_i)dt \right]}, \end{eqnarray}$$ (14)
where we leverage the fact that, under the independence assumption $T \perp \!\!\!\perp G|{\bf x}$, the following equality holds:
$$\begin{align*} E[I(\tilde{T}>t)|{\bf x}] &= E[\lbrace 1-\tau I(L<t)\rbrace I(R>t)|{\bf x}] \\& =\,(1-\tau)S_R(t|{\bf x})+\tau S_L(t|{\bf x}) \end{align*}$$
for $\tilde{T}=(T\vee L)\wedge R$. The theoretical and empirical properties of the new estimator from the adapted quantile loss function (14) deserve further investigation and will be studied in future research.

Acknowledgments

The authors thank the anonymous AE and two reviewers, whose insightful comments led to a substantially improved presentation of the manuscript. The colorectal cancer data were derived based on raw data sets obtained from www.projectdatasphere.org, which is maintained by Project Data Sphere, LLC. Neither Project Data Sphere, LLC nor the owner(s) of any information from the website have contributed to, approved, or are in any way responsible for the contents of this publication. The research of Dr. T. Choi was supported by a grant from the National Research Foundation (NRF) of Korea (RS-2024-00340298). The research of Dr. S. Choi was supported by a Korea University grant (K2201231) and a grant from the National Research Foundation (NRF) of Korea (2022M3J6A1063595, 2022R1A2C1008514). Dr. Bandyopadhyay acknowledges partial funding support from the grants awarded by the US National Institutes of Health (R21DE031879, R01DE031134).

    Conflicts of Interest

    The authors declare no conflicts of interest.

    Appendix A: Asymptotic Results

    This section provides asymptotic results of the proposed estimator for doubly censored (DC) data. We first impose the following regularity conditions:
    • (C1) The joint distribution function of $(L,R)$ is continuous. There exists $v\in (0,\infty)$ such that $P(R-L > v |{\bf x}) =1$. There also exist $-\infty < v_1 \le v_2 \le v<\infty$ such that $P(v_1 < L\le v_2|{\bf x}) = 1$ and $P(R\le v | {\bf x}) = 1$.
    • (C2) The covariate ${\bf x}$ is uniformly bounded, that is, $\sup _i \Vert {\bf x}_i \Vert <\infty$.
    • (C3) (i) The quantile coefficient $\bm{\beta }_0(\tau)$ is Lipschitz continuous for $\tau \in [\tau _L, \tau _R]$; (ii) $f(t|{\bf x})$ is bounded above uniformly in $t$ and ${\bf x}$, where $f(t|{\bf x}) = dF(t|{\bf x})/dt$.
    • (C4) For some $\rho _0 > 0$ and $c_0 > 0$, $\inf _{\bm{\beta }\in \mathbb {B}(\rho _0)}\text{eigmin}\,({\bf A}\lbrace \bm{\beta }(\tau)\rbrace) \ge c_0$, where $\mathbb {B}(\rho) = \lbrace \inf _{\tau \in [\tau _L,\tau _R]} \Vert \bm{\beta }(\tau)-\bm{\beta }_0(\tau)\Vert \le \rho:\bm{\beta }\in \mathbb {R}^{p} \rbrace$ and ${\bf A}\lbrace \bm{\beta }(\tau)\rbrace = E[{\bf x}^{\otimes 2} f({\bf x}^T\bm{\beta }|{\bf x})]$, and where $\text{eigmin}(\cdot)$ denotes the minimum eigenvalue of a matrix.

    In the following, we omit $\tau$ in $\hat{\bm{\beta }}(\tau)$ for notational simplicity, but bear in mind that the coefficients are all $\tau$-specific. To avoid tail instability, we restrict the possible range of $\tau$ to $0<\tau _L\le \tau \le \tau _R<1$. We first fix some notation for establishing our asymptotic results. For right-censoring, define $N_i^R(t) = I(\tilde{T}_i \le t, \delta _{2i}=1)$, $Y_i(t) = I(\tilde{T}_i \ge t)$, and $y(t) = P(\tilde{T} \ge t)$. Then, we observe the corresponding martingale process $M_i^R(t) = N_i^R(t) - \int _{-\infty }^tY_i(u)d\Lambda ^R(u)$, where $\Lambda ^R(t) = \int _{-\infty }^t\lambda ^R(u)du$ and $\lambda ^R(t) = \lim _{h\rightarrow 0}P\lbrace \tilde{T}\in (t,t+h)|\tilde{T}\ge t \rbrace /h$. For left-censoring, define $N_i^L(t) = I(\tilde{T}_i \ge t, \delta _{3i}=1)$, with the corresponding martingale process $M_i^L(t) = N_i^L(t) - \int _{t}^\infty \lbrace n+1-Y_i(u)\rbrace d\Lambda ^L(u)$, where $\Lambda ^L(t) = \int _t^\infty \lambda ^L(u)du$ and $\lambda ^L(t) = \lim _{h\rightarrow 0}P\lbrace \tilde{T}\in (t-h,t)|\tilde{T}\le t \rbrace /h$.

    Theorem A.1. Under regularity conditions (C1)–(C4), $\lim _{n\rightarrow \infty } \sup _{\tau \in [\tau _L,\tau _R]} \Vert \hat{\bm{\beta }}(\tau) - \bm{\beta }_0(\tau)\Vert \rightarrow _{p} 0$, assuming model (2) holds for $\tau \in [\tau _L,\tau _R]$.

    Proof. Define ${\bf U}_n^S(\bm{\beta },\tau) = n^{-1/2}\sum _{i=1}^n{\bm \xi }_{1,i}(\tau)$ and ${\bf U}_0(\bm{\beta },\tau) = E\lbrace n^{-1/2}{\bf U}_n^F(\bm{\beta },\tau)\rbrace $, where ${\bm \xi }_{1,i}(\tau) = {\bf x}_i \lbrace {\delta _{1i}}\lbrace S_R(\tilde{T}_i) - S_L(\tilde{T}_i) \rbrace ^{-1} I(\tilde{T}_i \le {\bf x}_i^T\bm{\beta }) - \tau \rbrace$ and ${\bf U}_n^F(\bm{\beta },\tau) = n^{-1/2}\sum _{i=1}^n{\bf x}_i\lbrace F({\bf x}_i^T\bm{\beta }|{\bf x}_i) -\tau \rbrace $. In the sequel, we use $\sup _{\bm{\beta }}$ and $\sup _{\tau }$ to denote the supremum taken over $\bm{\beta }\in \mathbb {R}^{p}$ and $\tau \in [\tau _L,\tau _R]$, respectively.

    First, by condition (C1), for every $0< r<1/2$, we have $\sup _{t<v}| \hat{S}_R(t) -S_R(t)| = o_p(n^{-1/2+ r})$ and $\sup _{t<v}| \hat{S}_L(t) -S_L(t)| = o_p(n^{-1/2+ r})$. This, together with condition (C2), implies that

    $$\begin{equation*} \sup _{\bm{\beta },\tau } \Vert n^{-1/2}\lbrace {\bf U}_n(\bm{\beta },\tau)-{\bf U}_n^S(\bm{\beta },\tau)\rbrace \Vert = o_p(n^{-1/2+r}). \end{equation*}$$
    Define $\mathcal {A}= \lbrace {\bf x}_i \lbrace \delta _{1i} \lbrace S_R(\tilde{T}_i) -S_L(\tilde{T}_i)\rbrace ^{-1}I(\tilde{T}_i \le {\bf x}_i^T\bm{\beta }) - \tau \rbrace:\bm{\beta }\in \mathbb {R}^{p},\tau \in [\tau _L,\tau _R]\rbrace$. This function class is Donsker and thus Glivenko–Cantelli (van der Vaart and Wellner 1996), because the class of indicator functions is Donsker and the three quantities ${\bf x}_i$, $S_R(\tilde{T}_i)$, and $S_L(\tilde{T}_i)$ are uniformly bounded. Therefore, from the Glivenko–Cantelli theorem, we have that $\sup _{\bm{\beta },\tau }\Vert n^{-1/2} {\bf U}_n^S(\bm{\beta },\tau) - {\bf U}_0(\bm{\beta },\tau)\Vert = o_p(1)$. Combining these two results, we obtain
    $$\begin{equation} \sup _{\bm{\beta },\tau } \Vert n^{-1/2}{\bf U}_n(\bm{\beta },\tau) - {\bf U}_0(\bm{\beta },\tau) \Vert = o_p(1). \end{equation}$$ (A.1)

    Second, note that for any ${\bf b}\in \mathbb {R}^{p}$ satisfying $\Vert {\bf b}\Vert =1$, ${\bf b}^T{\bf U}_0(\bm{\beta }_0 + {\bf b}\delta,\tau)$ is a nondecreasing function of $\delta$. Then, for $\delta \ge \rho _0>0$, ${\bf b}^T [ {\bf U}_0(\bm{\beta }_0 + {\bf b}\delta, \tau) - {\bf U}_0(\bm{\beta }_0,\tau) ] \ge {\bf b}^T [{\bf U}_0(\bm{\beta }_0 + {\bf b}\rho _0, \tau) - {\bf U}_0(\bm{\beta }_0,\tau) ]\ge 0$. By the Cauchy–Schwarz inequality and condition (C4),

    $$\begin{align*} & \Vert {\bf U}_0(\bm{\beta }_0 + {\bf b}\delta, \tau) - {\bf U}_0(\bm{\beta }_0,\tau)\Vert ^2 \cdot \Vert {\bf b}\Vert ^2 \ge ({\bf b}^T [ {\bf U}_0(\bm{\beta }_0 + {\bf b}\delta, \tau) - {\bf U}_0(\bm{\beta }_0,\tau) ])^2\\ & \quad \ge ({\bf b}^T [ {\bf U}_0(\bm{\beta }_0 + {\bf b}\rho _0, \tau) - {\bf U}_0(\bm{\beta }_0,\tau) ])^2\\ & \quad = ({\bf b}^T {\bf A}(\bm{\beta }_0 + {\bf b}\rho ^*){\bf b})^2\rho _0^2 \; \ge \; c_0^2\rho _0^2\; >\; 0 \end{align*}$$
    for some $\rho ^*\in [0,\rho _0]$. Since $\bm{\beta }_0 + {\bf b}\rho ^* \in \mathbb {B}(\rho _0)$, the last inequality above follows from condition (C4). Therefore, we have $\inf _{\bm{\beta }\not\in \mathbb {B}(\rho _0)} \Vert {\bf U}_0(\bm{\beta },\tau) - {\bf U}_0(\bm{\beta }_0,\tau)\Vert \ge c_0\rho _0$.

    By using the facts that ${\bf U}_n(\hat{\bm{\beta }},\tau) = o_p(n^{-1/2})$ and ${\bf U}_0(\bm{\beta }_0, \tau) =0$, together with (A.1), we can easily show that

    $$\begin{equation} {\bf U}_0(\hat{\bm{\beta }},\tau) - {\bf U}_0(\bm{\beta }_0,\tau) = o_p(1), \end{equation}$$ (A.2)
    and thus there exists an $N_0 >0$ such that for $n\ge N_0$, $\sup _\tau \Vert {\bf U}_0(\hat{\bm{\beta }},\tau) - {\bf U}_0(\bm{\beta }_0,\tau) \Vert < c_0\rho _0$. Consequently, for $\tau \in [\tau _L,\tau _R]$, $\hat{\bm{\beta }}$ belongs to $\mathbb {B}(\rho _0)$ with probability one when $n$ is large enough. Moreover, a Taylor expansion of ${\bf U}_0(\hat{\bm{\beta }},\tau)$ around $\bm{\beta }_0$ yields
    $$\begin{equation*} \sup _\tau \Vert \hat{\bm{\beta }}- \bm{\beta }_0\Vert = \sup _\tau \big \Vert {\bf A}(\check{\bm{\beta }})^{-1} \lbrace {\bf U}_0(\hat{\bm{\beta }},\tau) - {\bf U}_0(\bm{\beta }_0,\tau) \rbrace \big \Vert, \end{equation*}$$
    where $\check{\bm{\beta }}$ lies between $\hat{\bm{\beta }}$ and $\bm{\beta }_0$ and is thus an element of $\mathbb {B} (\rho _0)$ for large $n$. Therefore, the desired uniform consistency can be derived by applying (A.2) and condition (C4) to the above display. $\Box$

    Lemma A.1. For any positive sequence $\lbrace d_n\rbrace _{n=1}^\infty$ satisfying $d_n\rightarrow 0$,

    $$\begin{align*} & \lim _{n\rightarrow \infty }\sup _{\bm{\beta },\bm{\beta }^{\prime }\in \mathbb {B}(\rho _0),\Vert \bm{\beta }-\bm{\beta }^{\prime }\Vert \le d_n} \Vert n^{-1/2}\sum _{i=1}^n {\left[{\bf x}_i \delta _{1i} \lbrace I(T_i\le {\bf x}_i^T\bm{\beta })-I(T_i\le {\bf x}_i^T\bm{\beta }^{\prime })\rbrace \right]} \\ &\quad -n^{1/2} \lbrace {\bf U}_0(\bm{\beta },\tau) - {\bf U}_0(\bm{\beta }^{\prime },\tau)\rbrace \Vert = 0,\quad \text{a.s.} \end{align*}$$

    Proof. This lemma can be proved by using the results in Alexander (1984) and arguments similar to those for Theorem 1 of Lai and Ying (1988). Thus, the detailed derivation is omitted. It is noted that there exist $S_{R}(\tilde{T}_i)>0$ and $S_{L}(\tilde{T}_i)>0$ such that

    $$\begin{equation*} \text{var}({\bf x}[I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta })\delta _{1i}-I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }^{\prime })\delta _{1i}]) \le |S_{R}(\tilde{T}_i)|\cdot \Vert \bm{\beta }-\bm{\beta }^{\prime }\Vert, \end{equation*}$$
    and
    $$\begin{equation*} \text{var}({\bf x}[I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta })\delta _{1i}-I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }^{\prime })\delta _{1i}]) \le |S_{L}(\tilde{T}_i)|\cdot \Vert \bm{\beta }-\bm{\beta }^{\prime }\Vert . \end{equation*}$$
    This can be proved using the boundedness properties of ${\bf x}$ and $\mathbb {B} (\rho _0)$ from conditions (C2) and (C3). $\Box$

    Theorem A.2. Under regularity conditions (C1)–(C4), $n^{1/2} \lbrace \hat{\bm{\beta }}(\tau) - \bm{\beta }_0(\tau)\rbrace $ converges weakly to a zero-mean Gaussian process for $\tau \in [\tau _L,\tau _R]$ with covariance

    $$\begin{equation*} \Psi (\tau ^{\prime },\tau) = {\bf A}\lbrace \bm{\beta }_0(\tau ^{\prime })\rbrace ^{-1} E\lbrace {\bm \xi }_1(\tau ^{\prime }) {\bm \xi }_1(\tau)^T \rbrace ({\bf A}\lbrace \bm{\beta }_0(\tau)\rbrace ^{-1})^T. \end{equation*}$$

    Proof.From Fleming and Harrington (1991) and Gómez, Julià, and Utzet (1994), we obtain

    sup t [ 0 , v ) n 1 / 2 { S ̂ R ( t ) S R ( t ) } n 1 / 2 i = 1 n S R ( t ) t y ( u ) 1 d M i R ( u ) 0 $$\begin{equation*} \sup _{t\in [0,v)}{\left\Vert n^{1/2}\lbrace \hat{S}_R(t) - S_R(t) \rbrace - n^{-1/2}\sum _{i=1}^{n}S_R(t) \int _{-\infty }^t y(u)^{-1}dM_i^R(u) \right\Vert} \rightarrow 0 \end{equation*}$$
    and
    sup t [ 0 , v ) n 1 / 2 { S ̂ L ( t ) S L ( t ) } n 1 / 2 i = 1 n { 1 S L ( t ) } × t { 1 y ( u ) } 1 d M i L ( u ) 0 . $$\begin{eqnarray*} && \sup _{t\in [0,v)} {\left\Vert n^{1/2}\lbrace \hat{S}_L(t) - S_L(t) \rbrace - n^{-1/2} \sum _{i=1}^{n}\lbrace 1-S_L(t)\rbrace \right.}\\ &&\quad \times {\left. \int _{t}^\infty \lbrace 1-y(u)\rbrace ^{-1}dM_i^L(u)\right\Vert} \rightarrow 0. \end{eqnarray*}$$
    Using empirical process arguments similar to those in the proof of Theorem 1, it can be shown that
    $$\begin{equation*} \sup _{\bm{\beta }\in \mathbb {R}^{p},\ t\in [0,v)} \left\Vert n^{-1} \sum _{i=1}^n \dfrac{\delta _{1i}}{S_R(\tilde{T}_i) - S_L(\tilde{T}_i)} {\bf x}_iY_i(t)I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }) - R_1(\bm{\beta },t) \right\Vert \rightarrow 0, \end{equation*}$$
    where $R_1(\bm{\beta },t) = E[\delta _1{\bf x}y(t)I(\tilde{T} \le {\bf x}^T\bm{\beta }) \lbrace S_R(\tilde{T}) - S_L(\tilde{T})\rbrace ^{-1}]$, and
    $$\begin{equation*} \sup _{\bm{\beta }\in \mathbb {R}^{p},\ t\in [0,v)} \left\Vert n^{-1} \sum _{i=1}^n \dfrac{\delta _{1i}}{S_R(\tilde{T}_i) - S_L(\tilde{T}_i)} {\bf x}_i \lbrace 1-Y_i(t)\rbrace I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }) - R_2(\bm{\beta },t) \right\Vert \rightarrow 0, \end{equation*}$$
    where $R_2(\bm{\beta },t) = E[\delta _1{\bf x}\lbrace 1-y(t)\rbrace I(\tilde{T} \le {\bf x}^T\bm{\beta }) \lbrace S_R(\tilde{T}) - S_L(\tilde{T})\rbrace ^{-1}]$.

    Now, we use $\approx$ to denote asymptotic equivalence uniformly in $\tau \in [\tau _L,\tau _R]$. It follows from standard asymptotic arguments that

    $$\begin{align*}
    {\bf U}_n(\bm{\beta }_0,\tau) &= {\bf U}_n^S(\bm{\beta }_0,\tau) + \lbrace {\bf U}_{n}(\bm{\beta }_0,\tau) - {\bf U}_n^S(\bm{\beta }_0,\tau)\rbrace \\
    &= n^{-1/2} \sum _{i=1}^n \bm{\xi }_{1,i}(\tau) + n^{-1/2}\sum _{i=1}^n \dfrac{ (S_R(\tilde{T}_i)-\hat{S}_R(\tilde{T}_i)) + (\hat{S}_L(\tilde{T}_i) - S_L(\tilde{T}_i)) }{(\hat{S}_R(\tilde{T}_i)-\hat{S}_L(\tilde{T}_i)) (S_R(\tilde{T}_i) - S_L(\tilde{T}_i))} \delta _{1i}{\bf x}_i I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }_0) \\
    &\approx n^{-1/2} \sum _{i=1}^n \bm{\xi }_{1,i}(\tau) - n^{-1}\sum _{i=1}^n \dfrac{n^{-1/2} \sum _{j=1}^{n}\int _{-\infty }^\infty Y_i(u)\, dM_j^R(u)/y(u)} {S_R(\tilde{T}_i)- S_L(\tilde{T}_i)} \delta _{1i}{\bf x}_i I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }_0) \\
    &\quad + n^{-1}\sum _{i=1}^n \dfrac{n^{-1/2} \sum _{j=1}^{n}\int _{-\infty }^\infty \lbrace 1-Y_i(u)\rbrace \, dM_j^L(u)/ \lbrace 1-y(u)\rbrace }{S_R(\tilde{T}_i)- S_L(\tilde{T}_i)} \delta _{1i}{\bf x}_i I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }_0) \\
    &\approx n^{-1/2} \sum _{i=1}^n \bm{\xi }_{1,i}(\tau) - n^{-1/2}\sum _{i=1}^n\int _{-\infty }^\infty \left\lbrace \frac{1}{n} \sum _{j=1}^n\dfrac{\delta _{1j}{\bf x}_jY_j(u)I(\tilde{T}_j\le {\bf x}_j^T\bm{\beta }_0)}{S_R(\tilde{T}_j)-S_L(\tilde{T}_j)}\right\rbrace \dfrac{dM_i^R(u)}{y(u)} \\
    &\quad + n^{-1/2}\sum _{i=1}^n\int _{-\infty }^\infty \left\lbrace \frac{1}{n} \sum _{j=1}^n\dfrac{\delta _{1j}{\bf x}_j\lbrace 1-Y_j(u)\rbrace I(\tilde{T}_j\le {\bf x}_j^T\bm{\beta }_0)}{S_R(\tilde{T}_j)-S_L(\tilde{T}_j)}\right\rbrace \dfrac{dM_i^L(u)}{1-y(u)} \\
    &\approx n^{-1/2}\sum _{i=1}^{n} \bm{\xi }_{1,i}(\tau) + n^{-1/2}\sum _{i=1}^{n} \int _{-\infty }^\infty q_1(\bm{\beta }_0,u)\, dM_i^R(u) + n^{-1/2}\sum _{i=1}^{n} \int _{-\infty }^\infty q_2(\bm{\beta }_0,u)\, dM_i^L(u) \\
    &= n^{-1/2}\sum _{i=1}^{n}\lbrace \bm{\xi }_{1,i}(\tau) + \bm{\xi }_{2,i}(\tau) + \bm{\xi }_{3,i}(\tau)\rbrace \equiv n^{-1/2}\sum _{i=1}^n \bm{\xi }_i(\tau),
    \end{align*}$$
    where $\bm{\xi }_{2,i}(\tau) = \int _{-\infty }^\infty q_1(u)\, dM_i^R(u)$ with $q_1(u) \equiv q_1(\bm{\beta }_0,u) = -R_1(\bm{\beta }_0,u)/y(u)$, $\bm{\xi }_{3,i}(\tau) = \int _{-\infty }^\infty q_2(u)\, dM_i^L(u)$ with $q_2(u) \equiv q_2(\bm{\beta }_0,u) = R_2(\bm{\beta }_0,u)/\lbrace 1-y(u)\rbrace$, and $\bm{\xi }_i(\tau) = \bm{\xi }_{1,i}(\tau) + \bm{\xi }_{2,i}(\tau) + \bm{\xi }_{3,i}(\tau)$.
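
    The passage from the double sums over $i$ and $j$ to the single sums of martingale integrals in the display above is an interchange of summation and integration followed by a relabeling of indices. A sketch for the right-censoring term is given here, writing ${\bf a}_i = \delta _{1i}{\bf x}_i I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }_0)/\lbrace S_R(\tilde{T}_i)-S_L(\tilde{T}_i)\rbrace$ as a shorthand (the symbol ${\bf a}_i$ is introduced only for this illustration and does not appear in the paper):

    $$\begin{align*}
    n^{-1}\sum _{i=1}^n {\bf a}_i \left\lbrace n^{-1/2}\sum _{j=1}^n \int _{-\infty }^\infty Y_i(u)\, \dfrac{dM_j^R(u)}{y(u)} \right\rbrace
    &= n^{-1/2}\sum _{j=1}^n \int _{-\infty }^\infty \left\lbrace \frac{1}{n}\sum _{i=1}^n {\bf a}_i Y_i(u) \right\rbrace \dfrac{dM_j^R(u)}{y(u)} \\
    &\approx n^{-1/2}\sum _{j=1}^n \int _{-\infty }^\infty \dfrac{R_1(\bm{\beta }_0,u)}{y(u)}\, dM_j^R(u).
    \end{align*}$$

    The equality only reorders finite sums and moves them inside the integral. The approximation replaces the inner average by its limit $R_1(\bm{\beta }_0,u)$, using the uniform convergence stated before the display. Swapping the labels $i$ and $j$ gives the form written in the display, and the left-censoring term is handled in the same way with $1-Y_i(u)$, $1-y(u)$, $M_j^L$, and $R_2$.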

    We claim that the function classes $\mathcal {A}_1 = \lbrace \bm{\xi }_{1,i}(\tau),\tau \in [\tau _L,\tau _R] \rbrace$, $\mathcal {A}_2 = \lbrace \bm{\xi }_{2,i}(\tau),\tau \in [\tau _L,\tau _R] \rbrace$, and $\mathcal {A}_3 = \lbrace \bm{\xi }_{3,i}(\tau),\tau \in [\tau _L,\tau _R] \rbrace$ are Donsker. First, given the Lipschitz continuity of $\bm{\beta }_0$ implied by condition (C3), we show that $\mathcal {A}_1$ is Donsker by applying arguments similar to those used for $\mathcal {A}$, together with the permanence of the Donsker property under Lipschitz transformations (Theorem 2.10.6 of van der Vaart and Wellner 1996). Note that $\int _{-\infty }^\infty q_1(u)\,dM_i^R(u)$ and $\int _{-\infty }^\infty q_2(u)\,dM_i^L(u)$ are Lipschitz in $\bm{\beta }$ due to convexity in $\bm{\beta }$. The Donsker property of $\mathcal {A}_2$ and $\mathcal {A}_3$ then follows similarly. Therefore, from the Donsker theorem (Section 2.8.2 of van der Vaart and Wellner 1996), ${\bf U}_n(\bm{\beta }_0,\tau)$ converges weakly to a zero-mean Gaussian process with covariance matrix $\bm \Gamma (\tau ^{\prime },\tau) = E \lbrace \bm{\xi }_1(\tau ^{\prime }) \bm{\xi }_1(\tau)^T\rbrace$.

    Finally, we can write ${\bf U}_n(\hat{\bm{\beta }},\tau) - {\bf U}_n(\bm{\beta }_0,\tau) = \text{(I)} + \text{(II)}$, where

    $$\begin{align*} \text{(I)} &= n^{-1/2}\sum _{i=1}^n \frac{\delta _{1i}}{ S_R(\tilde{T}_i)-S_L(\tilde{T}_i) } {\bf x}_i \lbrace I(\tilde{T}_i \le {\bf x}_i^T\hat{\bm{\beta }}) -I(\tilde{T}_i \le {\bf x}_i^T\bm{\beta }_0) \rbrace; \\ \text{(II)} &= n^{-1/2}\sum _{i=1}^n \delta _{1i} \left\lbrace \frac{1}{\hat{S}_R(\tilde{T}_i) - \hat{S}_L(\tilde{T}_i)} - \dfrac{1}{S_R(\tilde{T}_i) - S_L(\tilde{T}_i)} \right\rbrace {\bf x}_i \lbrace I(\tilde{T}_i \le {\bf x}_i^T\hat{\bm{\beta }}) - I(\tilde{T}_i \le {\bf x}_i^T\bm{\beta }_0) \rbrace . \end{align*}$$
    From Lemma A.1, the uniform consistency of $\hat{\bm{\beta }}$, and the fact that $E[\delta _{1i}/\lbrace S_R(\tilde{T}_i)-S_L(\tilde{T}_i)\rbrace]=1$, we observe that (I) $\approx n^{1/2} \lbrace {\bf U}_0(\hat{\bm{\beta }},\tau) - {\bf U}_0(\bm{\beta }_0,\tau)\rbrace$. Note that $\sup _i| \lbrace \hat{S}_R(\tilde{T}_i) - \hat{S}_L(\tilde{T}_i) \rbrace ^{-1} - \lbrace S_R(\tilde{T}_i)- S_L(\tilde{T}_i)\rbrace ^{-1} | = o_p(n^{-1/2 + r})$ for any $0<r<1/2$, and $\sup _i\Vert {\bf x}_i\lbrace I(\tilde{T}_i \le {\bf x}_i^T\hat{\bm{\beta }}) -I(\tilde{T}_i \le {\bf x}_i^T\bm{\beta }_0)\rbrace \Vert < \infty$ by condition (C2). These properties and the uniform consistency of $\hat{S}_R(\cdot)$ and $\hat{S}_L(\cdot)$ for $S_R(\cdot)$ and $S_L(\cdot)$, respectively, imply that ${\bf U}_n(\hat{\bm{\beta }},\tau) - {\bf U}_n(\bm{\beta }_0,\tau)$ is dominated by (I). A Taylor expansion of ${\bf U}_0(\bm{\beta },\tau)$ around $\bm{\beta }= \bm{\beta }_0$ and the uniform consistency of $\hat{\bm{\beta }}$ for $\bm{\beta }_0$ give that
    $$\begin{equation*} {\bf U}_n(\hat{\bm{\beta }},\tau) - {\bf U}_n(\bm{\beta }_0,\tau) = \lbrace {\bf A}(\bm{\beta }_0) + r_n(\tau) \rbrace n^{1/2} (\hat{\bm{\beta }}- \bm{\beta }_0), \end{equation*}$$
    where $\sup _\tau \Vert r_n(\tau)\Vert \rightarrow 0$. Given that ${\bf U}_n(\hat{\bm{\beta }},\tau) = o_p(n^{-1/2})$, this further implies that $n^{1/2}(\hat{\bm{\beta }}-\bm{\beta }_0) = -{\bf A}(\bm{\beta }_0)^{-1} {\bf U}_n(\bm{\beta }_0,\tau) + r_n^*(\tau)$, where $\sup _\tau \Vert r_n^*(\tau)\Vert \rightarrow 0$. It then follows that
    $$\begin{equation} n^{1/2} (\hat{\bm{\beta }}-\bm{\beta }_0)\approx -n^{-1/2}\sum _{i=1}^n {\bf A}(\bm{\beta }_0)^{-1} \bm{\xi }_i(\tau). \end{equation}$$ (A.3)
    Weak convergence of $n^{1/2}(\hat{\bm{\beta }}-\bm{\beta }_0)$ then follows because $\mathcal {A}_1$, $\mathcal {A}_2$, and $\mathcal {A}_3$ are Donsker classes and the Donsker property is preserved under addition and subtraction (Theorem 2.10.6 of van der Vaart and Wellner 1996). This establishes the asymptotic results for DC data. The extension to PIC data is straightforward and requires only an adjustment of the weighting scheme: referring to (4) and (5), replacing $\delta _{1i} \lbrace S_R(\tilde{T}_i)-S_L(\tilde{T}_i) \rbrace ^{-1}$ with $\Delta _{i} \lbrace \hat{F}_{V}(\tilde{T}_i|{\bf x}_i) + \hat{S}_{U}(\tilde{T}_i|{\bf x}_i) \rbrace ^{-1}$ in Theorems A.1 and A.2 supports the corresponding asymptotic results for PIC data. $\Box$
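
    The identity $E[\delta _{1i}/\lbrace S_R(\tilde{T}_i)-S_L(\tilde{T}_i)\rbrace]=1$ used in the proof, and the way inverse weights on exact observations recover a quantile, can be checked numerically. The Python sketch below is only an illustration and is not the authors' code. It assumes an intercept-only model and a toy DC design with $T \sim U(0.5, 2.5)$, $L \sim U(0,1)$, and $R = L + 2$, and it plugs in the true $S_L(t)=P(L>t)$ and $S_R(t)=P(R>t)$ rather than the estimators $\hat{S}_L$ and $\hat{S}_R$. The function names `s_left`, `s_right`, `simulate`, and `ipcw_quantile` are hypothetical.

```python
import random


def s_left(t):
    """True P(L > t) for the assumed L ~ Uniform(0, 1)."""
    return min(1.0, max(0.0, 1.0 - t))


def s_right(t):
    """True P(R > t) for the assumed R = L + 2, i.e. R ~ Uniform(2, 3)."""
    return min(1.0, max(0.0, 3.0 - t))


def simulate(n, seed=0):
    """Draw n doubly censored observations and their inverse weights.

    The weight is 1 / {S_R(t) - S_L(t)} for exactly observed times and 0
    for left- or right-censored ones, mirroring delta_1 / (S_R - S_L).
    """
    rng = random.Random(seed)
    observed, weights = [], []
    for _ in range(n):
        t = rng.uniform(0.5, 2.5)      # latent failure time (toy choice)
        left = rng.uniform(0.0, 1.0)   # left-censoring time
        right = left + 2.0             # right-censoring time, always > left
        if left <= t <= right:         # exact observation, delta_1 = 1
            observed.append(t)
            weights.append(1.0 / (s_right(t) - s_left(t)))
        else:                          # censored: only L or R is recorded
            observed.append(left if t < left else right)
            weights.append(0.0)
    return observed, weights


def ipcw_quantile(observed, weights, tau):
    """Smallest b with n^{-1} * sum_i w_i * I(obs_i <= b) >= tau."""
    n = len(observed)
    cumulative = 0.0
    for value, weight in sorted(zip(observed, weights)):
        cumulative += weight / n
        if cumulative >= tau:
            return value
    return float("inf")


if __name__ == "__main__":
    obs, w = simulate(100_000, seed=1)
    print("mean inverse weight:", sum(w) / len(w))
    print("weighted median estimate:", ipcw_quantile(obs, w, 0.5))
```

    Under this assumed design the $\tau$th quantile of $T$ is $0.5 + 2\tau$, so for large $n$ the mean weight should be close to 1 and the weighted median close to 1.5. Censored subjects carry zero weight, so only exact failure times enter the weighted empirical distribution.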

    Open Research Badges

    Open Data

    Data Availability Statement

    The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

    This article has earned an Open Data badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available in the Supporting Information section.

    This article has earned an open data badge “Reproducible Research” for making publicly available the code necessary to reproduce the reported results. The results reported in this article could fully be reproduced.
