Funding: This study was supported by National Research Foundation (NRF) of Korea, Korea University, National Research Foundation (NRF) of Korea, and the US National Institutes of Health (Grants RS-2024-00340298, K2201231, 2022M3J6A1063595, 2022R1A2C1008514, and R21DE031879, R01DE031134).

ABSTRACT

This paper introduces a novel approach to estimating censored quantile regression using inverse probability of censoring weighted (IPCW) methodology, specifically tailored for data sets featuring partially interval-censored data. Such data sets, often encountered in HIV/AIDS and cancer biomedical research, may include doubly censored (DC) and partly interval-censored (PIC) endpoints. DC responses involve either left-censoring or right-censoring alongside some exact failure time observations, while PIC responses are subject to interval-censoring. Despite the existence of complex estimating techniques for interval-censored quantile regression, we propose a simple and intuitive IPCW-based method, easily implementable by assigning suitable inverse-probability weights to subjects with exact failure time observations. The resulting estimator exhibits asymptotic properties, such as uniform consistency and weak convergence, and we explore an augmented-IPCW (AIPCW) approach to enhance efficiency. In addition, our method can be adapted for multivariate partially interval-censored data. Simulation studies demonstrate the new procedure's strong finite-sample performance. We illustrate the practical application of our approach through an analysis of progression-free survival endpoints in a phase III clinical trial focusing on metastatic colorectal cancer.

1 Introduction

Partially interval-censored data arise in a variety of medical registries and biomedical studies, including HIV/AIDS and cancer trials, where failure times are precisely observed for some patients but interval-censored for others (Sun 2007; Bogaerts, Komárek, and Lesaffre 2017). Of particular interest in this paper, within the context of the interval-censored setup, are doubly censored (DC) data and partly interval-censored (PIC) data. In addition to a specific number of exact observations (failures/events), DC data are characterized by either left-censoring or right-censoring, while PIC data entail additional interval-censoring. In cases with DC endpoints, the exact event status can only be determined when measurements fall within a specific range. For instance, in HIV/AIDS treatment trials, the efficacy of antiretroviral therapy is often assessed using HIV-1 RNA levels, which are deemed reliable only within a certain measurement limit. Measurements outside of this range are treated as censored, resulting in a DC structure, where the censoring mechanism is administrative or possibly random. Conversely, PIC endpoints may arise when the onset of AIDS or the time to progression-free survival (PFS) in cancer studies can be precisely observed in some patients, while others experience interval-censoring due to periodic hospital visits (Gao, Zeng, and Lin 2017; Pan, Cai, and Wang 2020). In the absence of exact failure/event times, DC and PIC data are reduced to “case-1” and “case-2” interval-censored data, respectively, which have been extensively studied in the survival analysis literature. It is worth noting that the DC sampling scheme under discussion here differs from doubly-interval-censoring (DIC, Kim, De Gruttola, and Lagakos 1993), where both the originating time and the failure time are subject to interval-censoring.

As an example, we analyze a data set derived from a phase III clinical trial focusing on metastatic colorectal cancer (mCRC, Peeters et al. 2010), which was conducted between June 2006 and March 2008, involving a total of 1186 patients. Patients with mCRC underwent initial treatment based on their KRAS status and subsequently received either panitumumab plus fluorouracil, leucovorin, and irinotecan (FOLFIRI) or FOLFIRI alone as a secondary treatment, administered biweekly. Patients demonstrating disease progression at the first evaluation were considered left-censored. Subsequent instances of disease progression at later evaluations were classified as interval-censored. Patients who survived without disease progression at the end of the study were classified as right-censored. Deaths occurring during the study provided exact observation points. As a result, this data set exhibits a blend of DC and PIC data owing to the periodic administration of treatment. See Section 6 for a more detailed description and analysis of this data set.

A variety of statistical models and methods exist to conduct precise inference for partially interval-censored data. Under a single-sample scenario, a nonparametric distribution estimation was rigorously studied based on the self-consistent equation (Turnbull 1974; Gu and Zhang 1993). Hypothesis testing procedures have been developed to compare survival functions with partially or fully interval-censored data in two-sample cases (Pan 2000; Yuen, Shi, and Zhu 2006). For regression analysis, several authors considered a class of semiparametric transformation models using expectation-maximization (EM), or direct maximum likelihood (ML) estimation (Cai and Cheng 2004; Li et al. 2018; Choi and Huang 2021). This class, which includes proportional hazards (PH) and proportional odds (PO) models as special cases, enjoys a statistically efficient likelihood-based inferential framework that yields hazard-based probabilistic interpretation. Although asymptotically efficient, these ML estimation approaches are generally difficult to implement as they may require simultaneous estimation of regression parameters and the nonparametric hazard function. Another modeling approach to this problem is to use an accelerated failure time (AFT) model, which provides a direct evaluation of the association between the event time and covariates. In the context of partial interval-censoring, various methods have been proposed for statistical inference within the AFT modeling framework. These include the Buckley–James method (Choi, Kim, and Choi 2021; Gao, Zeng, and Lin 2017), M-estimation (Zhang and Li 1996; Ren and Gu 1997), kernel-based nonparametric ML estimation (Groeneboom and Hendrickx 2018), and Bayesian methods (Komárek and Lesaffre 2008).

This paper proposes a linear censored quantile regression (CQR) framework (Koenker 2005; Peng and Huang 2008; Wang and Wang 2009; Son et al. 2022) for partially interval-censored data. This approach remains a popular alternative to classical mean-based models in both theoretical and applied statistics. While mean-based models can only characterize the central behavior of the data, CQR allows the analyst to investigate how the entire distribution of the survival time depends on a set of covariates. In addition, this model can be more robust to heterogeneity, outliers, or extreme values by focusing on a few informative quantile levels. These attractive features have stimulated many investigators to study various right-CQR methods. Under interval-censoring, a weighted estimating equation approach can be used to fit quantile regression (QR) models (Frumento 2022; Choi et al. 2024). Several authors have proposed quantile estimation procedures for PIC data, drawing on the recursive weighting method (Cai and Cheng 2004; Lin, He, and Portnoy 2012) and martingale processes (Ji et al. 2012). However, these methods typically assume that censoring times, along with failure times, are known, a condition that is often impractical outside of administrative censoring scenarios. For instance, Lin, He, and Portnoy (2012) assumed knowledge of both failure time and censoring times $(T, L, R)$, while Ji et al. (2012) required at least $L$ to be known. In contrast, our method is applicable even when only $\min (\max (T, L), R)$ is known, and is hence more general. Moreover, one can employ an adaptive quantile loss function for the analysis of case-2 interval-censored data (Zhou, Feng, and Du 2017); however, this approach may experience a significant loss of efficiency because its implementation only utilizes partial information from the quantile order-deterministic cases. A more recent work (Yang, Narisetty, and He 2018) involves parallel estimation algorithms that use data augmentation methods from imputed latent event times to fit multiple quantile estimators.

The simplest and most popular weighting scheme to adjust for right-censoring in survival analysis is the so-called inverse probability of censoring weighting (IPCW) method, which was also adapted to CQR through different versions (Bang and Tsiatis 2002; Peng and Fine 2009). In survival or incomplete data analysis, the inverse-probability weighting method has been widely used as a simple quasi-experimental statistical approach to obtain unbiased results in observational studies. Its simplicity of use and ease of interpretation have engendered considerable research in many areas, such as competing risks (Choi, Kang, and Huang 2018; Choi et al. 2022). Our strategy involves reweighting the complete-case data based on the respective probability estimates of their occurrence when interval-censoring is present. To address DC and PIC structures, our IPCW procedure entails estimating nonparametric left-censored survival functions, employing the “backward” Kaplan–Meier (KM) estimator (Gómez, Julià, and Utzet 1994). Then, we propose a weighted quantile loss function for parameter estimation, whose estimator is shown to satisfy uniform consistency and asymptotic normality. For variance estimation, we use the induced smoothing technique (Chiou, Kang, and Yan 2015) that approximates the nonsmooth estimating equation with an asymptotically equivalent smooth estimating function. We further discuss an augmented-IPCW (AIPCW) estimation approach to gain more efficiency, and show that the proposed method can also be readily adapted to handle multivariate interval-censored data. We perform comprehensive simulation studies to assess the finite-sample performance of our approach. Furthermore, we demonstrate its practical utility by applying it to data obtained from a phase III clinical trial involving mCRC.

The rest of the paper is organized as follows. Section 2 introduces the statistical model and the proposed IPCW estimation procedure, along with asymptotic results for the proposed estimator and the induced smoothing procedure for variance estimation. Section 3 presents an augmentation-based estimator for efficiency gain, while Section 4 extends our framework to multivariate clustered partially interval-censored data. Sections 5 and 6 summarize our simulation study findings and the illustration using the phase III mCRC data set, respectively. Finally, discussion and concluding remarks are presented in Section 7. All technical details are relegated to the Web appendix. R code to implement our method is available at https://github.com/yejikim1202/ipcwqrPIC.

2 Model and Estimation

2.1 Statistical Model for DC and PIC Data

Suppose that there are $n$ random subjects. For the $i$th subject $(i=1,\ldots,n)$, let $T_i$ be a dependent variable of interest, such as log-transformed survival time, and ${\bf x}_i$ be a $p$-vector of covariates. The first element of ${\bf x}_i$ is set to 1 to include the intercept term. Our main objective is to estimate the $p$-dimensional quantile coefficient vector $\bm{\beta }_0(\tau)$ for some $\tau \in [\tau _L,\tau _R]\subset (0, 1)$ in the following linear model:

\begin{align} T_i = {\bf x}_i^T \bm{\beta }_0(\tau) + e_i(\tau),\quad i=1, \ldots,n, \end{align}

(1)

where $e_i(\tau)$ is the random error whose $\tau$th quantile conditional on ${\bf x}_i$ equals 0. If the quantile assumption on $e_i(\tau)$ is replaced by $E[e_i(\tau)]=0$ and log-transformed survival time is used, model (1) corresponds to the familiar AFT model (Chiou, Kang, and Yan 2015). The $\tau$th conditional quantile function of $T_i$ given ${\bf x}_i$ is defined as $Q_{T}(\tau |{\bf x}_i) = \inf \lbrace t: F(t|{\bf x}_i) \ge \tau \rbrace$, where $F(\cdot |{\bf x}_i)$ is the cumulative distribution function of $T_i$ conditional on ${\bf x}_i$. Correspondingly, model (1) amounts to assuming

\begin{align} Q_{T}(\tau | {\bf x}_i) = {\bf x}_i^T \bm{\beta }_0(\tau), \end{align}

(2)

which suggests a new estimation strategy that differs from conventional mean-based approaches to analyzing survival data. Unlike the traditional Cox PH and AFT models, the CQR model (1) relaxes the proportionality constraint on the hazard, and allows for modeling data heterogeneity by evaluating the covariate effects at any level of $\tau$. In this paper, we are primarily interested in the CQR modeling of (i) DC, and (ii) PIC data, which can be formulated as follows.

DC Data:

DC data arise when random censoring can occur from either the left or right side, alongside exact observations. Let $(T_i, L_i, R_i)$ denote a tuple of exact failure time, left-censoring, and right-censoring variables, respectively, with $P(L_i \le R_i)=1$. Under the DC structure, we can only observe $\lbrace (\tilde{T}_i, \delta _i, {\bf x}_i), i=1, \ldots,n\rbrace$, where $\tilde{T}_i = (T_i \wedge R_i) \vee L_i$ is the observed failure time and $\delta _i = (\delta _{1i}, \delta _{2i}, \delta _{3i})$ is the censoring indicator with $\delta _{1i}=I(L_i\le T_i\le R_i)$, $\delta _{2i}=I(T_i> R_i)$, and $\delta _{3i}=1-\delta _{1i}-\delta _{2i}$. Here, we use $a \wedge b = \min (a,b)$ and $a \vee b = \max (a,b)$. Notice that $T_i$ is observable only when $T_i\in (L_i,R_i)$, that is, $\delta _{1i}=1$; otherwise, it is right-censored $(\delta _{2i}=1)$ or left-censored $(\delta _{3i}=1)$. Because $\delta _{1i} + \delta _{2i} + \delta _{3i}=1$, the three censoring statuses are mutually exclusive. If $\delta _{1i}\equiv 0$ for all subjects (i.e., without any exact observations), DC data reduce to so-called “current status” or “case-1” interval-censored data (Groeneboom and Hendrickx 2018), in which all subjects are either left- or right-censored.
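As a minimal sketch of this observation scheme, the following Python snippet maps a latent triple $(T_i, L_i, R_i)$ to the observed pair $(\tilde{T}_i, \delta _i)$. The function name `observe_dc` is hypothetical and is not part of the authors' R code; it only restates the definitions above.

```python
def observe_dc(t, l, r):
    """Return (t_obs, delta) for one subject under double censoring.

    t_obs = max(min(t, r), l); delta = (d1, d2, d3) flags the exact,
    right-censored, and left-censored statuses, as defined in the text.
    """
    if l > r:
        raise ValueError("double censoring requires l <= r")
    t_obs = max(min(t, r), l)
    d1 = int(l <= t <= r)   # exact observation
    d2 = int(t > r)         # right-censored at r
    d3 = 1 - d1 - d2        # left-censored at l
    return t_obs, (d1, d2, d3)


exact = observe_dc(2.0, 1.0, 3.0)
right_censored = observe_dc(5.0, 1.0, 3.0)
left_censored = observe_dc(0.5, 1.0, 3.0)
```

Exactly one component of `delta` equals 1 for every subject, which mirrors the constraint $\delta _{1i} + \delta _{2i} + \delta _{3i}=1$.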

PIC Data:

Unlike DC data, PIC data are a mixture of exact and interval-censored failure times, where $T_i$ can be observed only when it is not interval-censored. Suppose that $(U_i,V_i)$ is the tightest interval that might contain $T_i$ , that is, $T_i\in (U_i,V_i)$ if it is interval-censored. Let $\Delta _i$ be the censoring indicator that takes 1 when $T_i$ is observed, and 0, otherwise. The PIC data can be represented as $\lbrace (\Delta _i, \Delta _i T_i, (1-\Delta _i)U_i, (1-\Delta _i)V_i, {\bf x}_i), \nobreakspace i=1, \ldots,n\rbrace$ . It can also be summarized as $\lbrace (\tilde{U}_i, \tilde{V}_i, \Delta _i, {\bf x}_i), \nobreakspace i=1, \ldots,n\rbrace$ , where $\tilde{U}_{i} = T_i \wedge U_i = \Delta _i T_i + (1-\Delta _i) U_i$ and $\tilde{V}_i = T_i \vee V_i = \Delta _i T_i + (1-\Delta _i) V_i$ . Hence, $T_i$ can be right-censored at $U_i$ , and left-censored at $V_i$ . When $\Delta _{i}\equiv 0$ for all subjects, PIC data reduce to the conventional case-2 interval-censored data.

Remark 1. Note that DC data can be translated to PIC data and vice versa. For example, when $T$ is left-censored at $L$ or right-censored at $R$, it implies $T \in (U, V) \equiv (-\infty, L)$, or $T \in (U, V) \equiv (R, \infty)$, respectively. Therefore, left- and right-censoring can also be seen as interval-censoring, if we allow $U=-\infty$ and $V=\infty$. Conversely, PIC data can be taken as DC because $U$ and $V$ can be right- and left-censored by $T$. In fact, DC complements PIC in the sense that, under DC, $T$ is observable only when $T$ falls in some interval, while, under PIC, $T$ is observable only when $T$ lies outside some interval. Due to this similarity, a unified estimation approach is applicable for analyzing both DC and PIC data.
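The translation from DC to interval form described in Remark 1 can be sketched as follows. The helper `dc_to_interval` is a hypothetical name used only for illustration; an exact observation is represented here as a degenerate interval, which is an assumption of this sketch rather than notation from the paper.

```python
import math


def dc_to_interval(t_obs, delta):
    """Translate one DC observation into an interval (U, V) containing T."""
    d1, d2, d3 = delta
    if d1 + d2 + d3 != 1:
        raise ValueError("exactly one censoring status must be set")
    if d1 == 1:
        return (t_obs, t_obs)        # exact: degenerate interval
    if d2 == 1:
        return (t_obs, math.inf)     # right-censored at R: T in (R, inf)
    return (-math.inf, t_obs)        # left-censored at L: T in (-inf, L)


interval_exact = dc_to_interval(2.0, (1, 0, 0))
interval_right = dc_to_interval(3.0, (0, 1, 0))
interval_left = dc_to_interval(1.0, (0, 0, 1))
```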

Remark 2. Throughout the paper, it is assumed that the visit process $\lbrace W_k\rbrace$ that generates the censoring structure is independent of $T$ given ${\bf x}$. To be specific, let us denote a sequence of examination times by $0 < W_{1} < \ldots < W_{K} < \infty$ that gives rise to the interval $(U,V)$ for PIC data, where $U = \max _k\lbrace W_{k}: W_{k} \le T \rbrace$ and $V = \min _k\lbrace W_{k}: W_{k} \ge T\rbrace$. Therefore, the choice of $(U,V)$ depends on $T$, although the joint distribution of $(W_{1},\ldots,W_{K})$ is independent of $T$. Conversely, it follows that $L=\min _k\lbrace W_k:W_k\ge T\rbrace$ and $R=\max _k\lbrace W_k:W_k\le T\rbrace$ for DC data. We assume that the proportion of obtaining exact observations is not negligible, and the joint distribution of $(W_{1},\ldots,W_{K})$ is independent of $T$ given ${\bf x}$ for censored subjects. We shall express this independence situation as $(L,R)\perp \!\!\!\perp T|{\bf x}$ for DC data and $(U,V)\perp \!\!\!\perp T|{\bf x}$ for PIC data. This implies that the paired censoring variables do not provide any additional information regarding the distribution of $T$, other than the fact that it is bracketed (Zhang and Heitjan 2006).

2.2 Estimation

Without censoring, one may directly apply the standard estimating technique for QR, which locates $\bm{\beta }_0(\tau)$ as the minimizer of $n^{-1} \sum _{i=1}^n \rho _{\tau } (T_{i} - {\bf x}_i^T \bm{\beta })$, where $\rho _{\tau }(u) = u \lbrace \tau - I(u \le 0) \rbrace$ is the check loss function, or equivalently the solution to the estimating equation

\begin{align} n^{-1/2} \sum _{i=1}^n {\bf x}_i \lbrace I(T_i -{\bf x}_i^T \bm{\beta }\le 0) - \tau \rbrace \approx 0. \end{align}

(3)
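To make the uncensored case concrete, the sketch below evaluates the check loss $\rho _{\tau }$ and, for an intercept-only model, finds its minimizer by searching over the observed values. The function names are hypothetical, and restricting the search to data points is an illustrative shortcut that relies on the fact that a minimizer of the summed check loss can always be found at an observed value.

```python
def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - I(u <= 0))."""
    return u * (tau - (1.0 if u <= 0 else 0.0))


def sample_quantile_by_check_loss(values, tau):
    """Intercept-only quantile fit: minimize the summed check loss over data points."""
    return min(values, key=lambda b: sum(check_loss(t - b, tau) for t in values))


data = [2.0, 7.0, 1.0, 9.0, 4.0]
median = sample_quantile_by_check_loss(data, 0.5)
upper = sample_quantile_by_check_loss(data, 0.75)
```

With covariates, the same objective is minimized over a coefficient vector, which is what standard QR software does.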

To handle the complex interval-censoring problem, we propose to modify the estimating Equation (3) by using an IPCW technique. For the DC data type, let $S_R(t|{\bf x})=P(R\ge t|{\bf x})$ and $S_L(t|{\bf x})=P(L\ge t|{\bf x})$ be the survival functions of the right- and left-censoring variables, $R$ and $L$, respectively, given ${\bf x}$, and let $\hat{S}_R(t|{\bf x})$ and $\hat{S}_L(t|{\bf x})$ denote their consistent estimates. Similarly, we can define $S_U(t|{\bf x})$, $S_V(t|{\bf x})$, $F_V(t|{\bf x}) = 1 - S_V(t|{\bf x})$, and their estimates for PIC data. Under DC or PIC, we propose to solve the following IPCW estimating function:

\begin{align} {\bf U}_{n}(\bm{\beta },\tau) = n^{-1/2} \sum _{i=1}^n {\bf x}_i \lbrace \hat{w}_i I(\tilde{T}_i - {\bf x}_i^T \bm{\beta }\le 0) - \tau \rbrace \approx 0, \end{align}

(4)

where

\begin{align} \hat{w}_i = \displaystyle {\begin{cases} \dfrac{ \delta _{1i}}{\hat{S}_{R}(\tilde{T}_i|{\bf x}_i) - \hat{S}_{L}(\tilde{T}_i|{\bf x}_i)}, & \mbox{for DC data,} \\[6pt] \dfrac{\Delta _{i}}{\hat{F}_{V}(\tilde{T}_i|{\bf x}_i) + \hat{S}_{U}(\tilde{T}_i|{\bf x}_i)}, & \mbox{for PIC data.} \end{cases}} \end{align}

(5)

For PIC data, we may define $\tilde{T}_i=\tilde{U}_i$ or $\tilde{T}_i=\tilde{V}_i$, since calculation of $\hat{w}_i$ is needed only when $\Delta _i=1$, for which $\tilde{U}_i=\tilde{V}_i=T_i$. Unbiasedness of the weighting schemes in (5) follows easily using a conditioning argument. For DC data, we have

\begin{eqnarray*} && E {\left[ \dfrac{I(\tilde{T}\le t,\delta _1=1)}{S_R(\tilde{T}|{\bf x})-S_L(\tilde{T}|{\bf x})} \bigg | {\bf x}\right]}\\ &&\quad = E {\left[ E{\left\lbrace \dfrac{ I(T\le t, L< T<R)}{S_R(T|{\bf x})-S_L(T|{\bf x})} \bigg |T,{\bf x}\right\rbrace} \bigg |{\bf x}\right]} \\ &&\quad = E {\left[ \dfrac{ I(T\le t)\lbrace S_R(T|{\bf x})-S_L(T|{\bf x})\rbrace }{S_R(T|{\bf x})-S_L(T|{\bf x})} \bigg |{\bf x}\right]} = P(T \le t |{\bf x}) \end{eqnarray*}

under the independence assumption between $T_i$ and $(L_i,R_i)$ given ${\bf x}$. Similarly, for PIC data, it can be seen that

\begin{align*} & E {\left[ \dfrac{I(\tilde{T}\le t,\Delta =1)}{F_{V}(\tilde{T}|{\bf x})+ S_{U}(\tilde{T}|{\bf x}) } \bigg | {\bf x}\right]} \\ & = E{\left[ E {\left\lbrace \dfrac{I(T\le t)\lbrace I(T\le U)+I(T>V)\rbrace }{1- S_{V}(T|{\bf x})+ S_{U}(T|{\bf x}) } \bigg |T,{\bf x}\right\rbrace} \bigg | {\bf x}\right]}\\ &= E{\left[ \dfrac{ I(T\le t)\lbrace 1- S_{V}(T|{\bf x})+ S_{U}(T|{\bf x})\rbrace }{1- S_{V}(T|{\bf x})+ S_{U}(T|{\bf x}) } \bigg | {\bf x}\right]} = P(T \le t | {\bf x}). \end{align*}
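The conditioning argument for DC data can be checked numerically with a small Monte Carlo sketch. Everything in the simulation design is an assumption made for illustration, not taken from the paper: $T \sim \text{Uniform}(0.2, 1.3)$, $L \sim \text{Uniform}(0, 0.5)$, $R = L + 1$, no covariates, and the true censoring survival functions are used in place of estimates. The design keeps $S_R(t) - S_L(t) \ge 0.4$ on the support of $T$, so the weights stay bounded.

```python
import random


def s_left(t):
    """True S_L(t) = P(L >= t) for L ~ Uniform(0, 0.5) (simulation assumption)."""
    return min(1.0, max(0.0, 1.0 - 2.0 * t))


def s_right(t):
    """True S_R(t) = P(R >= t) for R = L + 1 (simulation assumption)."""
    return s_left(t - 1.0)


def ipcw_cdf_estimate(t0, n=20000, seed=0):
    """Average of I(T <= t0, delta_1 = 1) / {S_R(T) - S_L(T)} over n simulated subjects."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        t = rng.uniform(0.2, 1.3)
        l = rng.uniform(0.0, 0.5)
        r = l + 1.0
        if l < t < r and t <= t0:   # exact observation falling below t0
            total += 1.0 / (s_right(t) - s_left(t))
    return total / n


estimate = ipcw_cdf_estimate(0.5)
truth = (0.5 - 0.2) / (1.3 - 0.2)   # P(T <= 0.5) under the assumed design
```

Under these assumptions, the weighted average over exactly observed subjects should be close to the true value of $P(T\le 0.5)$, even though only a subset of subjects contributes.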

Although Equation (4) is monotone, the exact zero-crossing of ${\bf U}_n(\bm{\beta },\tau)$ usually does not exist. Instead, it is equivalent to the gradient of the $l_1$-type convex function (Peng and Fine 2009)

\begin{eqnarray} Q_{n}(\bm{\beta },\tau) &=& \sum _{i=1}^n \hat{w}_i \big\vert \tilde{T}_i- {\bf x}_i^T \bm{\beta }\big\vert + \Bigg\vert M^* - \sum _{j=1}^n (-\hat{w}_j{\bf x}_j)^T \bm{\beta }\Bigg\vert \nonumber\\ && +\, \Bigg\vert M^*-(2\tau) \sum _{k=1}^n {\bf x}_k^T \bm{\beta }\Bigg\vert , \end{eqnarray}

(6)

where $M^*>0$ is a sufficiently large value that bounds both $|\sum _{j=1}^n (-\hat{w}_j{\bf x}_j)^T\bm{\beta }|$ and $|(2\tau)\sum _{k=1}^n {\bf x}_k^T\bm{\beta }|$ from above for any $\bm{\beta }$ in the compact parameter space $\mathbb {B}$ for $\bm{\beta }_0(\tau)$. Minimization of (6) can be easily implemented using standard software for $l_1$-type optimization, or the rq() function in the R package quantreg (Koenker 2005). Therefore, we define the proposed IPCW estimator as $\hat{\bm{\beta }}(\tau) = \arg \min _{\bm{\beta }\in \mathbb {B}} Q_{n}(\bm{\beta },\tau)$.
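For intuition, consider the intercept-only special case (${\bf x}_i \equiv 1$), where the estimating Equation (4) reduces to finding the smallest value $b$ at which $\sum _i \hat{w}_i I(\tilde{T}_i \le b)$ reaches $n\tau$. The sketch below implements only this special case; the function name is hypothetical, and the general regression problem would still call for an $l_1$ solver such as rq().

```python
def ipcw_quantile_intercept_only(t_obs, weights, tau):
    """Smallest observed b with sum_i w_i * I(t_i <= b) >= n * tau.

    Censored subjects carry weight 0 and are skipped. Returns None when the
    total weight never reaches n * tau (the quantile is not identified).
    """
    n = len(t_obs)
    pairs = sorted((t, w) for t, w in zip(t_obs, weights) if w > 0)
    cumulative = 0.0
    for t, w in pairs:
        cumulative += w
        if cumulative >= n * tau:
            return t
    return None


times = [1.0, 2.0, 3.0, 4.0, 5.0]
q_unweighted = ipcw_quantile_intercept_only(times, [1, 1, 1, 1, 1], 0.5)
q_weighted = ipcw_quantile_intercept_only(times, [2, 0, 2, 0, 1], 0.9)
q_missing = ipcw_quantile_intercept_only(times, [1, 0, 1, 0, 0], 0.5)
```

When all weights equal 1, the function returns an ordinary sample quantile, which matches the uncensored estimating Equation (3).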

In the rest of the paper, we will focus on DC data for ease of presentation, since very similar techniques can be employed to analyze PIC endpoints. To solve Equation (4) with DC data, we need $\hat{S}_R(t|{\bf x})$ and $\hat{S}_L(t|{\bf x})$, some reasonable estimates of $S_R(t|{\bf x})$ and $S_L(t|{\bf x})$, which can be obtained via various methods. For example, if the censoring mechanism depends on a set of discrete covariates, they can be estimated nonparametrically within each data stratum defined by the values of these discrete covariates. In the case that the underlying censoring mechanism involves continuous covariates, we might use parametric or semiparametric methods, such as Cox models. See Remark 3 in the following for available nonparametric approaches. In the sequel, we assume (for simplicity) unconditional independence between $T_i$ and $(L_i,R_i)$, such that $\hat{S}_R(t|{\bf x})$ and $\hat{S}_L(t|{\bf x})$ may be replaced by simple KM-type estimators, $\hat{S}_R(t)$ and $\hat{S}_L(t)$, respectively. Then, the IPCW estimating Equation (4) for DC data is given by

\begin{align} {\bf U}_{n}(\bm{\beta },\tau) = n^{-1/2} \sum _{i=1}^n {\bf x}_i {\left\lbrace \dfrac{ \delta _{1i}}{\hat{S}_{R}(\tilde{T}_i)-\hat{S}_{L}(\tilde{T}_i)} I(\tilde{T}_i -{\bf x}_i^T\bm{\beta }\le 0)-\tau \!\!\right\rbrace} \approx 0. \end{align}

(7)

The calculation of the KM estimator for the right-censored survival function is straightforward, that is, $\hat{S}_R(t) = \prod _{u<t} \lbrace 1 - dN^R(u)/Y(u)\rbrace$, where $N^R(u) = \sum _{i=1}^n N_i^R(u) = \sum _{i=1}^n I(\tilde{T}_i\le u,\delta _{2i}=1)$ and $Y(u) = \sum _{i=1}^n Y_i(u) = \sum _{i=1}^n I(\tilde{T}_i\ge u)$. However, the nonparametric estimation of the left-censored survival function is not so simple, and a number of approaches have been proposed (Gómez, Julià, and Utzet 1994). The most cited and intuitive approach is to use the “backward” KM estimator, that is, transform left-censored data into right-censored data by multiplying each datum by $-1$, and then use the KM method. On the original scale, the estimator of $S_L(t)$ is then given by $\hat{S}_L(t)=1-\hat{S}_{\it KM}(-t)$, where $\hat{S}_{\it KM}(-t)$ denotes a KM estimate based on the left-censored data multiplied by $-1$. More specifically, $\hat{S}_L(t) = 1-\prod _{u>t}\lbrace 1-dN^L(u)/(n+1-Y(u))\rbrace$, where $N^L(u) = \sum _{i=1}^n N_i^L(u) = \sum _{i=1}^n I(\tilde{T}_i\ge u,\delta _{3i}=1)$. Notice that the conventional KM method can be used to estimate $S_V(t)$ with PIC data, whereas the backward KM method should be applied for $S_U(t)$.
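The two product-limit formulas above can be sketched directly in Python. This is a literal transcription under the simplifying assumption that observed times have no ties, so each jump $dN^R(u)$ or $dN^L(u)$ equals 1; the function name and the toy data are hypothetical.

```python
def km_censoring_survival(t_obs, d2, d3):
    """Return (s_right, s_left): KM and backward KM estimates of S_R and S_L.

    t_obs: observed times (assumed distinct); d2, d3: right- and
    left-censoring indicators, as in the DC data definition.
    """
    n = len(t_obs)

    def at_risk(u):
        # Y(u) = number of subjects with observed time >= u
        return sum(1 for s in t_obs if s >= u)

    def s_right(t):
        prod = 1.0
        for s, flag in zip(t_obs, d2):
            if flag == 1 and s < t:
                prod *= 1.0 - 1.0 / at_risk(s)
        return prod

    def s_left(t):
        prod = 1.0
        for s, flag in zip(t_obs, d3):
            if flag == 1 and s > t:
                prod *= 1.0 - 1.0 / (n + 1 - at_risk(s))
        return 1.0 - prod

    return s_right, s_left


# Toy data: left-censored at 1 and 3, right-censored at 4 and 6, exact at 2 and 5.
toy_times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_d2 = [0, 0, 0, 1, 0, 1]
toy_d3 = [1, 0, 1, 0, 0, 0]
s_right, s_left = km_censoring_survival(toy_times, toy_d2, toy_d3)
weight_at_2 = 1.0 / (s_right(2.0) - s_left(2.0))
weight_at_5 = 1.0 / (s_right(5.0) - s_left(5.0))
```

The last two lines form the DC weights in (7) for the exactly observed subjects; censored subjects receive weight 0.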

Remark 3. In the case that the visit process generating $(L_i,R_i)$ is independent of $T_i$ given ${\bf x}_i$, one might use Beran's local KM estimator (Beran 1981), that is,

\begin{align} \hat{S}_R(t|{\bf x}) = \prod _{j=1}^n {\left\lbrace 1-\dfrac{B_{nj}({\bf x})}{\sum _{k=1}^n I(\tilde{T}_k \ge \tilde{T}_j)B_{nk}({\bf x})} \right\rbrace} ^{I(\tilde{T}_j \le t,\delta _{2j}=1)} \end{align}

(8)

and

\begin{align*} \hat{S}_L(t|{\bf x}) = 1-\prod _{j=1}^n {\left\lbrace 1-\dfrac{B_{nj}({\bf x})}{\sum _{k=1}^n I(\tilde{T}_k \le \tilde{T}_j)B_{nk}({\bf x})} \right\rbrace} ^{I(\tilde{T}_j\ge t,\delta _{3j}=1)}, \end{align*}

where $B_{nj}({\bf x})$ is a sequence of nonnegative weights adding up to 1. For example, we can employ the commonly used Nadaraya–Watson-type weight, that is, $B_{nj}({\bf x}) = K \left(\frac{{\bf x}-{\bf x}_j}{h_n} \right) /\sum _{k=1}^n K \left(\frac{ {\bf x}- {\bf x}_k}{h_n} \right)$, where $K(\cdot)$ is a kernel density function and $h_n \in \mathbb {R}^+$ is the bandwidth converging to zero as $n \rightarrow \infty$. By plugging these local KM estimators into the estimating function (7), we can obtain a nonparametric covariate-adjusted IPCW estimator. Another viable alternative is to employ random forest approaches for nonparametric survival prediction (Ishwaran et al. 2008). This recursive partitioning method is effective, computationally feasible, and accommodates the dependence of censoring on covariates, even in higher dimensions.
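A minimal sketch of the Nadaraya–Watson-type weights $B_{nj}({\bf x})$ for a single continuous covariate is given below. The Gaussian kernel and the function name are assumptions made for illustration; the paper does not prescribe a particular kernel.

```python
import math


def nw_weights(x0, xs, h):
    """Nadaraya-Watson weights B_nj(x0) with a Gaussian kernel and bandwidth h."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    kernel_values = [math.exp(-0.5 * ((x0 - xj) / h) ** 2) for xj in xs]
    total = sum(kernel_values)
    if total == 0.0:
        raise ValueError("all kernel values underflowed; increase h")
    return [k / total for k in kernel_values]


weights = nw_weights(1.0, [0.0, 1.0, 2.0], h=1.0)
```

Subjects whose covariate value is close to the target point receive larger weights, and a smaller bandwidth concentrates the weight on fewer neighbors.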

2.3 Asymptotic Results

This section provides asymptotic results of the proposed IPCW estimator for DC endpoints. Denote the Euclidean norm by $\Vert \cdot \Vert$, and let ${\bf a}^{\otimes 2} = {\bf aa}^T$ for a vector ${\bf a}$. We first impose the following regularity conditions:

(C1) The joint distribution function of $(L, R)$ is continuous. There exists $u\in (0,\infty)$ , such that $P(R-L > u |{\bf x}) =1$ . There also exist $-\infty < v_1 \le v_2 \le v<\infty$ such that $P(v_1 < L\le v_2|{\bf x}) = 1$ and $P(R\le v | {\bf x}) = 1$ .
(C2) The covariate ${\bf x}$ is uniformly bounded, that is, $\sup _i \Vert {\bf x}_i \Vert <\infty$ .
(C3) (i) The quantile coefficient $\bm{\beta }_0(\tau)$ is Lipschitz continuous for $\tau \in [\tau _L, \tau _R] \subset (0,1)$ ; (ii) $f(t|{\bf x})$ is bounded above uniformly in $t$ and ${\bf x}$ , where $f(t|{\bf x}) = dF(t|{\bf x})/dt$ .
(C4) For some $\rho _0>0$ and $c_0 > 0$ , $\inf _{\bm{\beta }\in \mathbb {B}(\rho _0)}\text{eigmin} \, {\bf A}\lbrace \bm{\beta }(\tau)\rbrace \ge c_0$ , where $\mathbb {B}(\rho) = \lbrace \bm{\beta }\in \mathbb {R}^{p}:\inf _{\tau \in [\tau _L,\tau _R]} \Vert \bm{\beta }(\tau)-\bm{\beta }_0(\tau)\Vert \le \rho \rbrace$ and ${\bf A}\lbrace \bm{\beta }(\tau)\rbrace = E[{\bf x}^{\otimes 2} f({\bf x}^T\bm{\beta }|{\bf x})]$ . Here, eigmin $(\cdot)$ denotes the minimum eigenvalue of a matrix.

Note that condition (C1) simplifies theoretical arguments and is satisfied in many clinical settings with administrative censoring. Conditions (C2) and (C3) are typical assumptions in many QR methods for the boundedness of covariates, the smoothness of coefficient processes, and the uniform boundedness of the density function $f(\cdot)$. Condition (C4) should be imposed, such that the asymptotic limit of $Q_{n}(\bm{\beta },\tau)$ is strictly convex in a neighborhood of $\bm{\beta }_0(\tau)$ for $\tau \in [\tau _L,\tau _R]$. This condition implies that ${\bf U}_{n}(\bm{\beta },\tau)$ at any $\bm{\beta }(\tau)$ other than $\bm{\beta }_0(\tau)$ is far from its minimum as $n$ goes to infinity. Thus, it ensures not only the identifiability of $\bm{\beta }_0(\tau)$, but also the consistency of $\hat{\bm{\beta }}(\tau)$. In addition, it should also be noted that condition (C4) holds when $E({\bf x}^{\otimes 2})$ is positive-definite and $\inf _{\bm{\beta }\in \mathbb {B}(\rho _0),{\bf x}}f({\bf x}^T\bm{\beta }|{\bf x})$ is bounded below by a positive constant. We then claim the consistency of $\hat{\bm{\beta }}(\tau)$ in Theorem 1.

Theorem 1. Under regularity conditions (C1)–(C4), $\lim _{n\rightarrow \infty } \sup _{\tau \in [\tau _L,\tau _R]} \Vert \hat{\bm{\beta }}(\tau) - \bm{\beta }_0(\tau)\Vert \rightarrow _p 0$, assuming model (2) holds for $\tau \in [\tau _L,\tau _R]$.

To study the asymptotic normality of the proposed estimator, we use the counting process and associated martingale theory (Fleming and Harrington 1991). Based on the natural filtration $\mathcal {F}^R_t=\sigma \lbrace N^R_i(u),Y_i(u); u\le t, i=1, \ldots,n\rbrace$, we define $M_i^R(t)= N_i^R(t)- \int _{-\infty }^t Y_i(u)\lambda ^R(u)du$, where $\lambda ^R(t)=\lim _{h\rightarrow 0}P(u\le R< u+h|R\ge u)/h$. Likewise, we define the reversed filtration $\mathcal {F}^L_t=\sigma \lbrace N^L_i(u),Y_i(u); u\ge t, i=1, \ldots,n\rbrace$ and the martingale process $M_i^L(t)= N_i^L(t)- \int _t^\infty (1-Y_i(u)) \lambda ^L(u)du$, where $\lambda ^L(t)=\lim _{h\rightarrow 0}P(u-h<L\le u|L\le u)/h$. The definitions of $\mathcal {F}^L_t$ and $\lambda ^L(t)$ are somewhat hypothetical due to their dependence on future information, but standard martingale theory may also be used by reading the data backward in time. The following theorem states the asymptotic normality of $\hat{\bm{\beta }}(\tau)$.

Theorem 2. Under regularity conditions (C1)–(C4), $n^{1/2} \lbrace \hat{\bm{\beta }}(\tau) - \bm{\beta }_0(\tau)\rbrace$ converges weakly to a zero-mean Gaussian process for $\tau,\tau ^{\prime }\in [\tau _L,\tau _R]$, with covariance

\begin{equation*} \Psi (\tau,\tau ^{\prime }) = {\bf A}\lbrace \bm{\beta }_0(\tau)\rbrace ^{-1} E\lbrace \bm{\xi }_1(\tau) \bm{\xi }_1(\tau ^{\prime })^T \rbrace ({\bf A}\lbrace \bm{\beta }_0(\tau ^{\prime })\rbrace ^{-1})^T, \end{equation*}

where the expression for $\bm{\xi }_i(\tau)$ is given in the Appendix.

Detailed proofs of Theorems 1 and 2 and associated lemma are relegated to the Appendix.

2.4 Variance Estimation via Induced Smoothing

For variance estimation, we may use an induced smoothing approach (Chiou, Kang, and Yan 2015; Choi, Kang, and Huang 2018) by approximating the nonsmoothed estimating equation in (7) with an asymptotically equivalent smoothed function. Let ${\bf z}$ be an $N(0,I_p)$ random vector independent of the data, where $I_p$ denotes the $p\times p$ identity matrix, and $\Sigma$ be a $p\times p$ symmetric, positive-definite matrix with $\Vert \Sigma \Vert =O(n^{-1})$. Let $\Phi (\cdot)$ and $\phi (\cdot)$ be the cumulative distribution and density function of the standard multivariate normal variable ${\bf z}$. Since $\hat{\bm{\beta }}\approx \bm{\beta }_0+\Sigma ^{1/2} {\bf z}$ with $\Sigma =n^{-1}\Psi$, which is implied by Theorem 2, we can approximate ${\bf U}_n(\bm{\beta },\tau)$ by $\tilde{{\bf U}}_n(\bm{\beta },\Sigma,\tau)=E_Z[{\bf U}_n(\bm{\beta }+\Sigma ^{1/2}{\bf z},\tau)]$, which gives

\begin{equation} \tilde{{\bf U}}_{n}(\bm{\beta },\Sigma,\tau) = n^{-1} \sum _{i=1}^{n} {\bf x}_i \Bigg \lbrace \hat{w}_i \Phi {\left(-\dfrac{T_i - {\bf x}_i^T\bm{\beta }}{\sqrt {{\bf x}_i^T\Sigma {\bf x}_i}} \right)} - \tau \Bigg \rbrace . \end{equation}

(9)

The partial derivative of this formulation can be explicitly expressed as

\begin{equation*} \tilde{A}_n(\bm{\beta },\Sigma,\tau) = \dfrac{\partial \tilde{{\bf U}}_n(\bm{\beta },\Sigma,\tau)}{\partial \bm{\beta }} \approx n^{-1} \sum _{i=1}^n \phi {\left(-\dfrac{T_i - {\bf x}_i^T\bm{\beta }}{\sqrt {{\bf x}_i^T\Sigma {\bf x}_i}} \right)} \dfrac{{\bf x}_i{\bf x}_i^T}{\sqrt {{\bf x}_i^T\Sigma {\bf x}_i}}. \end{equation*}

Moreover, $\Gamma (\tau)\equiv \lim _{n\rightarrow \infty }\text{var}\lbrace n^{1/2}{\bf U}_n(\bm{\beta }_0,\tau)\rbrace$ can be approximated by

\begin{align*} \hat{\Gamma }_n(\hat{\bm{\beta }},\tau) & = n^{-1} \sum _{i=1}^n {\bf x}_i^{\otimes 2} {\left[\hat{w}_i I(\tilde{T}_i\le {\bf x}_i^T\hat{\bm{\beta }}) -\tau \right]}^2 \\ &\quad -\,n^{-1}\int _{-\infty }^\infty \frac{dN^R(u)}{Y^2(u)} \lbrace \hat{B}^R(\hat{\bm{\beta }},u)\rbrace ^{\otimes 2}\\ &\quad +\, n^{-1}\int _{-\infty }^\infty \frac{dN^L(u)}{(n+1-Y(u))^2} \lbrace \hat{B}^L(\hat{\bm{\beta }},u)\rbrace ^{\otimes 2}, \end{align*}

where

\hat{B}^R(\bm{\beta },u) = \sum _{i=1}^n \hat{w}_i Y_i(u) {\bf x}_i I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta })

and

\hat{B}^L(\bm{\beta },u) = \sum _{i=1}^n \hat{w}_{i} \lbrace 1-Y_i(u)\rbrace {\bf x}_i I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta })

. The inferential procedure with induced smoothing proceeds iteratively as follows:

Step 1. Let $\tilde{\bm{\beta }}_{(0)}=\hat{\bm{\beta }}$ and $\tilde{\Sigma }_{(0)} = n^{-1} I_p$ .
Step 2. Given $\tilde{\bm{\beta }}_{(k-1)}$ and $\tilde{\Sigma }_{(k-1)}$ from the $(k-1)$ th step, update $\tilde{\bm{\beta }}_{(k)}$ and $\tilde{\Sigma }_{(k)}$ as
$\begin{aligned} \tilde{\bm{\beta }}_{(k)} &\leftarrow \tilde{\bm{\beta }}_{(k-1)} - \lbrace \tilde{A}_n (\tilde{\bm{\beta }}_{(k-1)},\tilde{\Sigma }_{(k-1)},\tau)\rbrace ^{-1}\, \tilde{{\bf U}}_n (\tilde{\bm{\beta }}_{(k-1)},\tilde{\Sigma }_{(k-1)},\tau), \\ \tilde{\Sigma }_{(k)} &\leftarrow n^{-1}\lbrace \tilde{A}_n(\tilde{\bm{\beta }}_{(k-1)},\tilde{\Sigma }_{(k-1)},\tau)\rbrace ^{-1}\, \hat{\Gamma }_n (\tilde{\bm{\beta }}_{(k-1)},\tau)\, \lbrace \tilde{A}_n(\tilde{\bm{\beta }}_{(k-1)},\tilde{\Sigma }_{(k-1)},\tau)\rbrace ^{-1}. \end{aligned}$
Step 3. Set $k\leftarrow k+1$ and repeat Step 2 until convergence.

Let $\tilde{\bm{\beta }}$ and $\tilde{\Sigma }$ denote the smoothed estimators at convergence. Note that the variance estimator $\tilde{\Psi }=n\tilde{\Sigma }$ is obtained as a byproduct while performing this iterative procedure. Since $\tilde{\Psi }$ is consistent for $\Psi$ and $\Vert \hat{\bm{\beta }}-\tilde{\bm{\beta }}\Vert =O_p(n^{-1/2})$ , $\tilde{\Psi }$ may be used as a variance estimator for $\hat{\bm{\beta }}$ . In practice, the induced smoothing procedure converges very quickly, and our simulation results confirm that variance estimates are fairly accurate and stable.
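
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights `w` are supplied by the user, the slope is obtained by differentiating (9) directly (so each term carries its weight), and `G` keeps only the leading term of $\hat{\Gamma }_n$ . That leading term is the whole of $\hat{\Gamma }_n$ only when there is no censoring and all weights equal 1, because the two integral terms then vanish. The names `pieces`, `induced_smoothing`, and `simulate` are hypothetical.

```python
import math
import random

def Phi(z):  # standard normal distribution function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi(z):  # standard normal density
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def inverse(M):
    """Gauss-Jordan inverse of a small square matrix (list of lists)."""
    p = len(M)
    A = [list(M[i]) + [float(i == j) for j in range(p)] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        d = A[c][c]
        A[c] = [v / d for v in A[c]]
        for r in range(p):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [row[p:] for row in A]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def pieces(beta, Sigma, tau, T, X, w):
    """Smoothed U from (9), its slope A, and the leading term G of Gamma-hat."""
    n, p = len(T), len(beta)
    U = [0.0] * p
    A = [[0.0] * p for _ in range(p)]
    G = [[0.0] * p for _ in range(p)]
    for t, x, wi in zip(T, X, w):
        fit = sum(xj * bj for xj, bj in zip(x, beta))
        s = math.sqrt(sum(x[a] * Sigma[a][b] * x[b] for a in range(p) for b in range(p)))
        z = -(t - fit) / s
        resid = wi * (1.0 if t <= fit else 0.0) - tau
        for a in range(p):
            U[a] += x[a] * (wi * Phi(z) - tau) / n
            for b in range(p):
                A[a][b] += wi * phi(z) * x[a] * x[b] / (s * n)
                G[a][b] += x[a] * x[b] * resid ** 2 / n
    return U, A, G

def induced_smoothing(beta0, tau, T, X, w, max_iter=50, tol=1e-8):
    n, p = len(T), len(beta0)
    beta = list(beta0)
    Sigma = [[float(i == j) / n for j in range(p)] for i in range(p)]  # Step 1
    for _ in range(max_iter):
        U, A, G = pieces(beta, Sigma, tau, T, X, w)   # evaluated at step k-1
        Ainv = inverse(A)
        step = [sum(Ainv[a][b] * U[b] for b in range(p)) for a in range(p)]
        beta = [b - s for b, s in zip(beta, step)]                     # Step 2
        Sigma = [[v / n for v in row] for row in matmul(matmul(Ainv, G), Ainv)]
        if max(abs(s) for s in step) < tol:                            # Step 3
            break
    return beta, Sigma

def simulate(n, seed=1):
    """Uncensored toy data: T = 10 + x1 + N(0,1), so the median line is (10, 1)."""
    rng = random.Random(seed)
    X = [[1.0, rng.uniform(-0.7, 1.5)] for _ in range(n)]
    T = [10.0 + x[1] + rng.gauss(0.0, 1.0) for x in X]
    return T, X
```

For example, `T, X = simulate(400)` followed by `induced_smoothing([9.9, 1.1], 0.5, T, X, [1.0] * 400)` returns the smoothed estimate together with $\tilde{\Sigma }$ ; multiplying the latter by $n$ gives the analogue of $\tilde{\Psi }$ in this simplified setting. The starting value is a placeholder for the nonsmoothed estimate $\hat{\bm{\beta }}$ used in Step 1.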

3 Augmentation-Based Estimation

The proposed IPCW estimator is generally statistically inefficient because Equation (7) involves only expressions with noncensored data. The only information obtained from the censored observations is in estimating

S_L

and

S_R

. This section suggests an AIPCW estimator based on both censored and uncensored data. To implement the AIPCW approach, we need two procedures, (i) positing a working statistical model for

(T,{\bf x})

and (ii) defining two expectations,

Q_1(t, \bm{\beta }, H_i) = E [ {\bf x}_i \lbrace I (T_i -{\bf x}_i^T\bm{\beta }\le 0)-\tau \rbrace \nobreakspace | \nobreakspace T_i \ge t, H_i]

and

Q_2(t, \bm{\beta }, H_i) = E [ {\bf x}_i \lbrace I (T_i -{\bf x}_i^T\bm{\beta }\le 0)-\tau \rbrace \nobreakspace | \nobreakspace T_i \le t, H_i]

, where

H_i=(\tilde{T}_i, \delta _i, {\bf x}_i)

denotes the observed data. Let

\lbrace p(h;\psi); \psi \in \mathbb {R}^q\rbrace

be a posited model for the distribution of

H_i

, and let

\hat{\psi }

be the ML estimator for this model, with

\psi ^*

as its limit, satisfying

n^{1/2}(\hat{\psi }-\psi ^*)=O_p(1)

. Then, following Robins and Rotnitzky (1992), we define the AIPCW estimator

\hat{\bm{\beta }}^*(\tau)

as the solution to the augmented estimating equation

\begin{equation} \begin{split} {\bf U}^{*}_n(\bm{\beta },\tau) =\, & n^{-1} \sum _{i=1}^n {\left[ {\bf x}_i {\left\lbrace \dfrac{ \delta _{1i}}{\hat{S}_{R}(\tilde{T}_i)-\hat{S}_{L}(\tilde{T}_i)} I (\tilde{T}_i -{\bf x}_i^T\bm{\beta }\le 0)-\tau \right\rbrace} \right.} \\ & + \int _{-\infty }^{\infty } \hat{Q}_1 (t, \hat{\bm{\beta }}, \hat{\psi }, H_i) \dfrac{d\hat{M}^R_{i}(t)}{ \hat{S}_{R}(t)} \\ &{\left. + \int _{-\infty }^{\infty } \hat{Q}_2 (t, \hat{\bm{\beta }}, \hat{\psi }, H_i) \dfrac{d\hat{M}^L_{i}(t)}{ \hat{S}_L(t)} \right]}, \end{split} \end{equation}

(10)

where

\hat{M}_i^R(t)= N_i^R(t)- \int _{-\infty }^t Y_i(u)\hat{\lambda }^R(u)du

and

\hat{M}_i^L(t)= N_i^L(t)- \int _t^\infty (1-Y_i(u)) \hat{\lambda }^L(u)du

with

\hat{\lambda }^R(t)=dN^R(t)/Y(t)

and

\hat{\lambda }^L(t)=dN^L(t)/(n+1-Y(t))

. Here,

\hat{Q}_k(t,\bm{\beta },\psi,H_i)

denotes a working model for

Q_k(t,\bm{\beta },H_i)

for

k=1,2

.

The advantages of the AIPCW estimator are generally twofold. First, the estimator is consistent (see the proof of Theorem 3 in Section B of the Supporting Information) when either the censoring distribution does not depend on the covariates or the posited model for $(T,{\bf x})$ is correct. For this reason, it is often referred to as a doubly robust (DR) estimator. Second, when these conditions for double robustness are met, the AIPCW estimator $\hat{\bm{\beta }}^*(\tau)$ can have a smaller asymptotic variance than $\hat{\bm{\beta }}(\tau)$ . To solve the augmented estimating equation (10), we may use the dfsane() function in the R package BB (Varadhan and Gilbert 2010), a derivative-free spectral solver for nonlinear systems of equations. Bootstrapping would be a natural way to estimate the standard errors precisely, but it is computationally expensive for the AIPCW estimator. As before, we therefore employ the induced smoothing method for statistical inference.

4 Extension to Multivariate DC Data

We further extend our CQR method to multivariate clustered DC data. Suppose that there are $n$ clusters, with the $i$ th cluster having $c_i$ members, and that the $k$ th subject of the $i$ th cluster $(k=1,2,\ldots,c_i$ , $i=1,2,\ldots,n)$ may experience an event of interest that is subject to double censoring. It is assumed that $c_i$ is relatively small compared to $n$ . For the $k$ th member of the $i$ th cluster, let $(T_{ik},L_{ik},R_{ik})$ be the failure, left-censoring, and right-censoring times, respectively, and let ${\bf x}_{ik}$ be the corresponding $p$ -vector of covariates. As before, it is assumed that the visit process that generates $(L_{ik},R_{ik})$ is independent of $T_{ik}$ and ${\bf x}_{ik}$ . The observed data consist of $\lbrace (\tilde{T}_{ik},\delta _{ik},{\bf x}_{ik}),k=1,\ldots,c_i;i=1,\ldots,n\rbrace$ , where $\tilde{T}_{ik}=(T_{ik}\wedge R_{ik})\vee L_{ik}$ and $\delta _{ik}=(\delta _{1ik},\delta _{2ik},\delta _{3ik})$ is the censoring indicator with $\delta _{1ik}=I(L_{ik}\le T_{ik}\le R_{ik})$ , $\delta _{2ik}=I(T_{ik}> R_{ik})$ , and $\delta _{3ik}=1-\delta _{1ik}-\delta _{2ik}$ .

Suppose that the marginal regression model satisfies

\begin{equation} T_{ik}={\bf x}_{ik}^T\bm{\beta }(\tau)+e_{ik}(\tau),\nobreakspace \nobreakspace k=1,\ldots,c_i, i=1,\ldots,n, \end{equation}

(11)

where

\bm{\beta }(\tau)

is a

p

-vector of unknown regression parameters common to all

n

clusters. Under the working independence assumption, we may obtain the estimator

\hat{\bm{\beta }}(\tau)

for

\bm{\beta }(\tau)

by setting the following weighted estimating function to zero:

\begin{align} {\bf U}_{n}^\dagger (\bm{\beta },\tau) = n^{-1} \sum _{i=1}^n \eta _i \sum _{k=1}^{c_i} {\bf x}_{ik} {\left\lbrace \hat{w}_{ik} I(\tilde{T}_{ik} -{\bf x}_{ik}^T\bm{\beta }\le 0)-\tau \right\rbrace}, \end{align}

(12)

where

\hat{w}_{ik}= \delta _{1ik}/ \lbrace \tilde{S}_{R}(\tilde{T}_{ik})-\tilde{S}_{L}(\tilde{T}_{ik})\rbrace

and

\eta _i

is a known weight to calibrate for the possible informativeness of cluster sizes (Cong, Yin, and Shen 2007; Wang and Zhao 2008).
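
To make the role of the cluster weight concrete, the estimating function in (12) can be evaluated directly. The sketch below is illustrative only: the data layout (a list of clusters, each a list of `(t, x, w)` triples) and the function name are assumptions of this example, the IPCW weights `w` are taken as given, and $\eta _i=1/c_i^\alpha$ is used as the cluster weight.

```python
def cluster_estimating_function(beta, tau, clusters, alpha=1.0):
    """Evaluate the weighted estimating function (12).

    clusters: list of clusters; each cluster is a list of (t, x, w) with
    observed time t, covariate list x, and IPCW weight w (0 if censored).
    """
    n, p = len(clusters), len(beta)
    U = [0.0] * p
    for members in clusters:
        eta = len(members) ** (-alpha)          # eta_i = 1 / c_i^alpha
        for t, x, w in members:
            fit = sum(xj * bj for xj, bj in zip(x, beta))
            resid = w * (1.0 if t <= fit else 0.0) - tau
            for j in range(p):
                U[j] += eta * x[j] * resid / n
    return U
```

Setting `alpha=0.0` recovers the unadjusted choice $\eta _i=1$ , in which every member contributes equally regardless of cluster size, while `alpha=1.0` gives every cluster the same total weight.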

For the marginal analysis of clustered survival data, we conventionally use $\eta _i=1$ , which tends to overweight the large clusters because each individual observation contributes equally to the estimating equation. When cluster sizes are informative about the outcome of interest, we can incorporate the inverse of cluster sizes as a weight in the estimating function, letting, for example, $\eta _i=1/c_i^\alpha$ for some $0\le \alpha \le 1$ , which is also known to improve the efficiency of the resulting estimator (Wang and Zhao 2008). Because finite cluster sizes preclude consistent cluster-specific estimation of the censoring distributions, we assume common censoring distributions that are independent of covariates, pool the data across clusters, and use the KM method to estimate $S_R(t)=P(R_{ik}\ge t)$ and $S_L(t)=P(L_{ik}\ge t)$ . We estimate $S_R(t)$ with $\tilde{S}_R(t)=\prod _{u<t}\lbrace 1-d\tilde{N}^R(u)/\tilde{Y}(u)\rbrace$ , where $\tilde{N}^R(u)=\sum _{i=1}^n\eta _i \sum _{k=1}^{c_i}\tilde{N}_{ik}^R(u)=\sum _{i=1}^n \sum _{k=1}^{c_i} \eta _iI(\tilde{T}_{ik}\le u,\delta _{2ik}=1)$ and $\tilde{Y}(u)=\sum _{i=1}^n \eta _i \sum _{k=1}^{c_i}\tilde{Y}_{ik}(u) =\sum _{i=1}^n \sum _{k=1}^{c_i} \eta _i I(\tilde{T}_{ik}\ge u)$ , and $S_L(t)$ with $\tilde{S}_L(t)=1-\prod _{u>t}\lbrace 1-d\tilde{N}^L(u)/(N+1-\tilde{Y}(u))\rbrace$ , where $\tilde{N}^L(u)=\sum _{i=1}^n\eta _i \sum _{k=1}^{c_i}\tilde{N}_{ik}^L(u)=\sum _{i=1}^n \sum _{k=1}^{c_i} \eta _iI(\tilde{T}_{ik}\ge u,\delta _{3ik}=1)$ and $N=\sum _{i=1}^n c_i$ .
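
The product-limit formulas above, and the resulting weights $\hat{w}_{ik}=\delta _{1ik}/\lbrace \tilde{S}_R(\tilde{T}_{ik})-\tilde{S}_L(\tilde{T}_{ik})\rbrace$ , can be illustrated on a toy pooled sample. The sketch assumes $\eta _i=1$ and no tied observation times, which are simplifications made for this example; the function names are hypothetical.

```python
def censoring_survival(times, d_right, d_left):
    """Product-limit estimates of S_R and S_L with eta_i = 1 and no ties."""
    n = len(times)

    def at_risk(u):                      # Y(u) = number of observed times >= u
        return sum(1 for t in times if t >= u)

    def S_R(t):                          # product over right-censoring times u < t
        prod = 1.0
        for u, d in zip(times, d_right):
            if d == 1 and u < t:
                prod *= 1.0 - 1.0 / at_risk(u)
        return prod

    def S_L(t):                          # one minus product over left-censoring times u > t
        prod = 1.0
        for u, d in zip(times, d_left):
            if d == 1 and u > t:
                prod *= 1.0 - 1.0 / (n + 1 - at_risk(u))
        return 1.0 - prod

    return S_R, S_L

def ipcw_weights(times, d_exact, d_right, d_left):
    """Weights delta_1 / {S_R(T) - S_L(T)}; censored subjects receive weight 0."""
    S_R, S_L = censoring_survival(times, d_right, d_left)
    return [1.0 / (S_R(t) - S_L(t)) if d == 1 else 0.0
            for t, d in zip(times, d_exact)]
```

With observed times 1, 2, 3, 4, 5, where the second is left-censored and the fourth is right-censored, `ipcw_weights` returns 2, 0, 1, 0, 2: the exact observation below the left-censored time and the one above the right-censored time are each up-weighted to stand in for a censored subject.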

For variance estimation, we again use the induced smoothing approach. Let

(\tilde{\bm{\beta }}^\dagger,\tilde{\Sigma }^\dagger)

be the solution to

\begin{align*} \tilde{{\bf U}}_{n}^\dagger (\bm{\beta },\Sigma,\tau) = n^{-1} \sum _{i=1}^n \eta _i \sum _{k=1}^{c_i} {\bf x}_{ik} {\left\lbrace \hat{w}_{ik}\Phi {\left(-\dfrac{\tilde{T}_{ik} - {\bf x}_{ik}^T\bm{\beta }}{\sqrt {{\bf x}_{ik}^T\Sigma {\bf x}_{ik}}} \right)} - \tau \right\rbrace} \end{align*}

at convergence. Then, the variance–covariance matrix of the limiting normal distribution of

n^{1/2}(\tilde{\bm{\beta }}^\dagger -\bm{\beta }_0)

can be approximated by

(\hat{A}_{n}^\dagger)^{-1}\hat{\Gamma }_{n}^\dagger (\hat{A}_{n}^\dagger)^{-1}

evaluated at

(\tilde{\bm{\beta }}^\dagger,\tilde{\Sigma }^\dagger)

, where

\begin{eqnarray*} \hat{A}_{n}^\dagger (\bm{\beta },\Sigma,\tau) & =& \dfrac{\partial \tilde{{\bf U}}_{n}^\dagger (\bm{\beta },\Sigma,\tau)}{\partial \bm{\beta }} \approx n^{-1} \sum _{i=1}^n \eta _i \sum _{k=1}^{c_i} \phi {\left(-\dfrac{\tilde{T}_{ik} - {\bf x}_{ik}^T\bm{\beta }}{\sqrt {{\bf x}_{ik}^T\Sigma {\bf x}_{ik}}} \right)} \\ && \times\ \dfrac{{\bf x}_{ik}{\bf x}_{ik}^T}{\sqrt {{\bf x}_{ik}^T\Sigma {\bf x}_{ik}}} \end{eqnarray*}

and

\begin{align*} \begin{split} \hat{\Gamma }_n^\dagger (\bm{\beta },\tau) =&\, n^{-1} \sum _{i=1}^n \eta _i \sum _{k=1}^{c_i} {\bf x}_{ik}^{\otimes 2} {\left[ \hat{w}_{ik}I(\tilde{T}_{ik}\le {\bf x}_{ik}^T\bm{\beta }) -\tau \right]}^2\\ &-n^{-1}\int _{-\infty }^\infty \frac{d\tilde{N}^R(u)}{\tilde{Y}^2(u)} \lbrace \tilde{B}^{R}(\bm{\beta },u)\rbrace ^{\otimes 2}\\ & +\, n^{-1}\int _{-\infty }^\infty \frac{d\tilde{N}^L(u)}{(N+1-\tilde{Y}(u))^2} \lbrace \tilde{B}^{L}(\bm{\beta },u)\rbrace ^{\otimes 2} \end{split} \end{align*}

with

\tilde{B}^{R}(\bm{\beta },u) = \sum _{i=1}^n \sum _{k=1}^{c_i} \eta _i \hat{w}_{ik} Y_{ik}(u) {\bf x}_{ik} I(\tilde{T}_{ik}\le {\bf x}_{ik}^T\bm{\beta })

and

\tilde{B}^{L}(\bm{\beta },u) = \sum _{i=1}^n\sum _{k=1}^{c_i} \eta _i \hat{w}_{ik} \lbrace 1-Y_{ik}(u)\rbrace {\bf x}_{ik} I(\tilde{T}_{ik}\le {\bf x}_{ik}^T\bm{\beta })

. The iterative procedure, described in Section 2.4, can also be used to approximate the variance of

\hat{\bm{\beta }}^\dagger

.

5 Simulation Results

5.1 Univariate Partially Interval-Censored Data

This section presents extensive simulation results under various partial interval-censoring scenarios to evaluate the finite-sample properties of the proposed IPCW and AIPCW estimators. All simulations here involve two covariates,

{\bf x}= (x_1,x_2)

, where

x_1 \sim U(-0.7, 1.5)

and

x_2 \sim \text{Bernoulli}(0.5)

. The data-generating model is

\begin{equation*} T = 10 + \beta _1(\tau) x_1 + \beta _2(\tau)x_2 + \sigma ({\bf x})(e(\tau)-q(\tau)), \end{equation*}

where

\bm{\beta }_0(\tau)=(\beta _{10}(\tau),\beta _{20}(\tau))^T=(1,1)^T

, and

\sigma ({\bf x})=0.8-0.1x_2

. The error term

e(\tau)

follows a standard normal, N(0,1), or extreme-value, EV(0,1), distribution and is centered by its quantile at level

\tau =0.3

or 0.5, satisfying

P(e(\tau) < q(\tau))=\tau

. To create DC data, the left- and right-censoring variables are generated as

L \sim 10+U(-4.2, c_L)

and

R \sim L + U(4.1, c_R)

, respectively, where two constants

(c_L, c_R)

are varied to yield the desired rates of exact, left-censored and right-censored observations approximately as

(75\%, 12.5\%, 12.5\%)

and

(65\%, 17.5\%, 17.5\%)

. To generate PIC data, the censoring time

C

is first simulated from

e^{C}\sim \text{Uniform}(30,50)

. For each subject, a sequence of

K

examination times

(W_{1},\ldots,W_{K})

is generated as

e^{W_{k}} = e^{W_{k-1}}+\text{Exp}(1)

, where

K>0

is the largest integer that satisfies

-\infty \equiv W_0<W_1<\cdots <W_K\le C

. The interval

(U,V)

that contains

T

is defined as

U=\max _k\lbrace W_{k}: W_{k} \le T\rbrace

and

V=\min _k\lbrace W_{k}: W_{k} \ge T\rbrace

. To mix exact and interval-censored data, we generate

\Delta \in \lbrace 0,1\rbrace

from

P(\Delta =1|{\bf x}_i)=p_0-0.1I(x_{1}<0.8)

, where

p_0\in (0.1,1)

is set to yield approximately 65% or 75% exact observations of the failure time, as before. Because $T$ represents a log survival time, its predicted value may be negative while the implied survival time remains strictly positive. Tables 1–4 show that the observed biases are predominantly negative, implying that our procedure tends to slightly underestimate the regression parameters. However, this tendency is not severe and becomes negligible as the sample size increases. In fact, this pattern is quite common with IPCW-based methods under nonparametric estimation of right-censored data.
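
The DC data-generating mechanism described above can be sketched as follows for N(0,1) errors. The values passed for `c_left` and `c_right` are arbitrary placeholders, not the constants used in the paper, which are tuned to reach the target censoring rates; the function name is hypothetical, and the censoring indicators follow the definition of $(\delta _1,\delta _2,\delta _3)$ given in Section 4.

```python
import random
from statistics import NormalDist

def simulate_dc(n, tau, c_left, c_right, seed=0):
    """Generate doubly censored data (observed time, indicators, covariates)."""
    rng = random.Random(seed)
    q = NormalDist().inv_cdf(tau)           # q(tau) with P(e < q) = tau for N(0,1) errors
    rows = []
    for _ in range(n):
        x1 = rng.uniform(-0.7, 1.5)
        x2 = 1.0 if rng.random() < 0.5 else 0.0
        sigma = 0.8 - 0.1 * x2
        t = 10.0 + x1 + x2 + sigma * (rng.gauss(0.0, 1.0) - q)   # beta_0 = (1, 1)
        left = 10.0 + rng.uniform(-4.2, c_left)
        right = left + rng.uniform(4.1, c_right)
        t_obs = max(min(t, right), left)     # (T ^ R) v L
        d1 = 1 if left <= t <= right else 0  # exact
        d2 = 1 if t > right else 0           # right-censored
        d3 = 1 - d1 - d2                     # left-censored
        rows.append((t_obs, (d1, d2, d3), (x1, x2)))
    return rows
```

Calling, for instance, `simulate_dc(200, 0.3, 0.5, 6.0)` and tabulating the indicators shows how the exact, right-censored, and left-censored proportions respond to the two constants, which is how they would be calibrated in practice.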

TABLE 1. Simulation results summarizing the finite-sample properties of the proposed IPCW QR estimator

\hat{\bm{\beta }}(\tau)

at

\tau =0.3

and 0.5, under univariate DC and PIC data, with errors distributed as N(0,1) or EV(0,1), where the proportion of exact failure times observed is 65%, or 75%. Here, Par = parameters, Bias = empirical bias, SSE = sampling standard error, ASE = average of standard errors, and CP = 95% coverage probability.

					$n=200$				$n = 400$
Data	Error	Exact (%)	$\tau$	Par	Bias	SSE	ASE	CP	Bias	SSE	ASE	CP
DC	N(0,1)	75%	0.3	$\beta _1$	−0.003	0.132	0.142	0.956	−0.009	0.093	0.099	0.947
			0.3	$\beta _2$	−0.001	0.173	0.181	0.953	−0.002	0.115	0.127	0.959
			0.5	$\beta _1$	−0.003	0.117	0.145	0.967	−0.006	0.085	0.101	0.965
			0.5	$\beta _2$	−0.001	0.159	0.184	0.967	0.000	0.107	0.129	0.979
		65%	0.3	$\beta _1$	−0.005	0.140	0.153	0.959	−0.013	0.101	0.108	0.948
			0.3	$\beta _2$	−0.003	0.188	0.194	0.957	−0.008	0.121	0.137	0.959
			0.5	$\beta _1$	−0.004	0.123	0.157	0.978	−0.008	0.088	0.109	0.974
			0.5	$\beta _2$	0.000	0.163	0.198	0.974	−0.004	0.112	0.138	0.985
	EV(0,1)	75%	0.3	$\beta _1$	−0.003	0.132	0.137	0.938	−0.002	0.087	0.098	0.956
			0.3	$\beta _2$	−0.001	0.162	0.175	0.951	−0.001	0.116	0.125	0.953
			0.5	$\beta _1$	−0.006	0.143	0.165	0.965	−0.001	0.097	0.117	0.972
			0.5	$\beta _2$	0.004	0.173	0.210	0.977	0.000	0.128	0.149	0.967
		65%	0.3	$\beta _1$	−0.010	0.140	0.149	0.946	−0.008	0.093	0.106	0.964
			0.3	$\beta _2$	−0.006	0.171	0.189	0.957	−0.007	0.127	0.135	0.955
			0.5	$\beta _1$	−0.011	0.155	0.182	0.972	−0.005	0.106	0.130	0.972
			0.5	$\beta _2$	−0.002	0.183	0.232	0.976	−0.002	0.138	0.164	0.961
PIC	N(0,1)	75%	0.3	$\beta _1$	0.027	0.121	0.134	0.958	0.023	0.090	0.093	0.938
			0.3	$\beta _2$	−0.040	0.158	0.166	0.945	−0.036	0.113	0.117	0.940
			0.5	$\beta _1$	0.057	0.127	0.142	0.948	0.050	0.092	0.099	0.936
			0.5	$\beta _2$	−0.047	0.168	0.176	0.946	−0.042	0.117	0.123	0.944
		65%	0.3	$\beta _1$	0.027	0.133	0.143	0.948	0.020	0.092	0.100	0.935
			0.3	$\beta _2$	−0.049	0.163	0.177	0.953	−0.046	0.119	0.125	0.946
			0.5	$\beta _1$	0.060	0.146	0.157	0.943	0.052	0.101	0.107	0.933
			0.5	$\beta _2$	−0.054	0.178	0.192	0.945	−0.050	0.128	0.132	0.937
	EV(0,1)	75%	0.3	$\beta _1$	0.024	0.123	0.131	0.948	0.027	0.086	0.092	0.941
			0.3	$\beta _2$	−0.039	0.152	0.163	0.954	−0.041	0.107	0.115	0.940
			0.5	$\beta _1$	0.059	0.159	0.167	0.948	0.056	0.106	0.115	0.939
			0.5	$\beta _2$	−0.055	0.190	0.208	0.961	−0.057	0.140	0.144	0.934
		65%	0.3	$\beta _1$	0.026	0.137	0.147	0.953	0.029	0.099	0.104	0.943
			0.3	$\beta _2$	−0.046	0.169	0.183	0.948	−0.055	0.121	0.129	0.930
			0.5	$\beta _1$	0.068	0.180	0.198	0.958	0.068	0.125	0.133	0.931
			0.5	$\beta _2$	−0.067	0.226	0.244	0.957	−0.082	0.157	0.165	0.943

TABLE 2. Simulation results comparing the finite-sample properties of the IPCW estimator (

\hat{\bm{\beta }}

) to the augmented IPCW estimator (

\hat{\bm{\beta }}^*

) at

\tau =0.3

and 0.5, under univariate DC data with errors distributed as N(0,1) or EV(0,1), where the proportion of exact failure times is 65%, or 75%. Here, Par = parameters, Bias = empirical bias, SSE = sampling standard error, MSE = mean-squared error, and RE = relative efficiency.

				IPCW			AIPCW
Error	Exact (%)	$\tau$	Par	Bias	SSE	MSE	Bias	SSE	MSE	RE
N(0,1)	75%	0.3	$\beta _1$	−0.003	0.132	0.017	−0.003	0.130	0.017	0.970
		0.3	$\beta _2$	−0.001	0.173	0.030	0.000	0.171	0.029	0.977
		0.5	$\beta _1$	−0.003	0.117	0.014	−0.003	0.117	0.014	1.000
		0.5	$\beta _2$	−0.001	0.159	0.025	−0.001	0.157	0.025	0.975
	65%	0.3	$\beta _1$	−0.005	0.140	0.020	−0.005	0.139	0.019	0.986
		0.3	$\beta _2$	−0.003	0.188	0.035	−0.002	0.185	0.034	0.968
		0.5	$\beta _1$	−0.004	0.123	0.015	−0.005	0.122	0.015	0.984
		0.5	$\beta _2$	0.000	0.163	0.027	0.000	0.162	0.026	0.988
EV(0,1)	75%	0.3	$\beta _1$	−0.003	0.132	0.017	−0.002	0.132	0.017	1.000
		0.3	$\beta _2$	−0.001	0.162	0.026	−0.002	0.160	0.026	0.976
		0.5	$\beta _1$	−0.006	0.143	0.020	−0.005	0.141	0.020	0.972
		0.5	$\beta _2$	0.004	0.173	0.030	0.002	0.173	0.030	1.000
	65%	0.3	$\beta _1$	−0.010	0.140	0.020	−0.010	0.139	0.019	0.986
		0.3	$\beta _2$	−0.006	0.171	0.029	−0.008	0.170	0.029	0.989
		0.5	$\beta _1$	−0.011	0.155	0.024	−0.011	0.153	0.024	0.974
		0.5	$\beta _2$	−0.002	0.183	0.033	−0.002	0.184	0.034	1.011

TABLE 3. Simulation results comparing the finite-sample properties of the proposed IPCW QR estimators for univariate DC data at

\tau =0.3

and 0.5, where Beran's (1981) local Kaplan–Meier (“IPCW-KM”) and survival random forests (“IPCW-RF”) methods are used to approximate the left- and right-censoring distributions given covariates. Here, RE is the ratio of the MSE of IPCW-RF to that of IPCW-KM.

					IPCW-KM			IPCW-RF
$n$	Error	Exact (%)	$\tau$	Par	Bias	SSE	MSE	Bias	SSE	MSE	RE
200	N(0,1)	75%	0.3	$\beta _1$	0.004	0.131	0.017	−0.045	0.142	0.022	1.292
			0.3	$\beta _2$	−0.060	0.170	0.033	−0.030	0.181	0.034	1.036
			0.5	$\beta _1$	−0.004	0.120	0.014	−0.049	0.124	0.018	1.233
			0.5	$\beta _2$	−0.076	0.160	0.031	−0.031	0.164	0.028	0.888
		65%	0.3	$\beta _1$	0.014	0.142	0.020	−0.039	0.154	0.025	1.240
			0.3	$\beta _2$	−0.067	0.185	0.039	−0.027	0.195	0.039	1.001
			0.5	$\beta _1$	0.002	0.126	0.016	−0.060	0.131	0.021	1.307
			0.5	$\beta _2$	−0.073	0.170	0.034	−0.038	0.172	0.031	0.906
	EV(0,1)	75%	0.3	$\beta _1$	0.000	0.126	0.016	−0.035	0.131	0.018	1.158
			0.3	$\beta _2$	−0.056	0.154	0.027	−0.026	0.156	0.025	0.931
			0.5	$\beta _1$	−0.004	0.141	0.020	−0.057	0.144	0.024	1.205
			0.5	$\beta _2$	−0.082	0.172	0.036	−0.039	0.167	0.029	0.810
		65%	0.3	$\beta _1$	0.005	0.138	0.019	−0.045	0.143	0.022	1.179
			0.3	$\beta _2$	−0.073	0.167	0.033	−0.034	0.169	0.030	0.895
			0.5	$\beta _1$	−0.001	0.151	0.023	−0.064	0.156	0.028	1.247
			0.5	$\beta _2$	−0.082	0.182	0.040	−0.047	0.179	0.034	0.860
400	N(0,1)	75%	0.3	$\beta _1$	0.007	0.093	0.009	−0.093	0.112	0.021	2.437
			0.3	$\beta _2$	−0.053	0.113	0.016	−0.055	0.124	0.018	1.181
			0.5	$\beta _1$	0.002	0.086	0.007	−0.082	0.096	0.016	2.154
			0.5	$\beta _2$	−0.069	0.112	0.017	−0.034	0.116	0.015	0.844
		65%	0.3	$\beta _1$	0.014	0.102	0.011	−0.083	0.115	0.020	1.898
			0.3	$\beta _2$	−0.064	0.120	0.018	−0.049	0.131	0.020	1.058
			0.5	$\beta _1$	0.009	0.091	0.008	−0.110	0.097	0.022	2.572
			0.5	$\beta _2$	−0.067	0.117	0.018	−0.071	0.118	0.019	1.043
	EV(0,1)	75%	0.3	$\beta _1$	0.008	0.084	0.007	−0.062	0.095	0.013	1.807
			0.3	$\beta _2$	−0.052	0.112	0.015	−0.028	0.118	0.015	0.965
			0.5	$\beta _1$	0.007	0.098	0.010	−0.082	0.094	0.016	1.612
			0.5	$\beta _2$	−0.074	0.130	0.022	−0.039	0.123	0.017	0.744
		65%	0.3	$\beta _1$	0.013	0.089	0.008	−0.079	0.099	0.016	1.983
			0.3	$\beta _2$	−0.066	0.121	0.019	−0.046	0.122	0.017	0.895
			0.5	$\beta _1$	0.014	0.101	0.010	−0.099	0.100	0.020	1.904
			0.5	$\beta _2$	−0.075	0.140	0.025	−0.067	0.130	0.021	0.848

TABLE 4. Simulation results comparing the finite-sample properties of the IPCW estimator, corresponding to the unadjusted (weight

\eta _i = 1

), and adjusted (weight

\eta _i=1/c_i

) methods for multivariate DC data at

\tau =0.3

and 0.5, where the numbers of clusters are 50 and 100.

					Unadjusted ( $\eta _i=1$ )				Adjusted ( $\eta _i=1/c_i$ )
Cluster	Error	Exact (%)	$\tau$	Par	Bias	SSE	ASE	CP	Bias	SSE	ASE	CP
$n = 50$	N(0,1)	75%	0.3	$\beta _1$	−0.008	0.221	0.241	0.955	−0.001	0.237	0.239	0.936
			0.3	$\beta _2$	−0.034	0.271	0.301	0.962	−0.025	0.294	0.297	0.938
			0.5	$\beta _1$	−0.011	0.261	0.282	0.960	−0.002	0.287	0.279	0.939
			0.5	$\beta _2$	−0.040	0.324	0.350	0.973	−0.031	0.343	0.345	0.957
		65%	0.3	$\beta _1$	−0.014	0.246	0.261	0.945	−0.006	0.262	0.260	0.940
			0.3	$\beta _2$	−0.036	0.297	0.326	0.965	−0.028	0.319	0.323	0.941
			0.5	$\beta _1$	−0.012	0.300	0.319	0.961	0.000	0.324	0.317	0.947
			0.5	$\beta _2$	−0.046	0.368	0.396	0.967	−0.034	0.392	0.392	0.955
	EV(0,1)	75%	0.3	$\beta _1$	−0.023	0.232	0.255	0.955	−0.016	0.246	0.252	0.945
			0.3	$\beta _2$	−0.011	0.309	0.321	0.948	−0.004	0.325	0.317	0.934
			0.5	$\beta _1$	−0.033	0.299	0.323	0.950	−0.024	0.320	0.319	0.938
			0.5	$\beta _2$	−0.013	0.380	0.406	0.965	−0.003	0.399	0.397	0.946
		65%	0.3	$\beta _1$	−0.024	0.252	0.273	0.953	−0.018	0.267	0.270	0.937
			0.3	$\beta _2$	−0.014	0.335	0.344	0.946	−0.003	0.356	0.339	0.932
			0.5	$\beta _1$	−0.032	0.341	0.374	0.943	−0.021	0.361	0.371	0.942
			0.5	$\beta _2$	−0.019	0.438	0.469	0.968	−0.009	0.462	0.461	0.957
$n = 100$	N(0,1)	75%	0.3	$\beta _1$	−0.015	0.149	0.168	0.956	−0.008	0.162	0.167	0.943
			0.3	$\beta _2$	−0.020	0.200	0.211	0.954	−0.016	0.210	0.210	0.940
			0.5	$\beta _1$	−0.015	0.173	0.194	0.977	−0.005	0.185	0.193	0.952
			0.5	$\beta _2$	−0.026	0.227	0.244	0.965	−0.021	0.241	0.242	0.945
		65%	0.3	$\beta _1$	−0.021	0.163	0.182	0.964	−0.014	0.179	0.181	0.946
			0.3	$\beta _2$	−0.025	0.221	0.229	0.956	−0.017	0.233	0.227	0.937
			0.5	$\beta _1$	−0.017	0.197	0.218	0.969	−0.010	0.212	0.218	0.949
			0.5	$\beta _2$	−0.029	0.264	0.275	0.965	−0.020	0.275	0.273	0.956
	EV(0,1)	75%	0.3	$\beta _1$	−0.019	0.164	0.178	0.959	−0.016	0.179	0.178	0.946
			0.3	$\beta _2$	−0.014	0.216	0.226	0.958	−0.010	0.229	0.224	0.941
			0.5	$\beta _1$	−0.019	0.206	0.226	0.961	−0.016	0.221	0.225	0.941
			0.5	$\beta _2$	−0.022	0.267	0.283	0.954	−0.016	0.280	0.280	0.941
		65%	0.3	$\beta _1$	−0.023	0.174	0.191	0.957	−0.018	0.191	0.190	0.936
			0.3	$\beta _2$	−0.018	0.230	0.242	0.961	−0.015	0.246	0.239	0.930
			0.5	$\beta _1$	−0.021	0.238	0.259	0.961	−0.015	0.257	0.257	0.947
			0.5	$\beta _2$	−0.019	0.315	0.326	0.954	−0.012	0.333	0.323	0.931

The simulation results for DC and PIC data are summarized in Table 1, which reports the empirical bias (Bias), sampling standard error (SSE), average of the standard error estimates (ASE), and coverage probability (CP) of the $95\%$ confidence intervals for $\hat{\bm{\beta }}$ , based on 1000 random data sets with sample sizes $n = 200$ and 400. Overall, the proposed estimator is virtually unbiased, and the standard error estimates from induced smoothing are close to their empirical counterparts. The empirical CPs agree well with the nominal level based on the normal approximation. The estimated standard errors are slightly larger than the sampling standard errors, but the gaps appear to decrease as the sample size increases. Next, the performance of the IPCW and AIPCW estimators is compared for DC data. In addition to Bias and SSE, Table 2 presents the mean-squared error (MSE) of IPCW ( $\hat{\bm{\beta }}$ ) and AIPCW ( $\hat{\bm{\beta }}^*$ ), along with their relative efficiency (RE), defined as $\text{MSE}(\hat{\bm{\beta }}^*)/\text{MSE}(\hat{\bm{\beta }})$ , so that values below 1 favor AIPCW. We observe that $\hat{\bm{\beta }}^*$ is virtually unbiased and slightly more efficient than $\hat{\bm{\beta }}$ , with RE between 0.968 and 1.011. The efficiency gain in this setting is modest and could potentially increase with the availability of further time-dependent or longitudinal information (Gorfine, Goldberg, and Ritov 2017).
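
As a quick arithmetic check on how the summary columns relate, the sketch below recomputes MSE and RE for the first row of Table 2 (N(0,1) errors, 75% exact, $\tau =0.3$ , $\beta _1$ ). It assumes the usual decomposition MSE = Bias$^2$ + SSE$^2$ and works from the rounded table entries, so agreement to the third decimal is not guaranteed for every row.

```python
def mse(bias, sse):
    """Mean-squared error from empirical bias and sampling standard error."""
    return bias ** 2 + sse ** 2

def relative_efficiency(bias_aug, sse_aug, bias_ipcw, sse_ipcw):
    """RE = MSE(AIPCW) / MSE(IPCW); values below 1 favor AIPCW."""
    return mse(bias_aug, sse_aug) / mse(bias_ipcw, sse_ipcw)

# First row of Table 2: IPCW (Bias -0.003, SSE 0.132), AIPCW (Bias -0.003, SSE 0.130)
re = relative_efficiency(-0.003, 0.130, -0.003, 0.132)
print(round(mse(-0.003, 0.132), 3), round(mse(-0.003, 0.130), 3), round(re, 3))
```

For this row both MSEs round to 0.017 and the ratio rounds to 0.970, matching the tabulated values. Because the biases are tiny, the ratio is driven almost entirely by the squared SSEs.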

Table 3 reports additional simulation results under univariate DC data when the censoring distributions also involve covariates. We let $L \sim 10 + U(-4.2, c_L)$ and $R \sim L + (1-0.2x_1-0.2x_2) \times U(4.1, c_R)$ , while the other simulation configurations remain the same as before. To account for this covariate-dependent censoring, we apply the local KM (Beran 1981) and survival random forests (Ishwaran et al. 2008) methods, as mentioned in Remark 2, to approximate $S_L(\cdot |{\bf x})$ and $S_R(\cdot |{\bf x})$ . The corresponding estimators are referred to as IPCW-KM and IPCW-RF, respectively. Overall, both estimators show small biases and appear robust to the effect of covariates on the censoring distributions. Table 3 also presents RE, defined as the ratio of the MSE of IPCW-RF to that of IPCW-KM. In the present setting, the IPCW-KM estimator is more efficient than the IPCW-RF estimator for $\beta _1$ (RE between 1.158 and 2.572), whereas the two are comparable for $\beta _2$ (RE between 0.744 and 1.181). However, if the censoring distributions involve many covariates such that nonparametric kernel smoothing is not feasible, the random forests method would be a more viable and reliable alternative.

5.2 Multivariate Partially Interval-Censored Data

Next, we present the simulation results under clustered multivariate DC data. We set the number of clusters as either $n=50$ or $n=100$ . The cluster size $c_i$ is determined by $c_i=(d/10)+3$ , if $d=0,10,\ldots,90$ satisfies $l_d \le v_{i} < l_{d+10}$ for $v_i\sim N(0,1)$ , otherwise we let $c_i=5$ , where $l_d$ represents the $d$ th percentile of $v_{i}$ . In this setup, the cluster size $c_i$ ranges from 3 to 11 members and the total number of members is about 200 when $n = 50$ , and 400 when $n = 100$ . The data-generating model is given by $T_{ik} = 10 + \beta _1(\tau) x_{1ik} + \beta _2(\tau)x_{2ik} + v_i + e_{ik}(\tau)-q_{ik}(\tau)$ for subject $k=1,2,\ldots,c_i$ in cluster $i=1,2,\ldots,n$ , where $(\beta _{10},\beta _{20})^T=(1,1)^T$ , and $e_{ik}(\tau)$ follows the N(0,1) or EV(0,1) distribution, satisfying $P(e_{ik}(\tau) < q_{ik}(\tau))=\tau$ . As in the first simulation, we let $L \sim 10 + U(-4.2, c_L)$ and $R \sim L + U(4.1, c_R)$ . We consider $\eta _i=1$ (unadjusted) and $\eta _i=1/c_i$ (adjusted); the latter approach may calibrate for possible informativeness of cluster sizes about the event time. Table 4 shows that the cluster size adjustment with $\eta _i=1/c_i$ could lead to slightly lower biases but somewhat larger empirical standard errors. When the cluster size is adjusted, and censoring rates are higher, the estimated standard errors are closer to the empirical standard errors, resulting in more stabilized CPs. When cluster sizes are highly informative about the time to event, letting $\eta _i=1/c_i^\alpha$ for some $0<\alpha \le 1$ would be beneficial to achieve a more efficient and robust estimation (Wang and Zhao 2008).

6 Application: mCRC Data

In this section, we apply the proposed method to a data set from a multicenter, randomized, phase III mCRC clinical trial (Peeters et al. 2010). This study aimed to investigate the efficacy and safety of second-line panitumumab plus FOLFIRI versus FOLFIRI alone, with respect to patients' survival after the failure of initial treatment for mCRC. Panitumumab is a fully human anti-epidermal growth factor receptor monoclonal antibody that improves PFS in chemotherapy-refractory mCRC. It was often prescribed with FOLFIRI because it does not provide clinical benefit on its own. From June 2006 to March 2008, 1186 patients who failed first-line treatment of mCRC were randomly assigned (1:1) to panitumumab 6.0 mg/kg plus FOLFIRI or FOLFIRI alone every 2 weeks. The coprimary end points of PFS and overall survival (OS) were independently tested and prospectively analyzed by KRAS status.

Our analysis focused on PFS for the 855 patients for whom treatment and KRAS status were available: 428 (50.0%) and 427 (50.0%) patients received FOLFIRI (coded as 0) and panitumumab + FOLFIRI (coded as 1), respectively, while 474 (55.4%) had wild-type (WT) KRAS tumors (coded as 1) and 381 (44.6%) had mutant (MT) KRAS tumors (coded as 0). Eligible patients, aged 18 or older and diagnosed with adenocarcinoma of the colon or rectum, with an Eastern Cooperative Oncology Group (ECOG) performance status of 0, 1, or 2, were included. They had received only one prior chemotherapy regimen for mCRC, with radiographically confirmed disease progression occurring during or within 6 months of the prior first-line chemotherapy. Patients meeting these criteria underwent central analysis of EGFR and biomarkers with approval from an independent ethics committee before any study-related procedures were initiated. Patients in this study were followed for safety for $\ge$ 30 days after the last study drug administration and for survival every 3 months. Because of this follow-up schedule, the disease progression-free period in each patient was subject to various types of interval-censoring: 168 (19.6%), 329 (38.5%), and 306 (35.8%) patients were left-censored, interval-censored, and right-censored, respectively. Exact disease progression times were known for only 52 (6.1%) patients.

Since this data set was collected from 185 clinic centers with a range of 1–23 patients in each center, it can be understood as general multivariate PIC data. Figure 1, computed using a modified self-consistency approach for general interval-censored data (Choi, Kim, and Choi 2021), displays the nonparametric PFS curves for the panitumumab + FOLFIRI and FOLFIRI alone groups. We observe that panitumumab + FOLFIRI can achieve higher survival rates than FOLFIRI alone during the first year, but the two KM curves become almost identical about 1.25 years into treatment. Previously, a Bayesian evaluation using a standard univariate PH model (Pan, Cai, and Wang 2020), which ignored cluster effects, revealed that the treatment effect is statistically significant (Coef = –0.215; CI = –0.384, –0.046), while the KRAS status is not significant (Coef = 0.163; CI = –0.006, 0.332).

FIGURE 1. Nonparametric Kaplan–Meier curves (based on a self-consistency equation) estimating progression-free survival probabilities for the “panitumumab+FOLFIRI” versus “FOLFIRI” groups in the mCRC data.

By considering the potential correlation within each clinical site, we alternatively fitted the following multivariate CQR model for the log-transformed PFS with two covariates:

\begin{align*} \text{log-PFS}_{ik} = \beta _0(\tau) + \beta _1(\tau) \times \text{TRT}_{ik} & +\beta _2(\tau)\times \text{KRAS}_{ik} + e_{ik}(\tau),\nobreakspace\\ i & = 1,\ldots,185,\nobreakspace k=1,\ldots,c_i, \end{align*}

via the proposed IPCW approach, with the cluster-size adjustment weights

\eta _i=1

(unadjusted), or

\eta _i=1/c_i

(adjusted). As a preliminary analysis, we first applied Cox's PH model, respectively, to the left endpoint (

U

) and right endpoint (

V

) of observed time intervals to check whether their distributions depend on any covariates. We found that the effects of the two covariates on both

U

and

V

were significant at the 0.1 level. Therefore, we used Beran's (1981) local KM estimates to compute the individual weights given the covariates. Standard errors in this case were computed via a cluster-wise bootstrap with 100 bootstrap samples.
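
A cluster-wise bootstrap resamples whole clinical sites rather than individual patients, so that within-site correlation is preserved in each replicate. The sketch below illustrates the idea for a generic scalar estimator; `pooled_mean` is a hypothetical stand-in for the CQR fit, and the function names and data layout are assumptions of this example rather than the authors' code.

```python
import random
import statistics

def cluster_bootstrap_se(clusters, estimator, n_boot=100, seed=0):
    """Standard error of a scalar estimator from resampling whole clusters."""
    rng = random.Random(seed)
    n = len(clusters)
    estimates = []
    for _ in range(n_boot):
        # Draw n clusters with replacement; all members of a drawn cluster are kept.
        resampled = [clusters[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resampled))
    return statistics.stdev(estimates)

def pooled_mean(clusters):
    """Placeholder estimator: mean over all members of all clusters."""
    values = [v for members in clusters for v in members]
    return sum(values) / len(values)
```

In an application, `estimator` would refit the weighted quantile regression on the resampled sites and return one coefficient at a given $\tau$ ; the standard deviation across replicates then serves as the standard error in a Wald-type interval.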

Figure 2 presents the point estimates and 95% Wald-type confidence intervals for two covariates at different quantile levels of $\tau \in [0.1,0.9]$ , when the cluster size is adjusted or not. Overall, panitumumab + FOLFIRI does not improve PFS significantly at most quantile levels, and also the difference in KRAS status is not statistically significant, with or without the adjustment of the cluster effect. Panitumumab + FOLFIRI appears to be more effective than FOLFIRI alone in controlling disease progression only at low quantile levels. This observation can also be confirmed by the KM plot in Figure 1, which shows that panitumumab + FOLFIRI can improve PFS only for the first year after treatment. Note that the analysis results do not change significantly whether or not the cluster size is adjusted. However, comparisons of cluster size adjustments for treatment (a vs. b), and KRAS status (c vs. d) in Figure 2 reveal that the quantile coefficients are much more smoothly distributed when the cluster size is adjusted. This implies that some heterogeneity may exist across different clinical sites, and the cluster size adjustment would help achieve standardized results.

A drawback of the proposed IPCW estimator is that the estimation procedure uses only the exactly observed survival times; the censored observations enter only through the inverse weights and do not otherwise contribute to estimation or statistical efficiency. Since the proportion of individuals with exact PFS time is only 6.1% in this data set, the IPCW approach is expected to produce unbiased results but with low statistical precision. This may partly explain why our results for treatment are slightly different from those of the univariate Cox PH analysis. One might consider an augmentation-based estimation method, but when the censoring variables depend on baseline covariates (as in our case), its derivation becomes too complicated to be practically feasible. Nevertheless, our CQR procedure is computationally reliable and not very sensitive to the sample size.

7 Discussion

This paper proposes an IPCW-based estimation method for conducting QR on partially interval-censored data, primarily focusing on DC and PIC endpoints. We demonstrate that the nonparametric left-censored survivor function can be estimated with conventional KM approaches by reading survival data backward in time. Furthermore, we develop an augmentation-based estimation and extend the method to accommodate multivariate partially interval-censored data. The proposed methods can easily be implemented with existing software packages for QR or $l_1$-type linear programming. Although we restrict our attention to interval-CQR at a single quantile level, the proposed weighting scheme can be immediately applied to other settings with interval-censoring, such as medical costs (Bang and Tsiatis 2002), competing risks (Choi, Kang, and Huang 2018), time-dependent covariates (Gorfine, Goldberg, and Ritov 2017), AFT models (Komárek and Lesaffre 2008; Gao, Zeng, and Lin 2017; Choi, Kim, and Choi 2021), and composite QR (Zou and Yuan 2008).
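
The idea of reading survival data backward in time can be illustrated with a minimal Python sketch. It treats a generic variable $X$ observed as $\max(X, C)$, negates the observed values so that left-censoring becomes right-censoring, and applies an ordinary product-limit estimator. The function names `kaplan_meier` and `left_censored_cdf` and the toy data are hypothetical, and the sketch ignores covariates; it is not the exact estimator used in this paper.

```python
def kaplan_meier(times, events):
    """Product-limit estimate; returns a function s -> estimated P(T > s)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    steps = []
    surv = 1.0
    for s in event_times:
        at_risk = sum(1 for t in times if t >= s)
        deaths = sum(1 for t, e in zip(times, events) if e and t == s)
        surv *= 1.0 - deaths / at_risk
        steps.append((s, surv))

    def survival(s):
        value = 1.0
        for time, level in steps:
            if time <= s:
                value = level
            else:
                break
        return value

    return survival


def left_censored_cdf(observed, exact):
    """Estimate P(X < t) for left-censored X by running KM on negated times."""
    # -max(X, C) = min(-X, -C), so the negated data are right-censored.
    km = kaplan_meier([-y for y in observed], exact)
    return lambda t: km(-t)


# Toy data (hypothetical): the smallest value is left-censored (exact = 0).
observed = [1.0, 2.0, 3.0, 4.0]
exact = [0, 1, 1, 1]
cdf = left_censored_cdf(observed, exact)
```

Because the product-limit estimator of $-X$ evaluated at $-t$ estimates $P(-X > -t) = P(X < t)$, no separate algorithm is needed for the left-censored case.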

As pointed out by a reviewer, the quantile level ($\tau$) of interest is determined by the investigators. Depending on the nature of a study, one can choose a desired $\tau$, but quantiles at extreme levels, such as $\tau =0.01$ or 0.99, may not be well estimated unless the sample size is sufficiently large. Even though the data are subject to a certain type of censoring, the underlying distribution function can be well identified, and the quantile corresponding to each quantile level can be estimated, unless the censoring rate is too high or other complications, such as competing risks, are present.

One of the necessary requirements of the IPCW-based approach is a nonnegligible proportion of exact failure time observations, which is also crucial in establishing asymptotic results of the proposed estimator and constructing computational algorithms. This is because the IPCW approach typically compensates for censored subjects by giving more weight to subjects with similar characteristics who are not censored. Thus, the proposed method is not applicable to fully interval-censored or current status data, which contain no exact failure times. Our experience is that our QR procedure is not very sensitive to the censoring rate, but high censoring rates may lead to a loss of statistical precision. In this case, the augmentation-based estimator can perform better than the IPCW estimator in both estimation and statistical inference, although the improvement is modest. In our data example, the proportion of the “effective” data samples was only 6.1%, and as a result, the IPCW-based estimators are less statistically efficient than Cox PH estimators.
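
To make the reweighting concrete, the following Python sketch computes an intercept-only IPCW quantile: each exactly observed subject receives weight $1/p_i$, where $p_i$ is the probability of being uncensored at its failure time, and the estimate is the smallest $b$ with $n^{-1}\sum _i w_i I(T_i \le b) \ge \tau$. The function name `ipcw_quantile` and the toy inputs are hypothetical, and the probabilities $p_i$ are taken as given here, whereas in practice they would be estimated.

```python
def ipcw_quantile(times, exact, prob_uncensored, tau):
    """Intercept-only IPCW quantile estimate.

    times           : observed values (exact or censored)
    exact           : 1 if the failure time was observed exactly, else 0
    prob_uncensored : assumed-known probability of being uncensored at times[i]
    tau             : quantile level in (0, 1)
    """
    n = len(times)
    # Only exact observations contribute, each inflated by its inverse probability.
    weighted = sorted(
        (t, d / p) for t, d, p in zip(times, exact, prob_uncensored) if d
    )
    cumulative = 0.0
    for t, w in weighted:
        cumulative += w / n
        if cumulative >= tau:
            return t
    return None  # tau cannot be reached with the available exact observations


# Toy data (hypothetical): the second subject is censored.
median_est = ipcw_quantile(
    [1.0, 2.0, 3.0, 4.0], [1, 0, 1, 1], [1.0, 0.5, 0.5, 0.5], 0.5
)
```

The `None` branch reflects the requirement discussed above: with too few exact observations, the weighted distribution function may never reach the target level.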

Recently, De Backer, Ghouch, and Van Keilegom (2019) proposed an alternative estimating approach for CQR with an adapted quantile loss function. For right-censored data, they argued that a consistent estimator of the QR coefficient $\bm{\beta }(\tau)$ could be obtained by minimizing the following objective function:

\begin{eqnarray} \tilde{L}_n(\bm{\beta },\tau)=n^{-1/2} \sum _{i=1}^n{\left\lbrace \rho _\tau (\tilde{T}_i-{\bf x}_i^T\bm{\beta })-(1-\tau) \int _{-\infty }^{{\bf x}_i^T\bm{\beta }} \hat{F}_R(t|{\bf x}_i) dt \right\rbrace} . \nonumber\\ \end{eqnarray}

(13)

Notice that formulation (13) allows us to extract the information of every observation at hand, even if confronted with incompleteness from right-censoring. Therefore, one could expect that the solution to (13) will be more efficient than the basic IPCW estimator, especially when the censoring proportion is large. In the same spirit, we may construct an alternative quantile loss function for DC data as

\begin{eqnarray} L_n(\bm{\beta },\tau) &=& n^{-1/2} \displaystyle \sum _{i=1}^n{\left[\rho _\tau (\tilde{T}_i-{\bf x}_i^T\bm{\beta })- (1-\tau)\displaystyle \int _{-\infty }^{{\bf x}_i^T\bm{\beta }}\hat{F}_R(t|{\bf x}_i)dt\right.}\nonumber\\ &&{\left. +\, \tau \displaystyle \int _{-\infty }^{{\bf x}_i^T\bm{\beta }} \hat{S}_L(t|{\bf x}_i)dt \right]}, \end{eqnarray}

(14)

where we leverage the fact that, under the independence assumption $T \perp \!\!\!\perp G|{\bf x}$, the following equality holds:

\begin{align*} E[I(\tilde{T}>t)|{\bf x}] &= E[\lbrace 1-\tau I(L<t)\rbrace I(R>t)|{\bf x}] \\& =\,(1-\tau)S_R(t|{\bf x})+\tau S_L(t|{\bf x}) \end{align*}

for $\tilde{T}=(T\vee L)\wedge R$. The theoretical and empirical properties of the new estimator from the adapted quantile loss function (14) deserve further investigation and will be studied in future research.

Acknowledgments

The authors thank the anonymous AE and two reviewers, whose insightful comments led to a substantially improved presentation of the manuscript. The colorectal cancer data were derived based on raw data sets obtained from www.projectdatasphere.org, which is maintained by Project Data Sphere, LLC. Neither Project Data Sphere, LLC nor the owner(s) of any information from the website have contributed to, approved, or are in any way responsible for the contents of this publication. The research of Dr. T. Choi was supported by a grant from the National Research Foundation (NRF) of Korea (RS-2024-00340298). The research of Dr. S. Choi was supported by a Korea University grant (K2201231) and a grant from the National Research Foundation (NRF) of Korea (2022M3J6A1063595, 2022R1A2C1008514). Dr. Bandyopadhyay acknowledges partial funding support from the grants awarded by the US National Institutes of Health (R21DE031879, R01DE031134).

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A: Asymptotic Results

This section provides asymptotic results of the proposed estimator for doubly censored (DC) data. We first impose the following regularity conditions:

(C1) The joint distribution function of $(L,R)$ is continuous. There exists $v\in (0,\infty)$ such that $P(R-L > v |{\bf x}) =1$ . There also exist $-\infty < v_1 \le v_2 \le v<\infty$ such that $P(v_1 < L\le v_2|{\bf x}) = 1$ and $P(R\le v | {\bf x}) = 1$ .
(C2) The covariate ${\bf x}$ is uniformly bounded, that is, $\sup _i \Vert {\bf x}_i \Vert <\infty$ .
(C3) (i) The quantile coefficient $\bm{\beta }_0(\tau)$ is Lipschitz continuous for $\tau \in [\tau _L, \tau _R]$ ; (ii) $f(t|{\bf x})$ is bounded above uniformly in $t$ and ${\bf x}$ , where $f(t|{\bf x}) = dF(t|{\bf x})/dt$ .
(C4) For some $\rho _0 > 0$ and $c_0 > 0$ , $\inf _{\bm{\beta }\in \mathbb {B}(\rho _0)}\text{eigmin} \nobreakspace ({\bf A}\lbrace \bm{\beta }(\tau)\rbrace) \ge c_0$ , where $\mathbb {B}(\rho) = \lbrace \inf _{\tau \in [\tau _L,\tau _R]} \Vert \bm{\beta }(\tau)-\bm{\beta }_0(\tau)\Vert \le \rho:\bm{\beta }\in \mathbb {R}^{p} \rbrace$ and ${\bf A}\lbrace \bm{\beta }(\tau)\rbrace = E[{\bf x}^{\otimes 2} f({\bf x}^T\bm{\beta }|{\bf x})]$ where eigmin $(\cdot)$ denotes the minimum eigenvalue of a matrix.

In the following, we omit $\tau$ in $\hat{\bm{\beta }}(\tau)$ for notational simplicity, bearing in mind that all coefficients are $\tau$-specific. To avoid tail instability, we restrict the range of $\tau$ to $0<\tau _L\le \tau \le \tau _R<1$. We first fix some notation for establishing our asymptotic results. For right-censoring, define $N_i^R(t) = I(\tilde{T}_i \le t, \delta _{2i}=1), Y_i(t) = I(\tilde{T}_i \ge t)$, and $y(t) = P(\tilde{T} \ge t)$. Then, the corresponding martingale process is $M_i^R(t) = N_i^R(t) - \int _{-\infty }^tY_i(u)d\Lambda ^R(u)$, where $\Lambda ^R(t) = \int _{-\infty }^t\lambda ^R(u)du,\nobreakspace \lambda ^R(t) = \lim _{h\rightarrow 0}P\lbrace \tilde{T}\in (t,t+h)|\tilde{T}\ge t \rbrace /h$. For left-censoring, define $N_i^L(t) = I(\tilde{T}_i \ge t, \delta _{3i}=1)$, with the corresponding martingale process $M_i^L(t) = N_i^L(t) - \int _{t}^\infty \lbrace n+1-Y_i(u)\rbrace d\Lambda ^L(u)$, where $\Lambda ^L(t) = \int _t^\infty \lambda ^L(u)du,\nobreakspace \lambda ^L(t) = \lim _{h\rightarrow 0}P\lbrace \tilde{T}\in (t-h,t)|\tilde{T}\le t \rbrace /h$.

Theorem A.1. Under regularity conditions (C1)–(C4), $\lim _{n\rightarrow \infty } \sup _{\tau \in [\tau _L,\tau _R]} \Vert \hat{\bm{\beta }}(\tau) - \bm{\beta }_0(\tau)\Vert \rightarrow _{p} 0$, assuming model (2) holds for $\tau \in [\tau _L,\tau _R]$.

Proof. Define ${\bf U}_n^S(\bm{\beta },\tau) = n^{-1/2}\sum _{i=1}^n{\bm \xi }_{1,i}(\tau)$ and ${\bf U}_0(\bm{\beta },\tau) = E\lbrace n^{-1/2}{\bf U}_n^F(\bm{\beta },\tau)\rbrace$, where ${\bm \xi }_{1,i}(\tau) = {\bf x}_i \lbrace {\delta _{1i}}\lbrace S_R(\tilde{T}_i) - S_L(\tilde{T}_i) \rbrace ^{-1} I(\tilde{T}_i \le {\bf x}_i^T\bm{\beta }) - \tau \rbrace$ and ${\bf U}_n^F(\bm{\beta },\tau) = n^{-1/2}\sum _{i=1}^n{\bf x}_i\lbrace F({\bf x}_i^T\bm{\beta }|{\bf x}_i) -\tau \rbrace$. In the sequel, we use $\sup _{\bm{\beta }}$ and $\sup _{\tau }$ to denote the supremum taken over $\bm{\beta }\in \mathbb {R}^{p}$ and $\tau \in [\tau _L,\tau _R]$, respectively.

First, by condition (C1), for every $0< r<1/2$ , we have $\sup _{t<v}| \hat{S}_R(t) -S_R(t)| = o_p(n^{-1/2+ r})$ and $\sup _{t<v}| \hat{S}_L(t) -S_L(t)| = o_p(n^{-1/2+ r})$ . This, allied with condition (C2), implies that

\begin{equation*} \sup _{\bm{\beta },\tau } \Vert n^{-1/2}\lbrace {\bf U}_n(\bm{\beta },\tau)-{\bf U}_n^S(\bm{\beta },\tau)\rbrace \Vert = o_p(n^{-1/2+r}). \end{equation*}

Define $\mathcal {A}= \lbrace {\bf x}_i \lbrace \delta _{1i} \lbrace S_R(\tilde{T}_i) -S_L(\tilde{T}_i)\rbrace ^{-1}I(\tilde{T}_i \le {\bf x}_i^T\bm{\beta }) - \tau \rbrace:\bm{\beta }\in \mathbb {R}^{p},\tau \in [\tau _L,\tau _R]\rbrace$. This function class is Donsker and thus Glivenko–Cantelli (van der Vaart and Wellner 1996), because the class of indicator functions is Donsker and the three quantities ${\bf x}_i$, $S_R(\tilde{T}_i)$, and $S_L(\tilde{T}_i)$ are uniformly bounded. Therefore, from the Glivenko–Cantelli theorem, we have that $\sup _{\bm{\beta },\tau }\Vert n^{-1/2} {\bf U}_n^S(\bm{\beta },\tau) - {\bf U}_0(\bm{\beta },\tau)\Vert = o_p(1)$. Combining these two results, we obtain

\begin{equation} \sup _{\bm{\beta },\tau } \Vert n^{-1/2}{\bf U}_n(\bm{\beta },\tau) - {\bf U}_0(\bm{\beta },\tau) \Vert = o_p(1). \end{equation}

(A.1)

Second, note that for any ${\bf b}\in \mathbb {R}^{p}$ satisfying $\Vert {\bf b}\Vert =1$ , ${\bf b}^T{\bf U}_0(\bm{\beta }_0 + {\bf b}\delta,\tau)$ is a nondecreasing function in $\delta$ . Then, for $\delta \ge \rho _0>0$ , ${\bf b}^T [ {\bf U}_0(\bm{\beta }_0 + {\bf b}\delta, \tau) - {\bf U}_0(\bm{\beta }_0,\tau) ] \ge {\bf b}^T [{\bf U}_0(\bm{\beta }_0 + {\bf b}\rho _0, \tau) - {\bf U}_0(\bm{\beta }_0,\tau) ]\ge 0$ . By the Cauchy–Schwarz inequality and condition (C4),

\begin{align*} & \Vert {\bf U}_0(\bm{\beta }_0 + {\bf b}\delta, \tau) - {\bf U}_0(\bm{\beta }_0,\tau)\Vert ^2 \cdot \Vert {\bf b}\Vert ^2 \ge ({\bf b}^T [ {\bf U}_0(\bm{\beta }_0 + {\bf b}\delta, \tau) - {\bf U}_0(\bm{\beta }_0,\tau) ])^2\\ & \quad \ge ({\bf b}^T [ {\bf U}_0(\bm{\beta }_0 + {\bf b}\rho _0, \tau) - {\bf U}_0(\bm{\beta }_0,\tau) ])^2\\ & \quad = ({\bf b}^T {\bf A}(\bm{\beta }_0 + {\bf b}\rho ^*){\bf b})^2\rho _0^2 \nobreakspace \ge \nobreakspace c_0^2\rho _0^2\nobreakspace >\nobreakspace 0 \end{align*}

for some $\rho ^*\in [0,\rho _0]$. Since $\bm{\beta }_0 + {\bf b}\rho ^* \in \mathbb {B}(\rho _0)$, the last inequality above follows from condition (C4). Therefore, we have $\inf _{\bm{\beta }\not\in \mathbb {B}(\rho _0)} \Vert {\bf U}_0(\bm{\beta },\tau) - {\bf U}_0(\bm{\beta }_0,\tau)\Vert \ge c_0\rho _0$. By using the facts ${\bf U}_n(\hat{\bm{\beta }},\tau) = o_p(n^{-1/2})$, ${\bf U}_0(\bm{\beta }_0, \tau) =0$, and (A.1), we can easily show that

\begin{equation} {\bf U}_0(\hat{\bm{\beta }},\tau) - {\bf U}_0(\bm{\beta }_0,\tau) = o_p(1), \end{equation}

(A.2)

and thus there exists an $N_0 >0$ such that for $n\ge N_0$, $\sup _\tau \Vert {\bf U}_0(\hat{\bm{\beta }},\tau) - {\bf U}_0(\bm{\beta }_0,\tau) \Vert < c_0\rho _0$. Consequently, for $\tau \in [\tau _L,\tau _R]$, $\hat{\bm{\beta }}$ belongs to $\mathbb {B}(\rho _0)$ with probability one when $n$ is large enough. Moreover, a Taylor expansion of ${\bf U}_0(\hat{\bm{\beta }},\tau)$ around $\bm{\beta }_0$ yields

\begin{equation*} \sup _\tau \Vert \hat{\bm{\beta }}- \bm{\beta }_0\Vert = \sup _\tau \big \Vert {\bf A}(\check{\bm{\beta }})^{-1} \lbrace {\bf U}_0(\hat{\bm{\beta }},\tau) - {\bf U}_0(\bm{\beta }_0,\tau) \rbrace \big \Vert, \end{equation*}

where $\check{\bm{\beta }}$ lies between $\hat{\bm{\beta }}$ and $\bm{\beta }_0$ and is thus an element of $\mathbb {B} (\rho _0)$ for large $n$. Therefore, the desired uniform consistency can be derived by applying (A.2) and condition (C4) to the above display.
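
One way to make this last step explicit is sketched below, reading (A.2) as holding uniformly in $\tau$. Since ${\bf A}(\check{\bm{\beta }})$ is symmetric by its definition in (C4) and its minimum eigenvalue is at least $c_0$ on $\mathbb {B}(\rho _0)$, the operator norm of its inverse is at most $c_0^{-1}$, so that

\begin{align*} \sup _\tau \Vert \hat{\bm{\beta }}- \bm{\beta }_0\Vert & \le \sup _\tau \Vert {\bf A}(\check{\bm{\beta }})^{-1}\Vert \cdot \sup _\tau \Vert {\bf U}_0(\hat{\bm{\beta }},\tau) - {\bf U}_0(\bm{\beta }_0,\tau) \Vert \\ & \le c_0^{-1} \sup _\tau \Vert {\bf U}_0(\hat{\bm{\beta }},\tau) - {\bf U}_0(\bm{\beta }_0,\tau) \Vert = o_p(1). \end{align*}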

\Box

Lemma A.1. For any positive sequence $\lbrace d_n\rbrace _{n=1}^\infty$ satisfying $d_n\rightarrow 0$,

\begin{align*} & \lim _{n\rightarrow \infty }\sup _{\bm{\beta },\bm{\beta }^{\prime }\in \mathbb {B}(\rho _0),\Vert \bm{\beta }-\bm{\beta }^{\prime }\Vert \le d_n} \Vert n^{-1/2}\sum _{i=1}^n {\left[{\bf x}_i \delta _{1i} \lbrace I(T_i\le {\bf x}_i^T\bm{\beta })-I(T_i\le {\bf x}_i^T\bm{\beta }^{\prime })\rbrace \right]} \\ &\! -n^{1/2} \lbrace {\bf U}_0(\bm{\beta },\tau) - {\bf U}_0(\bm{\beta }^{\prime },\tau)\rbrace \Vert = 0,\nobreakspace a.s. \end{align*}

Proof. This lemma can be proved by using the results in Alexander (1984) and arguments similar to those for Theorem 1 of Lai and Ying (1988). Thus, the detailed derivation is omitted. It is noted that there exist $S_{R}(\tilde{T}_i)>0$ and $S_{L}(\tilde{T}_i)>0$ such that

\begin{equation*} \text{var}({\bf x}_i[I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta })\delta _{1i}-I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }^{\prime })\delta _{1i}]) \le |S_{R}(\tilde{T}_i)|\cdot \Vert \bm{\beta }-\bm{\beta }^{\prime }\Vert, \end{equation*}

and

\begin{equation*} \text{var}({\bf x}_i[I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta })\delta _{1i}-I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }^{\prime })\delta _{1i}]) \le |S_{L}(\tilde{T}_i)|\cdot \Vert \bm{\beta }-\bm{\beta }^{\prime }\Vert . \end{equation*}

This can be proved using the boundedness of ${\bf x}$ and $\mathbb {B} (\rho _0)$ from conditions (C2) and (C3).

\Box

Theorem A.2. Under regularity conditions (C1)–(C4), $n^{1/2} \lbrace \hat{\bm{\beta }}(\tau) - \bm{\beta }_0(\tau)\rbrace$ converges weakly to a zero-mean Gaussian process for $\tau \in [\tau _L,\tau _R]$ with covariance

\begin{equation*} \Psi (\tau ^{\prime },\tau) = {\bf A}\lbrace \bm{\beta }_0(\tau ^{\prime })\rbrace ^{-1} E\lbrace {\bm \xi }_1(\tau ^{\prime }) {\bm \xi }_1(\tau)^T \rbrace ({\bf A}\lbrace \bm{\beta }_0(\tau)\rbrace ^{-1})^T. \end{equation*}

Proof. From Fleming and Harrington (1991) and Gómez, Julià, and Utzet (1994), we obtain

\begin{equation*} \sup _{t\in [0,v)}{\left\Vert n^{1/2}\lbrace \hat{S}_R(t) - S_R(t) \rbrace - n^{-1/2}\sum _{i=1}^{n}S_R(t) \int _{-\infty }^t y(u)^{-1}dM_i^R(u) \right\Vert} \rightarrow 0 \end{equation*}

and

\begin{eqnarray*} && \sup _{t\in [0,v)} {\left\Vert n^{1/2}\lbrace \hat{S}_L(t) - S_L(t) \rbrace - n^{-1/2} \sum _{i=1}^{n}\lbrace 1-S_L(t)\rbrace \right.}\\ &&\quad \times {\left. \int _{t}^\infty \lbrace 1-y(u)\rbrace ^{-1}dM_i^L(u)\right\Vert} \rightarrow 0. \end{eqnarray*}

Using empirical process arguments similar to those in the proof of Theorem 1, it can be easily seen that

\begin{equation*} \sup _{\bm{\beta }\in \mathbb {R}^{p},\nobreakspace t\in [0,v)} {\left\Vert n^{-1} \sum _{i=1}^n \dfrac{\delta _{1i}}{S_R(\tilde{T}_i) - S_L(\tilde{T}_i)} {\bf x}_iY_i(t)I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }) - R_1(\bm{\beta },t) \right\Vert} \rightarrow 0, \end{equation*}

where $R_1(\bm{\beta },t) = E[\delta _1{\bf x}y(t)I(\tilde{T} \le {\bf x}^T\bm{\beta }) \lbrace S_R(\tilde{T}) - S_L(\tilde{T})\rbrace ^{-1}]$, and

\begin{eqnarray*} && \sup _{\bm{\beta }\in \mathbb {R}^{p},\nobreakspace t\in [0,v)} {\left\Vert n^{-1} \sum _{i=1}^n \dfrac{\delta _{1i}}{S_R(\tilde{T}_i) - S_L(\tilde{T}_i)} {\bf x}_i \right.}\\ &&\quad { \lbrace 1-Y_i(t)\rbrace I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }) - R_2(\bm{\beta },t)\Bigg \Vert} \rightarrow 0, \end{eqnarray*}

where $R_2(\bm{\beta },t) = E[\delta _1{\bf x}\lbrace 1-y(t)\rbrace I(\tilde{T} \le {\bf x}^T\bm{\beta }) \lbrace S_R(\tilde{T}) - S_L(\tilde{T})\rbrace ^{-1}]$.

Now, we use $\approx$ for asymptotic equivalence uniformly in $\tau \in [\tau _L,\tau _R]$. It follows from standard asymptotic arguments that

\begin{align*} & {\bf U}_n(\bm{\beta }_0,\tau) ={\bf U}_n^S(\bm{\beta }_0,\tau) + \lbrace {\bf U}_{n}(\bm{\beta }_0,\tau) - {\bf U}_n^S(\bm{\beta }_0,\tau)\rbrace \\ &\quad= n^{-1/2} \sum _{i=1}^n{\bm \xi }_{1,i}(\tau) \\ &\quad + n^{-1/2}\sum _{i=1}^n \dfrac{ (S_R(\tilde{T}_i)-\hat{S}_R(\tilde{T}_i)) + (\hat{S}_L(\tilde{T}_i) - S_L(\tilde{T}_i)) }{(\hat{S}_R(\tilde{T}_i)-\hat{S}_L(\tilde{T}_i)) (S_R(\tilde{T}_i) - S_L(\tilde{T}_i))} \delta _{1i}{\bf x}_i I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }_0) \\ &\quad\approx n^{-1/2} \sum _{i=1}^n{\bm \xi }_{1,i}(\tau) \\ &\quad - n^{-1}\sum _{i=1}^n \dfrac{n^{-1/2} \sum _{j=1}^{n}\int _{-\infty }^\infty Y_i(u) dM_j^R(u)/y(u)} {S_R(\tilde{T}_i)- S_L(\tilde{T}_i)} \delta _{1i}{\bf x}_i I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }_0)\\ &\quad + n^{-1}\sum _{i=1}^n \dfrac{n^{-1/2} \sum _{j=1}^{n}\int _{-\infty }^\infty \lbrace 1-Y_i(u)\rbrace dM_j^L(u)/ \lbrace 1-y(u)\rbrace }{S_R(\tilde{T}_i)- S_L(\tilde{T}_i)} \\ &\quad \ \ \ \times \delta _{1i}{\bf x}_i I(\tilde{T}_i\le {\bf x}_i^T\bm{\beta }_0)\\ &\quad\approx n^{-1/2} \sum _{i=1}^n{\bm \xi }_{1,i} (\tau)\\ &\quad - n^{-1/2}\sum _{i=1}^n\int _{-\infty }^\infty {\left\lbrace \frac{1}{n} \sum _{j=1}^n\dfrac{\delta _{1j}{\bf x}_jY_j(u)I(\tilde{T}_j\le {\bf x}_j^T\bm{\beta }_0)}{S_R(\tilde{T}_j)-S_L(\tilde{T}_j)}\right\rbrace} \dfrac{dM_i^R(u)}{y(u)} \\ & \quad + n^{-1/2}\sum _{i=1}^n\int _{-\infty }^\infty {\left\lbrace \frac{1}{n} \sum _{j=1}^n\dfrac{\delta _{1j}{\bf x}_j\lbrace 1-Y_j(u)\rbrace I(\tilde{T}_j\le {\bf x}_j^T\bm{\beta }_0)}{S_R(\tilde{T}_j)-S_L(\tilde{T}_j)}\right\rbrace} \dfrac{dM_i^L(u)}{1-y(u)} \\ &\quad\approx n^{-1/2}\sum _{i=1}^{n}{\bm \xi }_{1,i} (\tau) -n^{-1/2}\sum _{i=1}^{n} \int _{-\infty }^\infty q_1(\bm{\beta }_0,u) dM_i^R(u)\\ &\quad + n^{-1/2}\sum _{i=1}^{n} \int _{-\infty }^\infty q_2(\bm{\beta }_0,u) dM_i^L(u)\\ &\quad= n^{-1/2}\sum _{i=1}^{n}\lbrace {\bm \xi }_{1,i}(\tau) + {\bm \xi }_{2,i}(\tau) + {\bm \xi }_{3,i}(\tau)\rbrace \equiv n^{-1/2}\sum _{i=1}^n{\bm \xi }_i(\tau), \end{align*}

where ${\bm \xi }_{2,i}(\tau) = \int _{-\infty }^\infty q_1(u) dM_i^R(u)$ with $q_1(u) = -{R_1(\bm{\beta }_0,u)}/{y(u)}$, ${\bm \xi }_{3,i}(\tau) = \int _{-\infty }^\infty q_2(u) dM_i^L(u)$ with $q_2(u) = {R_2}(\bm{\beta }_0,u)/\lbrace 1-y(u)\rbrace$, and ${\bm \xi }_i(\tau) = {\bm \xi }_{1,i}(\tau) + {\bm \xi }_{2,i}(\tau) + {\bm \xi }_{3,i}(\tau)$.

We claim that the function classes $\mathcal {A}_1 = \lbrace {\bm \xi }_{1,i}(\tau),\tau \in [\tau _L,\tau _R] \rbrace$, $\mathcal {A}_2 = \lbrace {\bm \xi }_{2,i}(\tau),\tau \in [\tau _L,\tau _R] \rbrace$, and $\mathcal {A}_3 = \lbrace {\bm \xi }_{3,i}(\tau),\tau \in [\tau _L,\tau _R] \rbrace$ are Donsker. First, given the Lipschitz continuity of $\bm{\beta }_0$ implied by condition (C3), we show that $\mathcal {A}_1$ is Donsker by applying arguments similar to those for $\mathcal {A}$ and using the preservation of the Donsker property under Lipschitz transformations (Theorem 2.10.6 of van der Vaart and Wellner 1996). Note that $\int _{-\infty }^\infty q_1(u)dM_i^R(u)$ and $\int _{-\infty }^\infty q_2(u)dM_i^L(u)$ are Lipschitz in $\bm{\beta }$ due to convexity in $\bm{\beta }$. The Donsker property of $\mathcal {A}_2$ and $\mathcal {A}_3$ then follows similarly. Therefore, from the Donsker theorem (Section 2.8.2 of van der Vaart and Wellner 1996), ${\bf U}_n(\bm{\beta }_0,\tau)$ converges weakly to a zero-mean Gaussian process with covariance matrix $\bm \Gamma (\tau ^{\prime },\tau) = E \lbrace {\bm \xi }_1(\tau ^{\prime }) {\bm \xi }_1(\tau)\rbrace$.

Finally, we can write ${\bf U}_n(\hat{\bm{\beta }},\tau) - {\bf U}_n(\bm{\beta }_0,\tau) = \text{(I) + (II)}$ , where

\begin{align*} \text{(I)} &= n^{-1/2}\sum _{i=1}^n \frac{\delta _{1i}}{ S_R(\tilde{T}_i)-S_L(\tilde{T}_i) } {\bf x}_i \lbrace I(\tilde{T}_i \le {\bf x}_i^T\hat{\bm{\beta }}) -I(\tilde{T}_i \le {\bf x}_i^T\bm{\beta }_0) \rbrace; \\ \text{(II)} &= n^{-1/2}\sum _{i=1}^n \delta _{1i} {\left\lbrace \frac{1}{\hat{S}_R(\tilde{T}_i) - \hat{S}_L(\tilde{T}_i)} - \dfrac{1}{S_R(\tilde{T}_i) - S_L(\tilde{T}_i)} \right\rbrace} {\bf x}_i \lbrace I(\tilde{T}_i \le {\bf x}_i^T\hat{\bm{\beta }})\\&\quad -\,I(\tilde{T}_i \le {\bf x}_i^T\bm{\beta }_0) \rbrace . \end{align*}

From Lemma A.1, the uniform consistency of $\hat{\bm{\beta }}$, and the fact that $E[\delta _{1i}/\lbrace S_R(\tilde{T}_i)-S_L(\tilde{T}_i)\rbrace]=1$, we observe that $\text{(I)} \approx n^{1/2} \lbrace {\bf U}_0(\hat{\bm{\beta }},\tau) - {\bf U}_0(\bm{\beta }_0,\tau)\rbrace$. Note that $\sup _i| \lbrace \hat{S}_R(\tilde{T}_i) - \hat{S}_L(\tilde{T}_i) \rbrace ^{-1} - \lbrace S_R(\tilde{T}_i)- S_L(\tilde{T}_i)\rbrace ^{-1} | = o_p(n^{-1/2 + r})$ for any $0<r<1/2$, and $\sup _i\Vert {\bf x}_i\lbrace I(\tilde{T}_i \le {\bf x}_i^T\hat{\bm{\beta }}) -I(\tilde{T}_i \le {\bf x}_i^T\bm{\beta }_0)\rbrace \Vert < \infty$ by condition (C2). The above properties and the uniform consistency of $\hat{S}_R(\cdot)$ and $\hat{S}_L(\cdot)$ for $S_R(\cdot)$ and $S_L(\cdot)$, respectively, imply that ${\bf U}_n(\hat{\bm{\beta }},\tau) - {\bf U}_n(\bm{\beta }_0,\tau)$ is dominated by (I). A Taylor expansion of ${\bf U}_0(\bm{\beta },\tau)$ around $\bm{\beta }= \bm{\beta }_0$ and the uniform consistency of $\hat{\bm{\beta }}$ for $\bm{\beta }_0$ give

\begin{equation*} {\bf U}_n(\hat{\bm{\beta }},\tau) - {\bf U}_n(\bm{\beta }_0,\tau) = \lbrace {\bf A}(\bm{\beta }_0) + r_n(\tau) \rbrace n^{1/2} (\hat{\bm{\beta }}- \bm{\beta }_0), \end{equation*}

where $\sup _\tau \Vert r_n(\tau)\Vert \rightarrow 0$. Given that ${\bf U}_n(\hat{\bm{\beta }},\tau) = o_p(n^{-1/2})$, this further implies that $n^{1/2}(\hat{\bm{\beta }}-\bm{\beta }_0) = -{\bf A}(\bm{\beta }_0)^{-1} {\bf U}_n(\bm{\beta }_0,\tau) + r_n^*(\tau)$, where $\sup _\tau \Vert r_n^*(\tau)\Vert \rightarrow 0$. It then follows that

\begin{equation} n^{1/2} (\hat{\bm{\beta }}-\bm{\beta }_0)\approx n^{-1/2}\sum _{i=1}^n {\bf A}(\bm{\beta }_0)^{-1} {\bm \xi }_i(\tau). \end{equation}

(A.3)

Weak convergence of $n^{1/2}(\hat{\bm{\beta }}-\bm{\beta }_0)$ can be established because $\mathcal {A}_1,\mathcal {A}_2$, and $\mathcal {A}_3$ are Donsker classes, and the Donsker property is preserved under addition and subtraction (Theorem 2.10.6 of van der Vaart and Wellner 1996). We have established the asymptotic results for DC data; extending these results to PIC data is straightforward by adjusting the weighting scheme. By referring to (4) and (5), replacing $\delta _{1i} \lbrace S_R(\tilde{T}_i)-S_L(\tilde{T}_i) \rbrace ^{-1}$ with ${\Delta _{i}} \lbrace {\hat{F}_{V}(\tilde{T}_i|{\bf x}_i) + \hat{S}_{U}(\tilde{T}_i|{\bf x}_i)} \rbrace ^{-1}$ in Theorems A.1 and A.2 yields the asymptotic results for PIC data.

\Box

Open Research

Open Research Badges

Data Availability Statement

The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

This article has earned an Open Data badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available in the Supporting Information section.

This article has earned an open data badge “Reproducible Research” for making publicly available the code necessary to reproduce the reported results. The results reported in this article could fully be reproduced.

Supporting Information

References

Alexander, K. S. 1984. “Probability Inequalities for Empirical Processes and a Law of the Iterated Logarithm.” Annals of Probability 12, no. 4: 1041–1067.
10.1214/aop/1176993141
Bang, H., and A. A. Tsiatis. 2002. “Median Regression With Censored Cost Data.” Biometrics 58, no. 3: 643–649.
10.1111/j.0006-341X.2002.00643.x
Beran, R. 1981. Nonparametric Regression With Randomly Censored Survival Data. Technical Report. University of California, Berkeley.
Bogaerts, K., A. Komárek, and E. Lesaffre. 2017. Survival Analysis With Interval Censored Data: A Practical Approach With Examples in R, SAS, and BUGS. New York: Chapman and Hall/CRC.
10.1201/9781315116945
Cai, T., and S. Cheng. 2004. “Semiparametric Regression Analysis for Doubly Censored Data.” Biometrika 91, no. 2: 277–290.
10.1093/biomet/91.2.277
Chiou, S. H., S. Kang, and J. Yan. 2015. “Rank-Based Estimating Equations With General Weight for Accelerated Failure Time Models: An Induced Smoothing Approach.” Statistics in Medicine 34, no. 9: 1495–1510.
10.1002/sim.6415
Choi, S., T. Choi, H. Cho, and D. Bandyopadhyay. 2022. “Weighted Least-Squares Regression With Competing Risks Data.” Statistics in Medicine 41, no. 2: 227–241.
10.1002/sim.9232
Choi, S., and X. Huang. 2021. “Efficient Inferences for Linear Transformation Models With Doubly Censored Data.” Communications in Statistics–Theory and Methods 50, no. 9: 2188–2200.
10.1080/03610926.2019.1662046
Choi, S., S. Kang, and X. Huang. 2018. “Smoothed Quantile Regression Analysis of Competing Risks.” Biometrical Journal 60, no. 5: 934–946.
10.1002/bimj.201700104
Choi, T., A. K. Kim, and S. Choi. 2021. “Semiparametric Least-Squares Regression With Doubly-Censored Data.” Computational Statistics & Data Analysis 164: 107306.
10.1016/j.csda.2021.107306
Choi, T., S. Park, H. Cho, and S. Choi. 2024. “Interval-Censored Linear Quantile Regression.” Journal of Computational and Graphical Statistics, in press.
10.1080/10618600.2024.2365740
Cong, X. J., G. Yin, and Y. Shen. 2007. “Marginal Analysis of Correlated Failure Time Data With Informative Cluster Sizes.” Biometrics 63, no. 3: 663–672.
10.1111/j.1541-0420.2006.00730.x
De Backer, M., A. E. Ghouch, and I. Van Keilegom. 2019. “An Adapted Loss Function for Censored Quantile Regression.” Journal of the American Statistical Association 114, no. 527: 1126–1137.
10.1080/01621459.2018.1469996
Fleming, T. R., and D. P. Harrington. 1991. Counting Processes and Survival Analysis. New York: John Wiley & Sons.
Frumento, P. 2022. “A Quantile Regression Estimator for Interval-Censored Data.” International Journal of Biostatistics 19, no. 1: 81–96.
10.1515/ijb-2021-0063
Gao, F., D. Zeng, and D.-Y. Lin. 2017. “Semiparametric Estimation of the Accelerated Failure Time Model With Partly Interval-Censored Data.” Biometrics 73, no. 4: 1161–1168.
10.1111/biom.12700
Gómez, G., O. Julià, and F. Utzet. 1994. “Asymptotic Properties of the Left Kaplan-Meier Estimator.” Communications in Statistics–Theory and Methods 23, no. 1: 123–135.
10.1080/03610929408831243
Gorfine, M., Y. Goldberg, and Y. Ritov. 2017. “A Quantile Regression Model for Failure-Time Data With Time-Dependent Covariates.” Biostatistics 18, no. 1: 132–146.
10.1093/biostatistics/kxw036
Groeneboom, P., and K. Hendrickx. 2018. “Current Status Linear Regression.” Annals of Statistics 46, no. 4: 1415–1444.
10.1214/17-AOS1589
Gu, M. G., and C.-H. Zhang. 1993. “Asymptotic Properties of Self-Consistent Estimators Based on Doubly Censored Data.” Annals of Statistics 21, no. 2: 611–624.
10.1214/aos/1176349140
Ishwaran, H., U. B. Kogalur, E. H. Blackstone, and M. S. Lauer. 2008. “Random Survival Forests.” Annals of Applied Statistics 2, no. 3: 841–860.
10.1214/08-AOAS169
Ji, S., L. Peng, Y. Cheng, and H. Lai. 2012. “Quantile Regression for Doubly Censored Data.” Biometrics 68, no. 1: 101–112.
10.1111/j.1541-0420.2011.01667.x
Kim, M. Y., V. G. De Gruttola, and S. W. Lagakos. 1993. “Analyzing Doubly Censored Data With Covariates, With Application to AIDS.” Biometrics 49, no. 1: 13–22.
10.2307/2532598
Koenker, R. 2005. Quantile Regression. Cambridge: Cambridge University Press.
10.1017/CBO9780511754098
Komárek, A., and E. Lesaffre. 2008. “Bayesian Accelerated Failure Time Model With Multivariate Doubly Interval-Censored Data and Flexible Distributional Assumptions.” Journal of the American Statistical Association 103, no. 482: 523–533.
10.1198/016214507000000563
Lai, T. L., and Z. Ying. 1988. “Stochastic Integrals of Empirical-Type Processes With Applications to Censored Regression.” Journal of Multivariate Analysis 27, no. 2: 334–358.
10.1016/0047-259X(88)90134-0
Li, S., T. Hu, P. Wang, and J. Sun. 2018. “A Class of Semiparametric Transformation Models for Doubly Censored Failure Time Data.” Scandinavian Journal of Statistics 45, no. 3: 682–698.
10.1111/sjos.12319
Lin, G., X. He, and S. Portnoy. 2012. “Quantile Regression With Doubly Censored Data.” Computational Statistics & Data Analysis 56, no. 4: 797–812.
10.1016/j.csda.2011.03.009
Pan, C., B. Cai, and L. Wang. 2020. “A Bayesian Approach for Analyzing Partly Interval-Censored Data Under the Proportional Hazards Model.” Statistical Methods in Medical Research 29, no. 11: 3192–3204.
10.1177/0962280220921552
Pan, W. 2000. “A Two-Sample Test With Interval Censored Data via Multiple Imputation.” Statistics in Medicine 19, no. 1: 1–11.
10.1002/(SICI)1097-0258(20000115)19:1<1::AID-SIM296>3.0.CO;2-Q
Peeters, M., T. Price, A. Cervantes, et al. 2010. “Randomized Phase III Study of Panitumumab With Fluorouracil, Leucovorin, and Irinotecan (FOLFIRI) Compared With FOLFIRI Alone as Second-Line Treatment in Patients With Metastatic Colorectal Cancer.” Journal of Clinical Oncology 28, no. 31: 4706–4713.
10.1200/JCO.2009.27.6055
Peng, L., and J. P. Fine. 2009. “Competing Risks Quantile Regression.” Journal of the American Statistical Association 104, no. 488: 1440–1453.
10.1198/jasa.2009.tm08228
Peng, L., and Y. Huang. 2008. “Survival Analysis With Quantile Regression Models.” Journal of the American Statistical Association 103, no. 482: 637–649.
10.1198/016214508000000355
Ren, J.-J., and M. Gu. 1997. “Regression M-Estimators With Doubly Censored Data.” Annals of Statistics 25, no. 6: 2638–2664.
10.1214/aos/1030741089
Robins, J. M., and A. Rotnitzky. 1992. “Recovery of Information and Adjustment for Dependent Censoring Using Surrogate Markers.” In AIDS Epidemiology, 297–331. New York: Springer.
Son, M., T. Choi, S. J. Shin, Y. Jung, and S. Choi. 2022. “Regularized Linear Censored Quantile Regression.” Journal of the Korean Statistical Society 51: 589–607.
10.1007/s42952-021-00155-z
Sun, J. 2007. The Statistical Analysis of Interval-Censored Failure Time Data. New York: Springer.
Turnbull, B. W. 1974. “Nonparametric Estimation of a Survivorship Function With Doubly Censored Data.” Journal of the American Statistical Association 69, no. 345: 169–173.
10.1080/01621459.1974.10480146
van der Vaart, A., and J. Wellner. 1996. Weak Convergence and Empirical Processes: With Applications to Statistics. New York: Springer.
10.1007/978-1-4757-2545-2
Varadhan, R., and P. Gilbert. 2010. “BB: An R Package for Solving a Large System of Nonlinear Equations and for Optimizing a High-Dimensional Nonlinear Objective Function.” Journal of Statistical Software 32: 1–26.
Wang, H. J., and L. Wang. 2009. “Locally Weighted Censored Quantile Regression.” Journal of the American Statistical Association 104, no. 487: 1117–1128.
10.1198/jasa.2009.tm08230
Wang, Y.-G., and Y. Zhao. 2008. “Weighted Rank Regression for Clustered Data Analysis.” Biometrics 64, no. 1: 39–45.
10.1111/j.1541-0420.2007.00842.x
Yang, X., N. N. Narisetty, and X. He. 2018. “A New Approach to Censored Quantile Regression Estimation.” Journal of Computational and Graphical Statistics 27, no. 2: 417–425.
10.1080/10618600.2017.1385469
Yuen, K.-C., J. Shi, and L. Zhu. 2006. “A K-Sample Test With Interval Censored Data.” Biometrika 93, no. 2: 315–328.
10.1093/biomet/93.2.315
Zhang, C.-H., and X. Li. 1996. “Linear Regression With Doubly Censored Data.” Annals of Statistics 24, no. 6: 2720–2743.
10.1214/aos/1032181177
Zhang, J., and D. F. Heitjan. 2006. “A Simple Local Sensitivity Analysis Tool for Nonignorable Coarsening: Application to Dependent Censoring.” Biometrics 62, no. 4: 1260–1268.
10.1111/j.1541-0420.2006.00580.x
Zhou, X., Y. Feng, and X. Du. 2017. “Quantile Regression for Interval Censored Data.” Communications in Statistics–Theory and Methods 46, no. 8: 3848–3863.
10.1080/03610926.2015.1073317
Zou, H., and M. Yuan. 2008. “Composite Quantile Regression and the Oracle Model Selection Theory.” Annals of Statistics 36, no. 3: 1108–1126.
10.1214/07-AOS507

Volume 66, Issue 8

December 2024

e70001

This article also appears in:

"CEN 2023"

Filename	Size	Description
bimj70001-sup-0001-DataCode.zip	4.1 MB	Supporting Information
bimj70001-sup-0002-SuppMat.pdf	333.2 KB	Supporting Information
bimj70001-sup-0003-SuppMat.tex	36.9 KB	Supporting Information

Inverse-Weighted Quantile Regression With Partially Interval-Censored Data

ABSTRACT

1 Introduction

2 Model and Estimation

2.1 Statistical Model for DC and PIC Data

DC Data:

PIC Data:

2.2 Estimation

2.3 Asymptotic Results

2.4 Variance Estimation via Induced Smoothing

3 Augmentation-Based Estimation

4 Extension to Multivariate DC Data

5 Simulation Results

5.1 Univariate Partially Interval-Censored Data

5.2 Multivariate Partially Interval-Censored Data

6 Application: mCRC Data

7 Discussion

Acknowledgments

Conflicts of Interest

Appendix A: Asymptotic Results

Open Research

Open Research Badges

Data Availability Statement

Supporting Information

References

