Title: | Recursive Partitioning for Longitudinal Data and Right Censored Data Using Baseline Covariates |
---|---|
Description: | Constructs trees for continuous longitudinal data and survival data using baseline covariates as partitioning variables, according to the 'LongCART' and 'SurvCART' algorithms, respectively. Also includes functions to calculate conditional power and predictive power of success based on interim results, and probability of success for a prospective trial. |
Authors: | Madan G Kundu |
Maintainer: | Madan G Kundu <[email protected]> |
License: | GPL (>= 2) |
Version: | 3.2 |
Built: | 2024-11-13 04:13:55 UTC |
Source: | https://github.com/cran/LongCART |
ACTG 175 was a randomized clinical trial to compare monotherapy with zidovudine or didanosine with combination therapy with zidovudine and didanosine or zidovudine and zalcitabine in adults infected with the human immunodeficiency virus type I whose CD4 T cell counts were between 200 and 500 per cubic millimeter.
data(ACTG175)
A data frame with 6417 observations from 2139 patients on the following 24 variables.
patient ID number
age in years at baseline
weight in kg at baseline
hemophilia (0=no, 1=yes)
homosexual activity (0=no, 1=yes)
history of intravenous drug use (0=no, 1=yes)
Karnofsky score (on a scale of 0-100)
non-zidovudine antiretroviral therapy prior to initiation of study treatment (0=no, 1=yes)
zidovudine use in the 30 days prior to treatment initiation (0=no, 1=yes)
zidovudine use prior to treatment initiation (0=no, 1=yes)
number of days of previously received antiretroviral therapy
race (0=white, 1=non-white)
gender (0=female, 1=male)
antiretroviral history (0=naive, 1=experienced)
antiretroviral history stratification (1:antiretroviral naive, 2:greater than 1 but less than 52 weeks of prior antiretroviral therapy, 3: greater than 52 weeks)
symptomatic indicator (0=asymptomatic, 1=symptomatic)
treatment indicator (0=zidovudine only, 1=other therapies)
indicator of off-treatment before 96 weeks (0=no,1=yes)
missing CD4 T cell count at 96 weeks (0=missing, 1=observed)
indicator of observing the event in days
number of days until the first occurrence of: (i) a decline in CD4 T cell count of at least 50 (ii) an event indicating progression to AIDS, or (iii) death.
treatment arm (0=zidovudine, 1=zidovudine and didanosine, 2=zidovudine and zalcitabine, 3=didanosine)
time in weeks
CD4 T cell count
Hammer, S.M., et al. (1996), A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. New England Journal of Medicine, 335:1081-1090.
A data frame containing the observations from the GBSG2 study.
data(GBSG2)
A data frame with 686 observations on the following 10 variables.
horTh
hormonal therapy, a factor with levels no and yes
age
age in years
menostat
menopausal status, a factor with levels Pre and Post
tsize
tumor size (in mm)
tgrade
an ordered factor with levels I < II < III
pnodes
number of positive nodes
progrec
progesterone receptor (in fmol).
estrec
estrogen receptor (in fmol).
time
recurrence free survival time (in days).
cens
censoring indicator (0- censored, 1- event).
Schumacher M, Bastert G, Bojar H, Huebner K, Olschewski M, Sauerbrei W, Schmoor C, Beyerle C, Neumann RL, Rauschecker HF. Randomized 2 x 2 trial evaluating hormonal treatment and the duration of chemotherapy in node-positive breast cancer patients. German Breast Cancer Study Group. Journal of Clinical Oncology. 1994 Oct;12(10):2086-93.
data(GBSG2)
Generates KM plot for sub-groups (i.e., terminal nodes) associated with survival tree generated by SurvCART()
KMPlot(x, type = 1, overlay=TRUE, conf.type="log-log", mfrow=NULL, ...)
x |
a fitted object of class |
type |
1 for KM plot of survival probabilities, 2 for KM plot of censoring probabilities |
overlay |
Logical inputs ( |
conf.type |
One of |
mfrow |
Desired frame for fitting multiple plots. Default option is to include plots for all subgroups in the same frame. This input is ignored when |
... |
arguments to be passed to or from other methods. |
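The two plot types can be illustrated without a fitted tree. The sketch below uses the survival package directly on hypothetical toy data: the survival curve treats events as events, while the censoring curve treats censored observations as the "events" (the reverse Kaplan-Meier approach). Whether KMPlot() computes type = 2 exactly this way internally is an assumption; the data frame and object names are made up for illustration.

```r
library(survival)  # recommended package shipped with R

# Hypothetical toy data: follow-up time in days, cens = 1 for event, 0 for censored
d <- data.frame(
  time = c(120, 340, 365, 410, 560, 720, 800, 950, 1100, 1300),
  cens = c(1, 0, 1, 1, 0, 1, 0, 0, 1, 0)
)

# KM estimate of the time-to-event distribution
km_event <- survfit(Surv(time, cens) ~ 1, data = d, conf.type = "log-log")

# KM estimate of the censoring distribution: censored observations become the "events"
km_censor <- survfit(Surv(time, 1 - cens) ~ 1, data = d, conf.type = "log-log")

# Overlay both curves on one plot (solid: survival, dashed: censoring)
plot(km_event, conf.int = FALSE, xlab = "Days", ylab = "Probability")
lines(km_censor, conf.int = FALSE, lty = 2)
```

With a fitted SurvCART object, the same pair of curves would be drawn per terminal node rather than for the whole sample.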
Madan Gopal Kundu [email protected]
Kundu, M. G., and Ghosh, S. (2021). Survival trees based on heterogeneity in time-to-event and censoring distributions using parameter instability test. Statistical Analysis and Data Mining: The ASA Data Science Journal, 14(5), 466-483.
text, plot, SurvCART, StabCat.surv, StabCont.surv
#--- Get the data
data(GBSG2)

#numeric coding of character variables
GBSG2$horTh1<- as.numeric(GBSG2$horTh)
GBSG2$tgrade1<- as.numeric(GBSG2$tgrade)
GBSG2$menostat1<- as.numeric(GBSG2$menostat)

#Add subject id
GBSG2$subjid<- 1:nrow(GBSG2)

#--- Run SurvCART() with time-to-event distribution: exponential, censoring distribution: None
out<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1', 'pnodes', 'progrec', 'estrec'),
        tgvars=c(0,1,0,1,0,1, 1,1),
        event.ind=1, alpha=0.05, minsplit=80, minbucket=40, print=TRUE)

#--- Plot tree
par(xpd = TRUE)
plot(out, compress = TRUE)
text(out, use.n = TRUE)

#Plot KM plot of survival probabilities for sub-groups identified by tree
KMPlot(out, xscale=365.25, type=1)
KMPlot(out, xscale=365.25, type=1, overlay=FALSE, mfrow=c(2,2), xlab="Year", ylab="Survival prob.")

#Plot KM plot of censoring probabilities for sub-groups identified by tree
KMPlot(out, xscale=365.25, type=2)
KMPlot(out, xscale=365.25, type=2, overlay=FALSE, mfrow=c(2,2), xlab="Year", ylab="Censoring prob.")
Recursive partitioning for linear mixed effects models with continuous univariate response variables, per the LongCART algorithm, based on baseline partitioning variables (Kundu and Harezlak, 2019).
LongCART(data, patid, fixed, gvars, tgvars, minsplit=40, minbucket=20, alpha=0.05, coef.digits=2, print.lme=FALSE)
data |
name of the dataset. It must contain variable specified for |
patid |
name of the subject id variable. |
fixed |
a two-sided linear formula object describing the fixed-effects part of the model, with the response on the left of a |
gvars |
list of partitioning variables of interest. Value of these variables should not change over time. Regarding categorical variables, only numerically coded categorical variables should be specified. For nominal categorical variables or factors, please first create corresponding dummy variable(s) and then pass through |
tgvars |
types (categorical or continuous) of partitioning variables specified in |
minsplit |
the minimum number of observations that must exist in a node in order for a split to be attempted. |
minbucket |
the minimum number of observations in any terminal node. |
alpha |
alpha (i.e., nominal type I error) level for parameter instability test |
coef.digits |
decimal points for displaying coefficients in the tree structure. |
print.lme |
if |
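The dummy-variable step mentioned for gvars can be sketched in base R. The data frame and the factor `region` below are hypothetical. The coding of tgvars (0 for categorical, 1 for continuous) is inferred from the package examples, where gender is flagged 0 and wtkg is flagged 1, and should be treated as an assumption.

```r
# Hypothetical data: one row per subject, with a nominal baseline factor
d <- data.frame(
  id     = 1:6,
  region = factor(c("north", "south", "east", "north", "east", "south"))
)

# One 0/1 indicator column per level (no intercept, so no level is dropped)
dummies <- model.matrix(~ region - 1, data = d)
d <- cbind(d, dummies)

# Names to pass as partitioning variables, each flagged as categorical.
# Assumption: 0 = categorical, 1 = continuous, as the package examples suggest.
gvars  <- colnames(dummies)
tgvars <- rep(0, length(gvars))
```

For a two-level factor, a single numeric recoding (as done with as.numeric() in the package examples) carries the same information as the pair of indicator columns.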
Constructs a regression tree based on heterogeneity in linear mixed effects models of the following type:
$$Y_i(t) = W_i(t)\theta + b_i + \epsilon_{it}$$
where $W_i(t)$ is the design matrix, $\theta$ is the parameter associated with $W_i(t)$, and $b_i$ is the random intercept. Also, $\epsilon_{it} \sim N(0, \sigma^2)$ and $b_i \sim N(0, \sigma_u^2)$.
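As a rough illustration of this model class, the sketch below simulates data from a random-intercept model and fits it with nlme::lme(). This corresponds to a single model without any tree structure; the simulated values, variable names and the choice of maximum likelihood are assumptions made for the example, not a description of what LongCART() does internally.

```r
library(nlme)  # recommended package shipped with R

set.seed(1)
n_subj <- 50
times  <- c(0, 1, 2, 3)

# Simulate Y_i(t) = 10 + 2*t + b_i + e_it with b_i ~ N(0, 1.5^2) and e_it ~ N(0, 1)
d <- expand.grid(time = times, id = seq_len(n_subj))
b <- rnorm(n_subj, mean = 0, sd = 1.5)
d$y <- 10 + 2 * d$time + b[d$id] + rnorm(nrow(d), mean = 0, sd = 1)

# Random-intercept model: fixed part y ~ time, random part ~ 1 | id
fit <- lme(fixed = y ~ time, random = ~ 1 | id, data = d, method = "ML")

fixef(fit)    # estimates of theta (intercept and slope)
VarCorr(fit)  # estimates of sigma_u^2 (intercept) and sigma^2 (residual)
```

Here the design matrix W_i(t) has the columns 1 and t, so theta holds the population intercept and slope.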
Treeout |
contains summary information of tree fitting for each terminal nodes and non-terminal nodes. Columns of |
p |
number of fixed parameters |
AIC.tree |
AIC of the tree-structured model |
AIC.root |
AIC at the root node (i.e., without tree structure) |
improve.AIC |
improvement in AIC due to tree structure (AIC.tree - AIC.root) |
logLik.tree |
log-likelihood of the tree-structured model |
logLik.root |
log-likelihood at the root node (i.e., without tree structure) |
Deviance |
2*(logLik.tree-logLik.root) |
LRT.df |
degrees of freedom for likelihood ratio test comparing tree-structured model with the model at root node. |
LRT.p |
p-value for likelihood ratio test comparing tree-structured model with the model at root node. |
nodelab |
List of subgroups or terminal nodes with their description |
varnam |
List of splitting variables |
data |
the dataset originally supplied |
patid |
the patid variable originally supplied |
fixed |
the fixed part of the model originally supplied |
frame |
rpart compatible object |
splits |
rpart compatible object |
cptable |
rpart compatible object |
functions |
rpart compatible object |
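The likelihood-ratio quantities listed among the Treeout columns relate to each other in a simple way. The sketch below recomputes Deviance and a chi-square p-value from two log-likelihoods. All numbers are hypothetical, and the degrees of freedom are supplied by hand because the way LongCART() derives LRT.df is not described here.

```r
# Hypothetical values, for illustration only
logLik.root <- -1250.4   # model at the root node (no tree structure)
logLik.tree <- -1231.9   # tree-structured model
LRT.df      <- 4         # assumed number of extra parameters in the tree-structured model

# Deviance as defined above: 2*(logLik.tree - logLik.root)
Deviance <- 2 * (logLik.tree - logLik.root)

# Upper-tail chi-square probability for the likelihood ratio test
LRT.p <- pchisq(Deviance, df = LRT.df, lower.tail = FALSE)
```

A small LRT.p indicates that the tree-structured model fits better than the single root-node model.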
Madan Gopal Kundu [email protected]
Kundu, M. G., and Harezlak, J. (2019). Regression trees for longitudinal data with baseline covariates. Biostatistics & Epidemiology, 3(1):1-22.
plot, text, ProfilePlot, StabCat, StabCont, predict
#--- Get the data
data(ACTG175)

#-----------------------------------------------#
#   model: cd4~ time + subject(random)          #
#-----------------------------------------------#

#--- Run LongCART()
gvars=c("gender", "wtkg", "hemo", "homo", "drugs",
        "karnof", "oprior", "z30", "zprior", "race",
        "str2", "symptom", "treat", "offtrt")
tgvars=c(0, 1, 0, 0, 0,
         1, 0, 0, 0, 0,
         0, 0, 0, 0)

out1<- LongCART(data=ACTG175, patid="pidnum", fixed=cd4~time,
                gvars=gvars, tgvars=tgvars, alpha=0.05,
                minsplit=100, minbucket=50, coef.digits=2)

#--- Plot tree
par(mfrow=c(1,1))
par(xpd = TRUE)
plot(out1, compress = TRUE)
text(out1, use.n = TRUE)

#--- Plot longitudinal profiles of subgroups
ProfilePlot(x=out1, timevar="time")

#-----------------------------------------------#
#   model: cd4~ time+ time^2 + subject(random)  #
#-----------------------------------------------#

ACTG175$time2<- ACTG175$time^2

out2<- LongCART(data=ACTG175, patid="pidnum", fixed=cd4~time + time2,
                gvars=gvars, tgvars=tgvars, alpha=0.05,
                minsplit=100, minbucket=50, coef.digits=2)

par(mfrow=c(1,1))
par(xpd = TRUE)
plot(out2, compress = TRUE)
text(out2, use.n = TRUE)

ProfilePlot(x=out2, timevar="time", timevar.power=c(1,2))

#--------------------------------------------------------#
#   model: cd4~ time+ time^2 + subject(random) + karnof  #
#--------------------------------------------------------#

out3<- LongCART(data=ACTG175, patid="pidnum", fixed=cd4~time + time2 + karnof,
                gvars=gvars, tgvars=tgvars, alpha=0.05,
                minsplit=100, minbucket=50, coef.digits=2)

par(mfrow=c(1,1))
par(xpd = TRUE)
plot(out3, compress = TRUE)
text(out3, use.n = TRUE)

#the value of the covariate karnof is set at median by default
ProfilePlot(x=out3, timevar="time", timevar.power=c(1,2, NA))

#the value of the covariate karnof is set at 120
ProfilePlot(x=out3, timevar="time", timevar.power=c(1,2, NA),
            covariate.val=c(NA, NA, 120))
Plots a SurvCART or LongCART object on the current graphics device.
## S3 method for class 'SurvCART'
plot(x, uniform = FALSE, branch = 1, compress = FALSE, nspace = branch,
     margin = 0, minbranch = 0.3, ...)

## S3 method for class 'LongCART'
plot(x, uniform = FALSE, branch = 1, compress = FALSE, nspace = branch,
     margin = 0, minbranch = 0.3, ...)
x |
a fitted object of class |
uniform |
similar to |
branch |
similar to |
compress |
similar to |
nspace |
similar to |
margin |
similar to |
minbranch |
similar to |
... |
arguments to be passed to or from other methods. |
This function is a method for the generic function plot, for objects of class SurvCART. The y-coordinate of the top node of the tree will always be 1.
The coordinates of the nodes are returned as a list, with components x and y.
Madan Gopal Kundu [email protected]
Kundu, M. G., and Harezlak, J. (2019). Regression trees for longitudinal data with baseline covariates. Biostatistics & Epidemiology, 3(1):1-22.
Kundu, M. G., and Ghosh, S. (2021). Survival trees based on heterogeneity in time-to-event and censoring distributions using parameter instability test. Statistical Analysis and Data Mining: The ASA Data Science Journal, 14(5), 466-483.
#--- Get the data
data(GBSG2)

#numeric coding of character variables
GBSG2$horTh1<- as.numeric(GBSG2$horTh)
GBSG2$tgrade1<- as.numeric(GBSG2$tgrade)
GBSG2$menostat1<- as.numeric(GBSG2$menostat)

#Add subject id
GBSG2$subjid<- 1:nrow(GBSG2)

#--- Run SurvCART()
out<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time", event.ind=1,
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1', 'pnodes', 'progrec', 'estrec'),
        tgvars=c(0,1,0,1,0,1, 1,1),
        alpha=0.05, minsplit=80, minbucket=40, print=TRUE)

#--- Plot tree
par(xpd = TRUE)
plot(out, compress = TRUE)
text(out, use.n = TRUE)
This function can be used to determine probability of trial success and clinical success based on the prior distribution for each of continuous, binary and time-to-event endpoints. The calculation is carried out assuming normal distribution for estimated parameter and normal prior distribution.
PoS(type, nsamples, null.value = NULL, alternative = "greater",
    N = NULL, D = NULL, a = 1, succ.crit = "trial",
    Z.crit.final = 1.96, alpha.final = 0.025, clin.succ.threshold = NULL,
    se.exp = NULL, meandiff.prior = NULL, mean.prior = NULL, sd.prior = NULL,
    propdiff.prior = NULL, prop.prior = NULL, hr.prior = NULL, D.prior = NULL)
type |
Type of the endpoint. It could be |
nsamples |
Number of samples. For continuous and binary case, it can be 1 or 2. For survival endpoint, it can be only 2. |
null.value |
The specified value under null hypothesis. Default is 0 for continuous and binomial case and 1 for survival case. |
alternative |
Direction of alternate hypothesis. Can be "greater" or "less". Default is "less" for test of HR and "greater" otherwise. |
N |
Total sample size at final analysis. Cannot be missing for continuous and binary endpoint. |
D |
Total number of events at final analysis. Cannot be missing for survival endpoint. |
a |
Allocation ratio in two sample case. |
succ.crit |
Specify "trial" for trial success (i.e., null hypothesis is rejected at final analysis) or "clinical" for clinical success (i.e., estimated value at the final analysis is greater than clinically meaningful value as specified under |
Z.crit.final |
The rejection boundary at final analysis in Z-value scale. Either |
alpha.final |
The rejection boundary at final analysis in alpha (1-sided) scale (e.g., 0.025). Either |
clin.succ.threshold |
Clinically meaningful value. Required when |
se.exp |
Expected standard error to be observed in the study. Must be specified in continuous case and two-sample binary case. |
meandiff.prior |
Mean value of prior distribution for mean difference. Relevant for two-sample continuous case. |
mean.prior |
Mean value of prior distribution for mean. Relevant for one-sample continuous case. |
sd.prior |
Standard deviation of prior distribution for mean difference (2-sample continuous case) or mean (1-sample continuous case) or difference of proportion (2-sample binary case) or proportion (1-sample binary case) or log(HR) (2-sample survival case). |
propdiff.prior |
Mean value of prior distribution for difference in proportion. Relevant for two-sample binomial case. |
prop.prior |
Mean value of prior distribution for proportion. Relevant for one-sample binomial case. |
hr.prior |
Mean value of prior distribution for hazards ratio (HR). Relevant for two-sample survival case. |
D.prior |
Ignored if |
This function can be used to determine the probability of success (PoS) for a prospective trial for each of continuous (one-sample or two-sample), binary (one-sample or two-sample) and time-to-event (two-sample) endpoints. The PoS is calculated based on the prior distribution and the expected standard error of the estimate in the trial. The PoS calculation is carried out assuming a normal distribution for the estimated parameter and a normal prior distribution. This function can be used to determine clinical success (succ.crit="clinical") and trial success (succ.crit="trial"). For clinical success, clin.succ.threshold must be specified. For trial success, Z.crit.final or alpha.final must be specified.
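Under the two normality assumptions stated above, the probability of trial success has a closed form: the estimate is marginally normal with the prior mean and variance equal to the prior variance plus the squared standard error. The sketch below writes this out for a "greater" alternative and checks it by simulation. The helper `pos_trial_normal()` and its inputs are hypothetical and are not part of the package; it is not claimed to reproduce the output of PoS().

```r
# Hypothetical helper, not part of LongCART: closed-form PoS for a "greater" alternative
pos_trial_normal <- function(prior.mean, prior.sd, se, null.value = 0, z.crit = 1.96) {
  # Trial success means (estimate - null.value) / se > z.crit
  cutoff <- null.value + z.crit * se
  # Prior predictive distribution of the estimate: N(prior.mean, prior.sd^2 + se^2)
  pnorm(cutoff, mean = prior.mean, sd = sqrt(prior.sd^2 + se^2), lower.tail = FALSE)
}

# Illustrative inputs
pos <- pos_trial_normal(prior.mean = 0.20, prior.sd = 0.10, se = 0.08)

# Monte Carlo check of the same quantity
set.seed(42)
theta  <- rnorm(1e5, mean = 0.20, sd = 0.10)   # true effect drawn from the prior
est    <- rnorm(1e5, mean = theta, sd = 0.08)  # estimate drawn given the true effect
pos_mc <- mean(est / 0.08 > 1.96)
```

Setting prior.sd to 0 reduces the expression to the usual frequentist power at the prior mean, which is one way to see how PoS averages power over the prior.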
Madan Gopal Kundu <[email protected]>
Kundu, M. G., Samanta, S., and Mondal, S. (2021). An introduction to the determination of the probability of a successful trial: Frequentist and Bayesian approaches. arXiv preprint arXiv:2102.13550.
succ_ia_betabinom_one, succ_ia_betabinom_two, succ_ia
#--- Example 1
PoS(type="cont", nsamples=2, null.value=-0.05, alternative="greater",
    N=1552, a=1, succ.crit="trial", Z.crit.final=1.97,
    se.exp=0.12*sqrt(1/776 + 1/776), meandiff.prior=0, sd.prior=0.02)

#--- Example 2
PoS(type="bin", nsamples=2, null.value=0, alternative="greater",
    N=210, a=2, succ.crit="trial", Z.crit.final=2.012,
    se.exp=0.5*sqrt(1/140 + 1/70), propdiff.prior=0.20, sd.prior=sqrt(0.06))

PoS(type="bin", nsamples=2, null.value=0, alternative="greater",
    N=210, a=2, succ.crit="clinical", clin.succ.threshold =0.15,
    se.exp=0.5*sqrt(1/140 + 1/70), propdiff.prior=0.20, sd.prior=sqrt(0.06))

#--- Example 4
PoS(type="surv", nsamples=2, null.value=1, alternative="less",
    D=441, succ.crit="trial", Z.crit.final=1.96,
    hr.prior=0.71, D.prior=133)

PoS(type="surv", nsamples=2, null.value=1, alternative="less",
    D=441, succ.crit="clinical", clin.succ.threshold =0.8,
    hr.prior=0.71, D.prior=133)
Predicts according to the fitted SurvCART or LongCART tree.
## S3 method for class 'SurvCART'
predict(object, newdata, ...)

## S3 method for class 'LongCART'
predict(object, newdata, patid, ...)
object |
a fitted object of class |
newdata |
The dataset for prediction. |
patid |
Variable name containing patient id in the new dataset. Required for prediction based on a LongCART object. |
... |
Please disregard. |
For prediction based on the "SurvCART" algorithm, the predicted dataset includes the terminal node id the observation belongs to, and the median event and censoring times of that terminal node.
For prediction based on the "LongCART" algorithm, the predicted dataset includes the terminal node id the observation belongs to, the fitted profile, and the predicted value based on the fitted profile. Note that the predicted value does not consider the random effects.
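The distinction between predictions with and without random effects can be seen on a plain mixed model. The sketch below fits a random-intercept model with nlme::lme() on simulated data and compares population-level predictions (fixed effects only) with subject-level predictions. The data and object names are hypothetical, and this illustrates the concept only, not the internals of predict() for LongCART objects.

```r
library(nlme)  # recommended package shipped with R

set.seed(2)
d <- expand.grid(time = 0:3, id = 1:40)
d$y <- 5 + 1.5 * d$time + rnorm(40, sd = 2)[d$id] + rnorm(nrow(d))

fit <- lme(fixed = y ~ time, random = ~ 1 | id, data = d)

# Population-level prediction: fixed effects only, random intercepts ignored
pred_pop <- as.numeric(predict(fit, newdata = d, level = 0))

# The same values computed by hand from the fitted profile intercept + slope * time
theta <- fixef(fit)
pred_manual <- as.numeric(theta["(Intercept)"] + theta["time"] * d$time)

# Subject-level prediction, which adds the estimated random intercepts
pred_subj <- as.numeric(predict(fit, newdata = d, level = 1))
```

The population-level values depend only on time, so every subject shares the same predicted profile, whereas the subject-level values are shifted by each subject's estimated intercept.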
For prediction based on the "SurvCART" algorithm, the following variables are added to the new dataset:
node |
Terminal node id the observation belongs to |
median.T |
Median event time of the terminal node id the observation belongs to |
median.C |
Median censoring time of the terminal node id the observation belongs to |
Q1.T |
First quartile for event time of the terminal node id the observation belongs to |
Q1.C |
First quartile for censoring time of the terminal node id the observation belongs to |
Q3.T |
Third quartile for event time of the terminal node id the observation belongs to |
Q3.C |
Third quartile for censoring time of the terminal node id the observation belongs to |
For prediction based on the "LongCART" algorithm, the following variables are added to the new dataset:
node.id |
Terminal node id the observation belongs to |
profile |
The fitted profile of the terminal node id the observation belongs to |
predval |
predicted value based on the fitted profile |
Madan Gopal Kundu [email protected]
Kundu, M. G., and Harezlak, J. (2019). Regression trees for longitudinal data with baseline covariates. Biostatistics & Epidemiology, 3(1):1-22.
Kundu, M. G., and Ghosh, S. (2021). Survival trees based on heterogeneity in time-to-event and censoring distributions using parameter instability test. Statistical Analysis and Data Mining: The ASA Data Science Journal, 14(5), 466-483.
#--- LongCART example
data(ACTG175)

gvars=c("gender", "wtkg", "hemo", "homo", "drugs",
        "karnof", "oprior", "z30", "zprior", "race",
        "str2", "symptom", "treat", "offtrt")
tgvars=c(0, 1, 0, 0, 0,
         1, 0, 0, 0, 0,
         0, 0, 0, 0)

out1<- LongCART(data=ACTG175, patid="pidnum", fixed=cd4~time,
                gvars=gvars, tgvars=tgvars, alpha=0.05,
                minsplit=100, minbucket=50, coef.digits=2)

pred1<- predict.LongCART(object=out1, newdata=ACTG175, patid="pidnum")
head(pred1)

#--- SurvCART example
data(GBSG2)
GBSG2$horTh1<- as.numeric(GBSG2$horTh)
GBSG2$tgrade1<- as.numeric(GBSG2$tgrade)
GBSG2$menostat1<- as.numeric(GBSG2$menostat)
GBSG2$subjid<- 1:nrow(GBSG2)

fit<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1', 'pnodes', 'progrec', 'estrec'),
        tgvars=c(0,1,0,1,0,1, 1,1),
        event.ind=1, alpha=0.05, minsplit=80, minbucket=40, print=TRUE)

pred2<- predict.SurvCART(object=fit, newdata=GBSG2)
head(pred2)
Generates population level longitudinal profile plot for each of sub-groups (i.e., terminal nodes) associated with longitudinal tree generated by LongCART()
ProfilePlot(x, timevar, timevar.power=NULL, covariate.val=NULL, xlab=NULL, ylab=NULL, sg.title=4, mfrow=NULL, ...)
x |
a fitted object of class |
timevar |
Specify the name of the variable containing time information in the dataset that was used to fit the LongCART object |
timevar.power |
Mandatory when the fixed part of the fitted model contains term as time with power not equal to 1. For example, if fixed part of the model is t + sqrtt + cov1, then specify |
covariate.val |
Specify the covariate values for generation of the longitudinal profile plot. In the longitudinal profile plot, only time can vary, and therefore the values of the other covariates are fixed at constant values. This is not needed if the longitudinal model does not contain additional covariate(s). By default, each covariate is set at its median value over all data points (not at the subject level). For example, if the fixed part of the model is t + cov1, then |
xlab |
Optional label for X-axis |
ylab |
Optional label for Y-axis |
sg.title |
1 for sub-groups' title as |
mfrow |
Desired frame for fitting multiple plots. Default option is to include plots for all subgroups in the same frame. This input is ignored when |
... |
Graphical parameters other than |
Madan Gopal Kundu [email protected]
Kundu, M. G., and Harezlak, J. (2019). Regression trees for longitudinal data with baseline covariates. Biostatistics & Epidemiology, 3(1):1-22.
#--- Get the data
data(ACTG175)

#-----------------------------------------------#
#   model: cd4~ time + subject(random)          #
#-----------------------------------------------#

#--- Run LongCART()
gvars=c("gender", "wtkg", "hemo", "homo", "drugs",
        "karnof", "oprior", "z30", "zprior", "race",
        "str2", "symptom", "treat", "offtrt")
tgvars=c(0, 1, 0, 0, 0,
         1, 0, 0, 0, 0,
         0, 0, 0, 0)

out1<- LongCART(data=ACTG175, patid="pidnum", fixed=cd4~time,
                gvars=gvars, tgvars=tgvars, alpha=0.05,
                minsplit=100, minbucket=50, coef.digits=2)

#--- Plot longitudinal profiles of subgroups
ProfilePlot(x=out1, timevar="time")

#-----------------------------------------------#
#   model: cd4~ time+ time^2 + subject(random)  #
#-----------------------------------------------#

ACTG175$time2<- ACTG175$time^2

out2<- LongCART(data=ACTG175, patid="pidnum", fixed=cd4~time + time2,
                gvars=gvars, tgvars=tgvars, alpha=0.05,
                minsplit=100, minbucket=50, coef.digits=2)

ProfilePlot(x=out2, timevar="time", timevar.power=c(1,2))

#--------------------------------------------------------#
#   model: cd4~ time+ time^2 + subject(random) + karnof  #
#--------------------------------------------------------#

out3<- LongCART(data=ACTG175, patid="pidnum", fixed=cd4~time + time2 + karnof,
                gvars=gvars, tgvars=tgvars, alpha=0.05,
                minsplit=100, minbucket=50, coef.digits=2)

#the value of the covariate karnof is set at median by default
ProfilePlot(x=out3, timevar="time", timevar.power=c(1,2, NA))

#the value of the covariate karnof is set at 120
ProfilePlot(x=out3, timevar="time", timevar.power=c(1,2, NA),
            covariate.val=c(NA, NA, 120))
Performs a parameter stability test (Kundu and Harezlak, 2019) with a categorical partitioning variable to determine whether the parameters of a linear mixed effects model remain the same across all distinct values of the given categorical partitioning variable.
StabCat(data, patid, fixed, splitvar)
data |
name of the dataset. It must contain variable specified for |
patid |
name of the subject id variable. |
fixed |
a two-sided linear formula object describing the fixed-effects part of the model, with the response on the left of a |
splitvar |
the categorical partitioning variable of interest. Its value should not change over time. |
The test is based on a linear mixed effects model of the following type:
$Y_i(t) = W_i(t)\theta + b_i + \epsilon_{it}$
where $W_i(t)$ is the design matrix, $\theta$ is the parameter associated with $W_i(t)$ and $b_i$ is the random intercept. Also, $\epsilon_{it} \sim N(0, \sigma^2)$ and $b_i \sim N(0, \sigma_u^2)$. Let $X$ be the baseline categorical partitioning variable of interest; its value should not change over time. StabCat() performs the following omnibus test:
$H_0: \theta_{(g)} = \theta_0$ vs. $H_1: \theta_{(g)} \neq \theta_0$, for all $g$,
where $\theta_{(g)}$ is the true value of $\theta$ for subjects with $X = C_g$, and $C_g$ is any value realized by $X$.
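To make the model concrete, the following sketch simulates data from a random-intercept model in which the slope differs between the two levels of a baseline categorical variable, which is the kind of heterogeneity the omnibus test is designed to detect. All names and parameter values (`sim`, `theta`, `sigma_u`, and so on) are hypothetical, and the per-group least-squares fit is only a descriptive check, not the instability test performed by StabCat().

```r
# Simulate Y_i(t) = W_i(t) %*% theta + b_i + eps_it for two levels of a
# baseline categorical variable X (all names and values are hypothetical).
set.seed(1)
n_subj  <- 40
times   <- c(0, 1, 2, 3)
X       <- rep(c(0, 1), each = n_subj / 2)            # constant over time
theta   <- list("0" = c(350, -5), "1" = c(350, 10))   # (intercept, slope) per level
sigma_u <- 30                                         # SD of random intercept b_i
sigma   <- 20                                         # SD of residual eps_it

sim <- do.call(rbind, lapply(seq_len(n_subj), function(i) {
  W   <- cbind(1, times)                              # design matrix W_i(t)
  b_i <- rnorm(1, mean = 0, sd = sigma_u)             # subject-specific intercept
  y   <- as.vector(W %*% theta[[as.character(X[i])]]) +
         b_i + rnorm(length(times), mean = 0, sd = sigma)
  data.frame(id = i, time = times, X = X[i], y = y)
}))

# Descriptive check: least-squares slope within each level of X.
# Under H_0 both would estimate the same slope.
slopes <- sapply(split(sim, sim$X),
                 function(d) coef(lm(y ~ time, data = d))[["time"]])
print(slopes)
```

A data frame shaped like `sim` (one row per subject and time point, with a time-invariant `X`) mirrors the long format used by the ACTG175 examples on this page.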
p |
It returns the p-value for parameter instability test |
Madan Gopal Kundu [email protected]
Kundu, M. G., and Harezlak, J. (2019). Regression trees for longitudinal data with baseline covariates. Biostatistics & Epidemiology, 3(1):1-22.
StabCont, LongCART
#--- Get the data
data(ACTG175)

#--- Run StabCat()
out<- StabCat(data=ACTG175, patid="pidnum", fixed=cd4~time, splitvar="gender")
out$pval
Performs the parameter stability test (Kundu, 2020) with a categorical partitioning variable to determine whether the parameters of the exponential time-to-event distribution and the exponential censoring distribution remain the same across all distinct values of the given categorical partitioning variable.
StabCat.surv(data, timevar, censorvar, splitvar, time.dist="exponential", cens.dist="NA", event.ind=1, print=FALSE)
data |
name of the dataset. It must contain variable specified for |
timevar |
name of the variable with follow-up times. |
censorvar |
name of the variable with censoring status. |
time.dist |
name of time-to-event distribution. It can be one of the following distributions: |
cens.dist |
name of censoring distribution. It can be one of the following distributions: |
event.ind |
value of the censoring variable indicating event. |
splitvar |
the categorical partitioning variable of interest. Its value should not change over time. |
print |
if |
StabCat.surv() performs the following omnibus test:
$H_0: \theta_{(g)} = \theta_0$ vs. $H_1: \theta_{(g)} \neq \theta_0$, for all $g$,
where $\theta_{(g)}$ is the true value of $\theta$ for subjects with $X = C_g$. $\theta$ includes all the parameters of the time-to-event distribution and also the parameters of the censoring distribution, if specified. $C_g$ is any value realized by the categorical partitioning variable $X$.
Exponential distribution: f(t)=lambda*exp(-lambda*t)
Weibull distribution: f(t)=alpha*lambda*t^(alpha-1)*exp(-lambda*t^alpha)
Lognormal distribution: f(t)=(1/t)*(1/sqrt(2*pi*sigma^2))*exp[-(1/2)*(log(t)-mu)^2/sigma^2]
Normal distribution: f(t)=(1/sqrt(2*pi*sigma^2))*exp[-(1/2)*(t-mu)^2/sigma^2]
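As a cross-check of these parameterizations, the sketch below writes the four densities as plain R functions, with the usual squared term (log(t)-mu)^2 or (t-mu)^2 in the exponent, and compares them with the base R density functions. The function names `f_exp`, `f_weibull`, `f_lnorm` and `f_norm` are hypothetical helpers, not part of the package. The one non-obvious mapping is for the Weibull: `dweibull()` uses a scale parameter, which corresponds to lambda^(-1/alpha) here.

```r
# Densities in the parameterization used on this page (helper names are hypothetical)
f_exp     <- function(t, lambda) lambda * exp(-lambda * t)
f_weibull <- function(t, alpha, lambda) {
  alpha * lambda * t^(alpha - 1) * exp(-lambda * t^alpha)
}
f_lnorm   <- function(t, mu, sigma) {
  (1 / t) * (1 / sqrt(2 * pi * sigma^2)) * exp(-0.5 * (log(t) - mu)^2 / sigma^2)
}
f_norm    <- function(t, mu, sigma) {
  (1 / sqrt(2 * pi * sigma^2)) * exp(-0.5 * (t - mu)^2 / sigma^2)
}

t <- c(0.5, 1, 2, 5)
alpha <- 1.5; lambda <- 0.3

# Compare with base R; dweibull() takes shape = alpha, scale = lambda^(-1/alpha)
check <- c(
  exponential = isTRUE(all.equal(f_exp(t, lambda), dexp(t, rate = lambda))),
  weibull     = isTRUE(all.equal(f_weibull(t, alpha, lambda),
                                 dweibull(t, shape = alpha, scale = lambda^(-1 / alpha)))),
  lognormal   = isTRUE(all.equal(f_lnorm(t, 1, 0.8), dlnorm(t, meanlog = 1, sdlog = 0.8))),
  normal      = isTRUE(all.equal(f_norm(t, 1, 0.8), dnorm(t, mean = 1, sd = 0.8)))
)
print(check)
```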
pval |
p-value for parameter instability test |
type |
1, if event times are more heterogeneous; 2, if censoring times are more heterogeneous. |
Madan Gopal Kundu [email protected]
Kundu, M. G., and Ghosh, S. (2021). Survival trees based on heterogeneity in time-to-event and censoring distributions using parameter instability test. Statistical Analysis and Data Mining: The ASA Data Science Journal, 14(5), 466-483.
StabCont.surv, SurvCART, plot, text
#--- time-to-event distribution: exponential, censoring distribution: None
out1<- StabCat.surv(data=lung, timevar="time", censorvar="status",
                    splitvar="sex", event.ind=2)
out1$pval

#--- time-to-event distribution: weibull, censoring distribution: None
StabCat.surv(data=lung, timevar="time", censorvar="status",
             splitvar="sex", time.dist="weibull", event.ind=2)

#--- time-to-event distribution: weibull, censoring distribution: exponential
StabCat.surv(data=lung, timevar="time", censorvar="status",
             splitvar="sex", time.dist="weibull", cens.dist="exponential",
             event.ind=2)
Performs the parameter stability test (Kundu and Harezlak, 2019) with a continuous partitioning variable to determine whether the parameters of a linear mixed effects model remain the same across all distinct values of the given continuous partitioning variable.
StabCont(data, patid, fixed, splitvar)
data |
name of the dataset. It must contain variable specified for |
patid |
name of the subject id variable. |
fixed |
a two-sided linear formula object describing the fixed-effects part of the model, with the response on the left of a |
splitvar |
the continuous partitioning variable of interest. Its value should not change over time. |
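The test is based on a linear mixed effects model of the following type:
$Y_i(t) = W_i(t)\theta + b_i + \epsilon_{it}$
where $W_i(t)$ is the design matrix, $\theta$ is the parameter associated with $W_i(t)$ and $b_i$ is the random intercept. Also, $\epsilon_{it} \sim N(0, \sigma^2)$ and $b_i \sim N(0, \sigma_u^2)$. Let $X$ be the baseline continuous partitioning variable of interest; its value should not change over time. StabCont() performs the following omnibus test:
$H_0: \theta_{(g)} = \theta_0$ vs. $H_1: \theta_{(g)} \neq \theta_0$, for all $g$,
where $\theta_{(g)}$ is the true value of $\theta$ for subjects with $X = C_g$, and $C_g$ is any value realized by $X$.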
p |
It returns the p-value for parameter instability test |
Madan Gopal Kundu [email protected]
Kundu, M. G., and Harezlak, J. (2019). Regression trees for longitudinal data with baseline covariates. Biostatistics & Epidemiology, 3(1):1-22.
StabCont, LongCART, plot, text
#--- Get the data
data(ACTG175)

#--- Run StabCont()
out<- StabCont(data=ACTG175, patid="pidnum", fixed=cd4~time, splitvar="age")
out$pval
Performs the parameter stability test (Kundu, 2020) with a continuous partitioning variable to determine whether the parameters of the exponential time-to-event distribution and the exponential censoring distribution remain the same across all distinct values of the given continuous partitioning variable.
StabCont.surv(data, timevar, censorvar, splitvar, time.dist="exponential", cens.dist="NA", event.ind=1, print=FALSE)
data |
name of the dataset. It must contain variable specified for |
timevar |
name of the variable with follow-up times. |
censorvar |
name of the variable with censoring status. |
time.dist |
name of time-to-event distribution. It can be one of the following distributions: |
cens.dist |
name of censoring distribution. It can be one of the following distributions: |
event.ind |
value of the censoring variable indicating event. |
splitvar |
the continuous partitioning variable of interest. |
print |
if |
StabCont.surv() performs the following omnibus test:
$H_0: \theta_{(g)} = \theta_0$ vs. $H_1: \theta_{(g)} \neq \theta_0$, for all $g$,
where $\theta_{(g)}$ is the true value of $\theta$ for subjects with $X = C_g$. $\theta$ includes all the parameters of the time-to-event distribution and also the parameters of the censoring distribution, if specified. $C_g$ is any value realized by the continuous partitioning variable $X$.
Exponential distribution: f(t)=lambda*exp(-lambda*t)
Weibull distribution: f(t)=alpha*lambda*t^(alpha-1)*exp(-lambda*t^alpha)
Lognormal distribution: f(t)=(1/t)*(1/sqrt(2*pi*sigma^2))*exp[-(1/2)*(log(t)-mu)^2/sigma^2]
Normal distribution: f(t)=(1/sqrt(2*pi*sigma^2))*exp[-(1/2)*(t-mu)^2/sigma^2]
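For intuition about what heterogeneity in the exponential parameter looks like, the sketch below computes the maximum likelihood estimate of lambda under right censoring (number of events divided by total follow-up time) on either side of a cutoff of a continuous covariate. The data frame `toy`, the helper `exp_rate` and the cutoff of 55 are all hypothetical. This is a descriptive illustration only: StabCont.surv() tests stability across all values of the partitioning variable rather than at one chosen cutoff.

```r
# Hypothetical toy data: follow-up time, event indicator (1 = event, 0 = censored)
# and a continuous baseline covariate x
toy <- data.frame(
  time  = c(5, 8, 12, 3, 9, 20, 15, 7, 30, 25),
  event = c(1, 1, 0, 1, 1, 0, 1, 0, 1, 0),
  x     = c(41, 45, 48, 50, 52, 60, 63, 66, 70, 74)
)

# MLE of the exponential rate with right censoring: events / total follow-up time
exp_rate <- function(time, event) sum(event) / sum(time)

cutoff    <- 55   # hypothetical cutoff, for illustration only
rate_low  <- with(toy[toy$x <  cutoff, ], exp_rate(time, event))
rate_high <- with(toy[toy$x >= cutoff, ], exp_rate(time, event))
print(c(low = rate_low, high = rate_high))
```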
pval |
p-value for parameter instability test |
type |
1, if event times are more heterogeneous; 2, if censoring times are more heterogeneous. |
Madan Gopal Kundu [email protected]
Kundu, M. G., and Ghosh, S. (2021). Survival trees based on heterogeneity in time-to-event and censoring distributions using parameter instability test. Statistical Analysis and Data Mining: The ASA Data Science Journal, 14(5), 466-483.
StabCont.surv, SurvCART, plot, text
#--- time-to-event distribution: exponential, censoring distribution: None
out1<- StabCont.surv(data=lung, timevar="time", censorvar="status",
                     splitvar="age", event.ind=2)
out1$pval

#--- time-to-event distribution: weibull, censoring distribution: None
StabCont.surv(data=lung, timevar="time", censorvar="status",
              splitvar="age", time.dist="weibull", event.ind=2)

#--- time-to-event distribution: weibull, censoring distribution: exponential
StabCont.surv(data=lung, timevar="time", censorvar="status",
              splitvar="age", time.dist="weibull", cens.dist="exponential",
              event.ind=2)
This function can be used to determine conditional power and predictive power for trial success and clinical success based on the interim results and prior distribution for each of continuous, binary and time-to-event endpoints. The calculation is carried out assuming normal distribution for estimated parameter and normal prior distribution.
succ_ia(type, nsamples, null.value = NULL, alternative = NULL, N = NULL, n = NULL, D = NULL, d = NULL, a = 1, meandiff.ia = NULL, mean.ia = NULL, propdiff.ia = NULL, prop.ia = NULL, hr.ia = NULL, stderr.ia = NULL, sd.ia = NULL, succ.crit = "trial", Z.crit.final = 1.96, alpha.final = 0.025, clin.succ.threshold = NULL, meandiff.exp = NULL, mean.exp = NULL, propdiff.exp = NULL, prop.exp = NULL, hr.exp = NULL, meandiff.prior = NULL, mean.prior = NULL, sd.prior = NULL, propdiff.prior = NULL, prop.prior = NULL, hr.prior = NULL, D.prior = NULL)
type |
Type of the endpoint. It could be |
nsamples |
Number of samples. For continuous and binary case, it can be 1 or 2. For survival endpoint, it can be only 2. |
null.value |
The specified value under null hypothesis. Default is 0 for continuous and binomial case and 1 for survival case. |
alternative |
Direction of alternate hypothesis. Can be "greater" or "less". Default is "less" for test of HR and "greater" otherwise. |
N |
Total sample size at final analysis. Cannot be missing for continuous and binary endpoint. |
n |
Total sample size at interim analysis. Cannot be missing for continuous and binary endpoint. |
D |
Total number of events at final analysis. Cannot be missing for survival endpoint. |
d |
Total number of events at interim analysis. Cannot be missing for survival endpoint. |
a |
Allocation ratio in two sample case. |
meandiff.ia |
Estimated mean difference at interim analysis. Mandatory for continuous two sample case. |
mean.ia |
Estimated mean value at interim analysis. Mandatory for continuous single sample case |
propdiff.ia |
Estimated difference in proportion at interim analysis. Mandatory for binary two sample case |
prop.ia |
Estimated proportion at interim analysis. Mandatory for binary single sample case |
hr.ia |
Estimated hazard ratio (HR) at interim analysis. Mandatory for survival two sample case. |
stderr.ia |
Standard error (SE) of estimated mean difference (in two-sample continuous case) or estimated mean (in one-sample continuous case) or estimated difference in proportion (in two-sample binary case) at interim analysis. For continuous case, if not specified, then the function attempts to estimate SE from |
sd.ia |
Standard deviation of estimated mean difference (in two-sample continuous case) or estimated mean (in one-sample continuous case) at interim analysis. If |
succ.crit |
Specify "trial" for trial success (i.e., null hypothesis is rejected at final analysis) or "clinical" for clinical success (i.e., estimated value at the final analysis is greater than clinically meaningful value as specified under |
Z.crit.final |
The rejection boundary at final analysis in Z-value scale. Either |
alpha.final |
The rejection boundary at final analysis in alpha (1-sided) scale (e.g., 0.025). Either |
clin.succ.threshold |
Clinically meaningful value. Required when |
meandiff.exp |
Expected mean difference in post interim data. Relevant for two-sample continuous case. |
mean.exp |
Expected mean in post interim data. Relevant for one-sample continuous case. |
propdiff.exp |
Expected difference in proportion in post interim data. Relevant for two-sample binary case. |
prop.exp |
Expected proportion in post interim data. Relevant for one-sample binary case. |
hr.exp |
Expected hazards ratio (HR) in post interim data. Relevant for two-sample survival case. |
meandiff.prior |
Mean value of prior distribution for mean difference. Relevant for two-sample continuous case. |
mean.prior |
Mean value of prior distribution for mean. Relevant for one-sample continuous case. |
sd.prior |
Standard deviation of prior distribution for mean difference (2-sample continuous case) or mean (1-sample continuous case) or difference of proportion (2-sample binary case) or proportion (1-sample binary case) or log(HR) (2-sample survival case). |
propdiff.prior |
Mean value of prior distribution for difference in proportion. Relevant for two-sample binomial case. |
prop.prior |
Mean value of prior distribution for proportion. Relevant for one-sample binomial case. |
hr.prior |
Mean value of prior distribution for hazards ratio (HR). Relevant for two-sample survival case. |
D.prior |
Ignored if |
This function can be used to determine conditional power (CP) and predictive power, or predictive probability of success (PPoS), based on the interim results for each of continuous (one-sample or two-sample), binary (one-sample or two-sample) and time-to-event (two-sample) endpoints. The PPoS can be based on interim results only or on both prior information and interim results. The calculation of CP and PPoS is carried out assuming a normal distribution for the estimated parameter and a normal prior distribution. This function can be used to determine clinical success (succ.crit="clinical") and trial success (succ.crit="trial"). For clinical success, clin.succ.threshold must be specified. For trial success, Z.crit.final or alpha.final must be specified.
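The two ways of stating the final rejection boundary are linked through the standard normal quantile function. The short sketch below assumes a one-sided test, as the description of alpha.final suggests; whether the function performs exactly this conversion internally is an assumption, not something stated on this page.

```r
# One-sided alpha level and the corresponding Z-scale boundary
alpha_final <- 0.025
z_crit      <- qnorm(1 - alpha_final)   # Z-scale boundary implied by alpha_final

# Going the other way: one-sided alpha implied by a Z-scale boundary of 1.96
alpha_back  <- 1 - pnorm(1.96)

print(c(z_crit = z_crit, alpha_back = alpha_back))
```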
In order to calculate CP and PPoS, succ_ia() should be invoked in the following form:
Continuous-two sample case (trial success):
succ_ia(type="cont", nsamples=2, null.value=, alternative=, N=, n=, a=, meandiff.ia=, stderr.ia=, succ.crit="trial", Z.crit.final=)
Continuous-two sample case (clinical success):
succ_ia(type="cont", nsamples=2, null.value=, alternative=, N=, n=, a=, meandiff.ia=, stderr.ia=, succ.crit="clinical", clin.succ.threshold=)
Continuous-one sample case (trial success):
succ_ia(type="cont", nsamples=1, null.value=, alternative=, N=, n=, mean.ia=, stderr.ia=, succ.crit="trial", Z.crit.final=)
Continuous-one sample case (clinical success):
succ_ia(type="cont", nsamples=1, null.value=, alternative=, N=, n=, mean.ia=, stderr.ia=, succ.crit="clinical", clin.succ.threshold=)
Binary-two sample case (trial success):
succ_ia(type="bin", nsamples=2, null.value=, alternative=, N=, n=, a=, propdiff.ia=, stderr.ia=, succ.crit="trial", Z.crit.final=)
Binary-two sample case (clinical success):
succ_ia(type="bin", nsamples=2, null.value=, alternative=, N=, n=, a=, propdiff.ia=, stderr.ia=, succ.crit="clinical", clin.succ.threshold=)
Binary-one sample case (trial success):
succ_ia(type="bin", nsamples=1, null.value=, alternative=, N=, n=, prop.ia=, succ.crit="trial", Z.crit.final=)
Binary-one sample case (clinical success):
succ_ia(type="bin", nsamples=1, null.value=, alternative=, N=, n=, prop.ia=, succ.crit="clinical", clin.succ.threshold=)
Survival-two sample case (trial success):
succ_ia(type="surv", nsamples=2, null.value=, alternative=, D=, d=, a=, hr.ia=, succ.crit="trial", Z.crit.final=)
Survival-two sample case (clinical success):
succ_ia(type="surv", nsamples=2, null.value=, alternative=, D=, d=, a=, hr.ia=, succ.crit="clinical", clin.succ.threshold=)
The conditional power is calculated assuming that the interim trend continues in the post-interim data. If meandiff.exp (for continuous 2-samples case), mean.exp (for continuous 1-sample case), propdiff.exp (for binomial 2-samples case), prop.exp (for binomial 1-sample case), or hr.exp (for survival 2-samples case) is specified, then conditional power is also calculated using the specified value.
The predictive power, or predictive probability of success (PPoS), is calculated based on the interim results. On top of this, it can also incorporate prior information. The prior information can be specified as follows: meandiff.prior, sd.prior for continuous 2-samples case; mean.prior, sd.prior for continuous 1-sample case; propdiff.prior, sd.prior for binomial 2-samples case; prop.prior, sd.prior for binomial 1-sample case; and hr.prior, sd.prior (or hr.prior, D.prior) for survival 2-samples case.
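The way a normal prior feeds into PPoS can be sketched with a conjugate normal-normal update followed by the predictive distribution of the post-interim estimate. This is a minimal illustration, not the package's implementation: `ppos_normal` and its arguments are hypothetical, the null value is assumed to be 0 with alternative "greater", and the optimistic prior (mean 3, SD 1) is made up for the example.

```r
# Predictive probability of trial success under a normal approximation.
# est_ia, se_ia : interim estimate and its standard error
# info_frac     : information fraction at interim (n / N or d / D)
# z_crit        : critical value at the final analysis
# prior_mean, prior_sd : optional normal prior on the effect; NULL = flat prior
ppos_normal <- function(est_ia, se_ia, info_frac, z_crit,
                        prior_mean = NULL, prior_sd = NULL) {
  if (is.null(prior_mean) || is.null(prior_sd)) {
    post_mean <- est_ia                        # flat prior: posterior = likelihood
    post_var  <- se_ia^2
  } else {
    w         <- 1 / prior_sd^2 + 1 / se_ia^2  # posterior precision
    post_mean <- (prior_mean / prior_sd^2 + est_ia / se_ia^2) / w
    post_var  <- 1 / w
  }
  se_final <- se_ia * sqrt(info_frac)                    # SE at final analysis
  se_post  <- se_ia * sqrt(info_frac / (1 - info_frac))  # SE of post-interim estimate
  # Post-interim estimate needed so that the pooled final Z exceeds z_crit
  needed   <- (z_crit * se_final - info_frac * est_ia) / (1 - info_frac)
  1 - pnorm((needed - post_mean) / sqrt(post_var + se_post^2))
}

# Interim results only (flat prior), using the first example call on this page
pp_flat  <- ppos_normal(est_ia = 0, se_ia = 1, info_frac = 45 / 225, z_crit = 1.96)

# Same interim data combined with a hypothetical optimistic prior
pp_prior <- ppos_normal(est_ia = 0, se_ia = 1, info_frac = 45 / 225, z_crit = 1.96,
                        prior_mean = 3, prior_sd = 1)

print(c(flat = pp_flat, with_prior = pp_prior))
```

For a survival endpoint the same update would be applied on the log(HR) scale. Under 1:1 allocation the standard approximation SE(log HR) = 2/sqrt(d) gives one way to turn a number of prior events into a prior standard deviation; whether D.prior is converted exactly this way is an assumption.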
Madan Gopal Kundu <[email protected]>
Kundu, M. G., Samanta, S., and Mondal, S. (2021). An introduction to the determination of the probability of a successful trial: Frequentist and Bayesian approaches. arXiv preprint arXiv:2102.13550.
succ_ia_betabinom_one, succ_ia_betabinom_two, PoS
#--- Lan et al. (2009), see #6. Example, outcome: Matching
succ_ia(type="cont", nsamples=1, null.value=0, alternative="greater",
        N=225, n=45, mean.ia=0, stderr.ia=1,
        succ.crit="trial", Z.crit.final=1.96)

#--- Dallow et al. (2011), see Figure 1. Example, outcome: Matching
succ_ia(type="cont", nsamples=1, null.value=0, alternative="greater",
        N=100, n=50, mean.ia=1.364, stderr.ia=1,
        succ.crit="trial", Z.crit.final=1.64)

#--- Example 1 in the paper (Continuous endpoint)
succ_ia(type="cont", nsamples=2, null.value=-0.05, alternative="greater",
        N=1552, n=776, a=1, meandiff.ia=-0.025, sd.ia=0.16,
        succ.crit="trial", Z.crit.final=1.97,
        meandiff.exp=-0.030, meandiff.prior=0, sd.prior=0.02)

#--- Example 2 in the paper (Binary endpoint)
p1<- 0.379; p2<- 0.222
n1<- 105; n2<- 53

#-- Trial success
succ_ia(type="bin", nsamples=2, null.value=0, alternative="greater",
        N=210, n=158, a=2, propdiff.ia=p1-p2,
        stderr.ia=sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2),
        succ.crit="trial", Z.crit.final=2.012,
        propdiff.exp=0.20, propdiff.prior=0.20, sd.prior=sqrt(0.06))

#-- Clinical success
succ_ia(type="bin", nsamples=2, null.value=0, alternative="greater",
        N=210, n=158, a=2, propdiff.ia=p1-p2,
        stderr.ia=sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2),
        succ.crit="clinical", clin.succ.threshold=0.15,
        propdiff.exp=0.20, propdiff.prior=0.20, sd.prior=sqrt(0.06))

#--- Example 3 in the paper (Survival endpoint)
#--- Trial success
succ_ia(type="surv", nsamples=2, null.value=1, alternative="less",
        D=441, d=346, a=1, hr.ia=0.82,
        succ.crit="trial", Z.crit.final=2.012,
        hr.exp=0.75, hr.prior=0.71, D.prior=133)

#--- clinical success
succ_ia(type="surv", nsamples=2, null.value=1, alternative="less",
        D=441, d=346, a=1, hr.ia=0.82,
        succ.crit="clinical", clin.succ.threshold=0.80,
        hr.exp=0.75, hr.prior=0.71, D.prior=133)
This function can be used to determine predictive power for trial success and clinical success based on the interim results and beta prior distribution for test of population proportion.
succ_ia_betabinom_one(N, n, x, null.value = 0, alternative = "greater", test="z", correct=TRUE, succ.crit = "trial", Z.crit.final = 1.96, alpha.final = 0.025, clin.succ.threshold = NULL, a = 1, b = 1)
N |
Sample size at final analysis. Cannot be missing. |
n |
Sample size at interim analysis. Cannot be missing. |
x |
Number of observed response at interim analysis. Cannot be missing. |
null.value |
The specified value under null hypothesis. Default is 0. |
alternative |
Direction of alternate hypothesis. Can be "greater" or "less". |
test |
Statistical test. Default is |
correct |
A logical indicating whether Yates' continuity correction should be applied where possible. Applies to approximate Z-test only. |
succ.crit |
Specify "trial" for trial success (i.e., null hypothesis is rejected at final analysis) or "clinical" for clinical success (i.e., estimated value at the final analysis is greater than clinically meaningful value as specified under |
Z.crit.final |
The rejection boundary at final analysis in Z-value scale. Either |
alpha.final |
The rejection boundary at final analysis in alpha (1-sided) scale (e.g., 0.025). Either |
clin.succ.threshold |
Clinically meaningful value. Required when |
a |
Value of |
b |
Value of |
This function can be used to determine predictive power, or predictive probability of success (PPoS), based on the interim results for testing of a population proportion. The calculation of PPoS is carried out assuming a beta prior distribution for the proportion. This function can be used to determine clinical success (succ.crit="clinical") and trial success (succ.crit="trial"). For clinical success, clin.succ.threshold must be specified. For trial success, Z.crit.final or alpha.final must be specified.
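The beta-binomial reasoning can be sketched for the clinical success criterion as follows. With a Beta(a, b) prior and x responses among n subjects at interim, the posterior is Beta(a + x, b + n - x), and the number of responses among the remaining N - n subjects follows a beta-binomial distribution. The helpers `dbetabinom` and `ppos_clinical_one` are hypothetical, and the strict inequality used for "greater than the threshold" is an assumption about the success rule, so the result need not agree exactly with succ_ia_betabinom_one().

```r
# Beta-binomial probability mass function, computed on the log scale
dbetabinom <- function(y, m, shape1, shape2) {
  exp(lchoose(m, y) + lbeta(y + shape1, m - y + shape2) - lbeta(shape1, shape2))
}

# Predictive probability that the final observed proportion exceeds a threshold
ppos_clinical_one <- function(N, n, x, threshold, a = 1, b = 1) {
  m   <- N - n                                   # subjects still to be observed
  y   <- 0:m                                     # possible future responses
  pmf <- dbetabinom(y, m, a + x, b + n - x)      # posterior predictive distribution
  sum(pmf[(x + y) / N > threshold])              # strict inequality is an assumption
}

# Inputs borrowed from the clinical success example on this page
ppos <- ppos_clinical_one(N = 40, n = 30, x = 15, threshold = 0.5, a = 1, b = 1)
print(ppos)
```

For trial success the same predictive distribution would be used, with the indicator replaced by "the final test rejects the null hypothesis at the chosen boundary".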
Madan Gopal Kundu <[email protected]>
Kundu, M. G., Samanta, S., and Mondal, S. (2021). An introduction to the determination of the probability of a successful trial: Frequentist and Bayesian approaches. arXiv preprint arXiv:2102.13550.
succ_ia_betabinom_two, succ_ia, PoS
succ_ia_betabinom_one( N=40, n=30, x=25, null.value=0.6, alternative="greater",
                       succ.crit = "trial", alpha.final = 0.016, a = 1, b=1)

succ_ia_betabinom_one( N=40, n=30, x=25, null.value=0.6, alternative="greater",
                       test="exact", succ.crit = "trial", alpha.final = 0.016,
                       a = 1, b=1)

succ_ia_betabinom_one( N=40, n=30, x=15, null.value=0.6, alternative="greater",
                       succ.crit = "clinical", clin.succ.threshold =0.5,
                       a = 1, b=1)
This function can be used to determine predictive power for trial success and clinical success based on the interim results and beta prior distribution for test of difference of two proportions.
succ_ia_betabinom_two(N.trt, N.con, n.trt, x.trt, n.con, x.con, alternative = "greater", test = "z", succ.crit = "trial", Z.crit.final = 1.96, alpha.final = 0.025, clin.succ.threshold = NULL, a.trt = 1, b.trt = 1, a.con = 1, b.con = 1)
N.trt |
Sample size in treatment arm at final analysis. Cannot be missing. |
N.con |
Sample size in control arm at final analysis. Cannot be missing. |
n.trt |
Sample size in treatment arm at interim analysis. Cannot be missing. |
x.trt |
Number of observed response in treatment arm at interim analysis. Cannot be missing. |
n.con |
Sample size in control arm at interim analysis. Cannot be missing. |
x.con |
Number of observed response in control arm at interim analysis. Cannot be missing. |
alternative |
Direction of alternate hypothesis. Can be "greater" or "less". |
test |
Statistical test. Default is |
succ.crit |
Specify "trial" for trial success (i.e., null hypothesis is rejected at final analysis) or "clinical" for clinical success (i.e., estimated value at the final analysis is greater than clinically meaningful value as specified under |
Z.crit.final |
The rejection boundary at final analysis in Z-value scale. Either |
alpha.final |
The rejection boundary at final analysis in alpha (1-sided) scale (e.g., 0.025). Either |
clin.succ.threshold |
Clinically meaningful value. Required when |
a.trt |
Value of |
b.trt |
Value of |
a.con |
Value of |
b.con |
Value of |
This function can be used to determine predictive power, or predictive probability of success (PPoS), based on the interim results for comparison of two proportions. The calculation of PPoS is carried out assuming beta prior distributions for the proportions in both treatment and control arms. This function can be used to determine clinical success (succ.crit="clinical") and trial success (succ.crit="trial"). For clinical success, clin.succ.threshold must be specified. For trial success, Z.crit.final or alpha.final must be specified.
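For two arms, the same beta-binomial argument applies independently to each arm, and the predictive probability is a sum over all pairs of future response counts. The sketch below does this for the clinical success criterion on the difference in final observed proportions. The helpers `dbetabinom` and `ppos_clinical_two` are hypothetical, the strict inequality is an assumption, and the result is not claimed to reproduce the output of succ_ia_betabinom_two().

```r
# Beta-binomial probability mass function, computed on the log scale
dbetabinom <- function(y, m, shape1, shape2) {
  exp(lchoose(m, y) + lbeta(y + shape1, m - y + shape2) - lbeta(shape1, shape2))
}

# Predictive probability that (final proportion in trt) - (final proportion in con)
# exceeds a threshold, with independent beta priors in the two arms
ppos_clinical_two <- function(N.trt, N.con, n.trt, x.trt, n.con, x.con, threshold,
                              a.trt = 1, b.trt = 1, a.con = 1, b.con = 1) {
  y_t <- 0:(N.trt - n.trt)                       # future responses, treatment arm
  y_c <- 0:(N.con - n.con)                       # future responses, control arm
  p_t <- dbetabinom(y_t, N.trt - n.trt, a.trt + x.trt, b.trt + n.trt - x.trt)
  p_c <- dbetabinom(y_c, N.con - n.con, a.con + x.con, b.con + n.con - x.con)
  joint <- outer(p_t, p_c)                       # arms are independent
  diff  <- outer((x.trt + y_t) / N.trt, (x.con + y_c) / N.con, "-")
  sum(joint[diff > threshold])                   # strict inequality is an assumption
}

# Inputs borrowed from one of the clinical success example calls on this page
ppos2 <- ppos_clinical_two(N.trt = 32, N.con = 32, n.trt = 12, x.trt = 8,
                           n.con = 12, x.con = 8, threshold = 0)
print(ppos2)
```

Because the two arms have identical interim data and priors in this call, the result equals half of one minus the probability of a tie, which gives a quick sanity check on the double sum.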
Madan Gopal Kundu <[email protected]>
Kundu, M. G., Samanta, S., and Mondal, S. (2021). An introduction to the determination of the probability of a successful trial: Frequentist and Bayesian approaches. arXiv preprint arXiv:2102.13550.
succ_ia_betabinom_one, succ_ia, PoS
succ_ia_betabinom_two( N.con=40, N.trt=40, n.trt=30, x.trt=20, n.con=30, x.con=15,
                       alternative="greater", test="fisher",
                       succ.crit = "trial", Z.crit.final = 1.96,
                       a.trt = 1, b.trt=1, a.con=1, b.con=1)

succ_ia_betabinom_two( N.con=40, N.trt=40, n.trt=30, x.trt=20, n.con=30, x.con=15,
                       alternative="greater", test="z",
                       succ.crit = "trial", Z.crit.final = 1.96,
                       a.trt = 1, b.trt=1, a.con=1, b.con=1)

succ_ia_betabinom_two( N.con=40, N.trt=40, n.trt=30, x.trt=20, n.con=30, x.con=15,
                       alternative="greater", test="fisher",
                       succ.crit = "clinical", clin.succ.threshold = 0.5,
                       a.trt = 1, b.trt=1, a.con=1, b.con=1)

#--- Johns & Andersen, 1999, Example 1a (results matching)
succ_ia_betabinom_two( N.trt=32, N.con=32, n.trt=12, x.trt=8, n.con=12, x.con=8,
                       alternative="greater", test="fisher",
                       succ.crit = "clinical", clin.succ.threshold = 0,
                       a.trt = 1, b.trt=1, a.con=1, b.con=1)

#--- Johns & Andersen, 1999, Example 1b (results matching)
succ_ia_betabinom_two( N.trt=32, N.con=32, n.trt=12, x.trt=8, n.con=12, x.con=11,
                       alternative="greater", test="fisher",
                       succ.crit = "clinical", clin.succ.threshold = 0,
                       a.trt = 1, b.trt=1, a.con=1, b.con=1)

#--- Johns & Andersen, 1999, Example 2 (not matching, reported 0.586, got 0.536)
succ_ia_betabinom_two( N.trt=155+170, N.con=152+171, n.trt=155, x.trt=13,
                       n.con=152, x.con=21,
                       alternative="less", test="z",
                       succ.crit = "trial", Z.crit.final = 1.96,
                       a.trt = 1, b.trt=1, a.con=1, b.con=1)

succ_ia_betabinom_two( N.trt=155+170, N.con=152+171, n.trt=155, x.trt=13,
                       n.con=152, x.con=21,
                       alternative="less", test="fisher",
                       succ.crit = "trial", Z.crit.final = 1.96,
                       a.trt = 1, b.trt=1, a.con=1, b.con=1)
Recursive partitioning for survival (right-censored) data per the SurvCART algorithm based on baseline partitioning variables (Kundu, 2020).
SurvCART(data, patid, timevar, censorvar, gvars, tgvars, time.dist="exponential", cens.dist="NA", event.ind=1, alpha=0.05, minsplit=40, minbucket=20, quantile=0.50, print=FALSE)
data |
name of the dataset. It must contain variable specified for |
patid |
name of the subject id variable. |
timevar |
name of the variable with follow-up times. |
censorvar |
name of the variable with censoring status. |
gvars |
list of partitioning variables of interest. Value of these variables should not change over time. Regarding categorical variables, only numerically coded categorical variables should be specified. For nominal categorical variables or factors, please first create corresponding dummy variable(s) and then pass through |
tgvars |
types (categorical or continuous) of partitioning variables specified in gvars. |
time.dist |
name of time-to-event distribution. It can be one of the following distributions: |
cens.dist |
name of censoring distribution. It can be one of the following distributions: |
event.ind |
value of the censoring variable indicating event. |
alpha |
alpha (i.e., nominal type I error) level for parameter instability test |
minsplit |
the minimum number of observations that must exist in a node in order for a split to be attempted. |
minbucket |
the minimum number of observations in any terminal node. |
quantile |
The quantile to be displayed in the visualization of tree through |
print |
if |
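The description of gvars asks for numerically coded variables, with dummy variables created for nominal factors. A minimal base R sketch of that preparation follows. The data frame `dat` and its columns `region` and `age` are hypothetical and exist only for illustration, and the coding of tgvars (0 for categorical, 1 for continuous) is inferred from the package example rather than stated in the argument description.

```r
# Hypothetical baseline data: one nominal factor and one continuous covariate
dat <- data.frame(
  subjid = 1:6,
  region = factor(c("north", "south", "west", "south", "north", "west")),
  age    = c(54, 61, 47, 70, 58, 66)
)

# One 0/1 dummy column per non-reference level of the nominal factor
dummies <- model.matrix(~ region, data = dat)[, -1, drop = FALSE]
dat <- cbind(dat, dummies)

# Partitioning variables and their types
# (assumed coding: 0 = categorical, 1 = continuous)
gvars  <- c(colnames(dummies), "age")
tgvars <- c(rep(0, ncol(dummies)), 1)

print(gvars)
print(tgvars)
```

The resulting `gvars` and `tgvars` vectors have the same length and could then be passed to SurvCART() together with the augmented data frame.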
Construct survival tree based on heterogeneity in time-to-event and censoring distributions.
Exponential distribution: f(t)=lambda*exp(-lambda*t)
Weibull distribution: f(t)=alpha*lambda*t^(alpha-1)*exp(-lambda*t^alpha)
Lognormal distribution: f(t)=(1/t)*(1/sqrt(2*pi*sigma^2))*exp[-(1/2)*(log(t)-mu)^2/sigma^2]
Normal distribution: f(t)=(1/sqrt(2*pi*sigma^2))*exp[-(1/2)*(t-mu)^2/sigma^2]
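As a quick sanity check on these parameterizations, the sketch below writes the four densities as R functions and compares them with the base R density functions. It uses the standard squared-deviation form for the lognormal and normal densities. The function names (`f_exp`, `f_weib`, `f_lnorm`, `f_norm`) and parameter values are illustrative assumptions, not part of the package. Note that the Weibull form above corresponds to `dweibull()` with shape = alpha and scale = lambda^(-1/alpha).

```r
# Densities in the parameterization used above
f_exp   <- function(t, lambda) lambda * exp(-lambda * t)
f_weib  <- function(t, alpha, lambda) {
  alpha * lambda * t^(alpha - 1) * exp(-lambda * t^alpha)
}
f_lnorm <- function(t, mu, sigma) {
  (1 / t) * (1 / sqrt(2 * pi * sigma^2)) * exp(-0.5 * (log(t) - mu)^2 / sigma^2)
}
f_norm  <- function(t, mu, sigma) {
  (1 / sqrt(2 * pi * sigma^2)) * exp(-0.5 * (t - mu)^2 / sigma^2)
}

t <- c(0.5, 1, 2, 5)

# Compare against the base R densities
chk_exp   <- all.equal(f_exp(t, 0.3), dexp(t, rate = 0.3))
chk_weib  <- all.equal(f_weib(t, 1.5, 0.3),
                       dweibull(t, shape = 1.5, scale = 0.3^(-1 / 1.5)))
chk_lnorm <- all.equal(f_lnorm(t, 0.2, 0.8), dlnorm(t, meanlog = 0.2, sdlog = 0.8))
chk_norm  <- all.equal(f_norm(t, 1, 2), dnorm(t, mean = 1, sd = 2))

print(c(chk_exp, chk_weib, chk_lnorm, chk_norm))
```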
Treeout |
contains summary information of tree fitting for each terminal nodes and non-terminal nodes. Columns of |
logLik.tree |
log-likelihood of the tree-structured model, based on Cox model including sub-groups as covariates |
logLik.root |
log-likelihood at the root node (i.e., without tree structure), based on Cox model without any covariate |
AIC.tree |
AIC of the tree-structured model, based on Cox model including sub-groups as covariates |
AIC.root |
AIC at the root node (i.e., without tree structure), based on Cox model without any covariate |
nodelab |
List of subgroups or terminal nodes with their description |
varnam |
List of splitting variables |
ds |
the dataset originally supplied |
event.ind |
value of the censoring variable indicating event. |
timevar |
name of the variable with follow-up times |
censorvar |
name of the variable with censoring status |
frame |
rpart compatible object |
splits |
rpart compatible object |
cptable |
rpart compatible object |
functions |
rpart compatible object |
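The logLik and AIC entries above are described as coming from Cox models with and without sub-groups as covariates. The sketch below shows how such a comparison could be reproduced by hand with the survival package. It is an illustration under stated assumptions, not the package's internal code: the `lung` data set and the use of `sex` as a stand-in sub-group label are hypothetical choices, and the AIC formula used is the usual -2*logLik + 2*(number of coefficients).

```r
library(survival)

# Stand-in data: 'lung' ships with the survival package; 'sex' plays the role
# of a hypothetical terminal-node label (this is NOT SurvCART output)
d <- lung
d$subgroup <- factor(d$sex)

# Cox model including sub-groups as covariates
fit <- coxph(Surv(time, status) ~ subgroup, data = d)

# coxph stores the null (no covariate) and fitted partial log-likelihoods
ll_root <- fit$loglik[1]
ll_tree <- fit$loglik[2]

# AIC = -2 * logLik + 2 * (number of estimated coefficients)
aic_root <- -2 * ll_root
aic_tree <- -2 * ll_tree + 2 * length(coef(fit))

print(c(logLik.root = ll_root, logLik.tree = ll_tree,
        AIC.root = aic_root, AIC.tree = aic_tree))
```

A lower AIC for the model with sub-groups than for the root model suggests that the sub-grouping explains some of the heterogeneity in survival.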
Madan Gopal Kundu
Kundu, M. G., and Ghosh, S. (2021). Survival trees based on heterogeneity in time-to-event and censoring distributions using parameter instability test. Statistical Analysis and Data Mining: The ASA Data Science Journal, 14(5), 466-483.
plot, KMPlot, text, StabCat.surv, StabCont.surv
#--- Get the data
data(GBSG2)

#numeric coding of character variables
GBSG2$horTh1<- as.numeric(GBSG2$horTh)
GBSG2$tgrade1<- as.numeric(GBSG2$tgrade)
GBSG2$menostat1<- as.numeric(GBSG2$menostat)

#Add subject id
GBSG2$subjid<- 1:nrow(GBSG2)

#--- Run SurvCART() with time-to-event distribution: exponential,
#    censoring distribution: None
out<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1',
                'pnodes', 'progrec', 'estrec'),
        tgvars=c(0,1,0,1,0,1, 1,1),
        event.ind=1, alpha=0.05, minsplit=80, minbucket=40, print=TRUE)

#--- Plot tree
par(xpd = TRUE)
plot(out, compress = TRUE)
text(out, use.n = TRUE)

#Plot KM plot for sub-groups identified by tree
KMPlot(out, xscale=365.25, type=1)
KMPlot(out, xscale=365.25, type=2, overlay=FALSE, mfrow=c(2,2),
       xlab="Year", ylab="Survival prob.")

#--- Run SurvCART() with time-to-event distribution: weibull
#    censoring distribution: None
out2<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1',
                'pnodes', 'progrec', 'estrec'),
        tgvars=c(0,1,0,1,0,1, 1,1),
        time.dist="weibull",
        event.ind=1, alpha=0.05, minsplit=80, minbucket=40, print=TRUE)

#--- Run SurvCART() with time-to-event distribution: weibull
#    censoring distribution: exponential
out<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1',
                'pnodes', 'progrec', 'estrec'),
        tgvars=c(0,1,0,1,0,1, 1,1),
        time.dist="weibull", cens.dist="exponential",
        event.ind=1, alpha=0.05, minsplit=80, minbucket=40, print=TRUE)
Labels the current plot of the tree generated from SurvCART or LongCART object with text.
## S3 method for class 'SurvCART'
text(x, splits = TRUE, all = FALSE, use.n = FALSE, minlength = 1L, ...)

## S3 method for class 'LongCART'
text(x, splits = TRUE, all = FALSE, use.n = FALSE, minlength = 1L, ...)
x |
a fitted object of class |
splits |
similar to text.rpart; logical flag. If |
all |
similar to text.rpart; Logical. If TRUE, all nodes are labeled, otherwise just terminal nodes. |
use.n |
similar to text.rpart; Logical. If TRUE, adds |
minlength |
similar to text.rpart; the length to use for factor labels. A value of 1 causes them to be printed as 'a', 'b', ..... Larger values use abbreviations of the label names. See the labels.rpart function for details. |
... |
arguments to be passed to or from other methods. |
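Because these arguments are described as similar to those of text.rpart, their effect can be previewed on an ordinary rpart tree. The sketch below is only an illustration under that assumption: it uses the `kyphosis` data shipped with rpart rather than a SurvCART or LongCART fit, and the labels on a SurvCART or LongCART tree may differ in content.

```r
library(rpart)

# Stand-in tree: 'kyphosis' ships with rpart; a SurvCART or LongCART fit is
# assumed to respond to these arguments in a similar way
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

par(xpd = TRUE)            # allow labels to extend past the plot region
plot(fit, compress = TRUE)

# Label the splits, label all nodes (not just terminal ones),
# and add observation counts to the node labels
text(fit, splits = TRUE, all = TRUE, use.n = TRUE, minlength = 1L)
```

Setting `all = FALSE` restricts the labels to terminal nodes, which tends to be easier to read on larger trees.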
Madan Gopal Kundu
Kundu, M. G., and Harezlak, J. (2019). Regression trees for longitudinal data with baseline covariates. Biostatistics & Epidemiology, 3(1):1-22.
Kundu, M. G., and Ghosh, S. (2021). Survival trees based on heterogeneity in time-to-event and censoring distributions using parameter instability test. Statistical Analysis and Data Mining: The ASA Data Science Journal, 14(5), 466-483.
#--- Get the data
data(GBSG2)

#numeric coding of character variables
GBSG2$horTh1<- as.numeric(GBSG2$horTh)
GBSG2$tgrade1<- as.numeric(GBSG2$tgrade)
GBSG2$menostat1<- as.numeric(GBSG2$menostat)

#Add subject id
GBSG2$subjid<- 1:nrow(GBSG2)

#--- Run SurvCART()
out<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",
        event.ind=1,
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1',
                'pnodes', 'progrec', 'estrec'),
        tgvars=c(0,1,0,1,0,1, 1,1),
        alpha=0.05, minsplit=80, minbucket=40, print=TRUE)

#--- Plot tree
par(xpd = TRUE)
plot(out, compress = TRUE)
text(out, use.n = TRUE)