Title: | Regularization for Semiparametric Additive Hazards Regression |
---|---|
Description: | Computationally efficient procedures for regularized estimation with the semiparametric additive hazards regression model. |
Authors: | Anders Gorst-Rasmussen [aut, cre] |
Maintainer: | Anders Gorst-Rasmussen <[email protected]> |
License: | GPL-2 |
Version: | 1.15.1 |
Built: | 2024-11-05 03:39:30 UTC |
Source: | https://github.com/cran/ahaz |
Fit a semiparametric additive hazards regression model. Right-censored and left-truncated survival data are supported.
ahaz(surv, X, weights, univariate=FALSE, robust=FALSE)
ahaz(surv, X, weights, univariate=FALSE, robust=FALSE)
surv |
Response in the form of a survival object, as returned by the
function |
X |
Design matrix. Missing values are not supported. |
weights |
Optional vector of observation weights. Default is 1 for each observation. |
univariate |
Fit all univariate models instead of the joint
model. Default is |
robust |
Robust calculation of variance. Default is |
The semiparametric additive hazards model specifies a hazard function of the form:
for where
is the vector of covariates,
the vector of regression coefficients and
is an
unspecified baseline hazard. The semiparametric additive hazards model
can be viewed as an
additive analogue of the well-known Cox proportional hazards
regression model.
Estimation is based on the estimating equations of Lin & Ying (1994).
The option univariate
is intended for screening purposes in
data sets with a large number of covariates. It is substantially faster than the
standard approach of combining ahaz
with
apply
, see the examples.
An object with S3 class "ahaz"
.
call |
The call that produced this object. |
nobs |
Number of observations. |
nvars |
Number of covariates. |
D |
A |
d |
A vector of length |
B |
An |
univariate |
Is |
data |
Formatted version of original data (for internal use). |
robust |
Is |
Lin, D.Y. & Ying, Z. (1994). Semiparametric analysis of the additive risk model. Biometrika; 81:61-71.
summary.ahaz
, predict.ahaz
,
plot.ahaz
.
The functions coef
,
vcov
, residuals
.
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,15:24]) # Fit additive hazards model fit1 <- ahaz(surv, X) summary(fit1) # Univariate models X <- as.matrix(sorlie[,3:ncol(sorlie)]) fit2 <- ahaz(surv, X, univariate = TRUE) # Equivalent to the following (slower) solution beta <- apply(X,2,function(x){coef(ahaz(surv,x))}) plot(beta,coef(fit2))
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,15:24]) # Fit additive hazards model fit1 <- ahaz(surv, X) summary(fit1) # Univariate models X <- as.matrix(sorlie[,3:ncol(sorlie)]) fit2 <- ahaz(surv, X, univariate = TRUE) # Equivalent to the following (slower) solution beta <- apply(X,2,function(x){coef(ahaz(surv,x))}) plot(beta,coef(fit2))
Fast calculation of univariate association measures in the semiparametric additive risk model, adjusted for user-specified covariates
ahaz.adjust(surv, X, weights, idx, method=c("coef", "z", "crit"))
ahaz.adjust(surv, X, weights, idx, method=c("coef", "z", "crit"))
surv |
Response in the form of a survival object, as returned by the
function |
X |
Design matrix. Missing values are not supported. |
weights |
Optional vector of observation weights. Default is 1 for each observation. |
idx |
Vector specifying the indices of the covariates to adjust for. |
method |
The type of adjusted association measure to calculate. See details. |
The function is intended mainly for programming use and
screening purposes, when a very large number of covariates are considered and direct application of
ahaz
is unfeasible.
Running this function is equivalent to running ahaz
with
design matrix cbind(X[,i],X[,idx])
for each column X[,i]
in
X
. By utilizing basic matrix identities, ahaz.adjust
runs many times faster.
The following univariate association measures are currently implemented:
method="z"
, -statistics, obtained from a
fitted
ahaz
model.
method="coef"
, regression coefficients, obtained from a
fitted ahaz
model.
method="crit"
, the increase in
the natural loss function of the
semiparametric additive hazards model when the covariate is included
in the model.
A list containing the following elements:
call |
The call that produced this object. |
idx |
A copy of the argument |
adj |
Adjusted association statistic, as specified by |
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Adjust for first 10 covariates idx <- 1:10 a <- ahaz.adjust(surv,X,idx=idx) # Compare with (slower) solution b <- apply(X[,-idx],2,function(x){coef(ahaz(surv,cbind(x,X[,idx])))[1]}) plot(b,a$adj[-idx])
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Adjust for first 10 covariates idx <- 1:10 a <- ahaz.adjust(surv,X,idx=idx) # Compare with (slower) solution b <- apply(X[,-idx],2,function(x){coef(ahaz(surv,cbind(x,X[,idx])))[1]}) plot(b,a$adj[-idx])
Partial calculation of the quantities used in the estimating equations for ahaz.
ahaz.partial(surv, X, weights, idx)
ahaz.partial(surv, X, weights, idx)
surv |
Response in the form of a survival object, as returned by the
function |
X |
Design matrix. Missing values are not supported. |
weights |
Optional vector of observation weights. Default is 1 for each observation. |
idx |
Vector of indices of covariates to use in the calculations. |
The function is intended mainly for programming use when a
very large number of covariates are considered and direct application of
ahaz
is unfeasible.
The estimating equations for the semiparametric additive hazards model
are of the form with
a quadratic matrix with
number of columns equal to the number of covariates. The present
function returns
d[idx]
, D[idx,]
, and B[idx,]
;
the latter a matrix such that estimates the
covariance matrix of the regression coefficients.
A list containing the following elements:
call |
The call that produced this object. |
idx |
A copy of the argument |
nobs |
Number of observations. |
nvars |
Number of covariates. |
d |
Vector of length |
D |
Matrix of size |
B |
Matrix of size |
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Get D for the first 10 covariates only a<-ahaz.partial(surv,X,idx=1:10) pD1 <- a$D # Equivalent to the (slower) solution b <- ahaz(surv,X) pD2 <- b$D[1:10,] max(abs(pD1-pD2))
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Get D for the first 10 covariates only a<-ahaz.partial(surv,X,idx=1:10) pD1 <- a$D # Equivalent to the (slower) solution b <- ahaz(surv,X) pD2 <- b$D[1:10,] max(abs(pD1-pD2))
Define the type of tuning method used for regularization. Currently only used by tune.ahazpen
.
# Cross-validation cv.control(nfolds=5, reps=1, foldid=NULL, trace=FALSE) # BIC-inspired bic.control(factor = function(nobs){log(nobs)})
# Cross-validation cv.control(nfolds=5, reps=1, foldid=NULL, trace=FALSE) # BIC-inspired bic.control(factor = function(nobs){log(nobs)})
nfolds |
Number of folds for cross-validation. Default is
|
reps |
Number of repetitions of cross-validation with
|
foldid |
An optional vector of values between 1 and |
trace |
Print progress of cross-validation. Default is |
factor |
Defines how strongly the number of nonzero penalty parameters penalizes the score in a BIC-type criterion; see the details. |
For examples of usage, see tune.ahazpen
.
The regression coefficients of the semiparametric additive hazards
model are estimated by solving a linear system of estimating equations of the form
with respect to
. The natural loss function
for such a linear function is of the least-squares type
This loss function is used for cross-validation as described by Martinussen & Scheike (2008).
Penalty parameter selection via a BIC-inspired approach was described by
Gorst-Rasmussen & Scheike (2011). With is the degrees of freedom and
the number of
observations, we consider a BIC inspired criterion of the form
where is a scaling constant included to remove dependency on the
time scale and better mimick the behavior of a ‘real’ (likelihood) BIC. The default
factor=function(n){log(n)}
has
desirable theoretical properties but may be conservative in practice.
An object with S3 class "ahaz.tune.control"
.
type |
Type of penalty. |
factor |
Function specified by |
getfolds |
A function specifying how folds are calculated, if applicable. |
rep |
How many repetitions of cross-validation, if applicable. |
trace |
Print out progress? |
Gorst-Rasmussen, A. & Scheike, T. H. (2011). Independent screening for single-index hazard rate models with ultra-high dimensional features. Technical report R-2011-06, Department of Mathematical Sciences, Aalborg University.
Martinussen, T. & Scheike, T. H. (2008). Covariate selection for the semiparametric additive risk model. Scandinavian Journal of Statistics; 36:602-619.
Fast and scalable model selection for the semiparametric additive hazards model via univariate screening combined with penalized regression.
ahazisis(surv, X, weights, standardize=TRUE, nsis=floor(nobs/1.5/log(nobs)), do.isis=TRUE, maxloop=5, penalty=sscad.control(), tune=cv.control(), rank=c("FAST","coef","z","crit"))
ahazisis(surv, X, weights, standardize=TRUE, nsis=floor(nobs/1.5/log(nobs)), do.isis=TRUE, maxloop=5, penalty=sscad.control(), tune=cv.control(), rank=c("FAST","coef","z","crit"))
surv |
Response in the form of a survival object, as returned by the
function |
X |
Design matrix. Missing values are not supported. |
weights |
Optional vector of observation weights. Default is 1 for each observation. |
standardize |
Logical flag for variable standardization, prior to
model fitting. Estimates are always returned on
the original scale. Default is |
nsis |
Number of covariates to recruit initially. If
|
.
do.isis |
Perform iterated independent screening? |
maxloop |
Maximal number of iterations of the algorithm if |
rank |
Method to use for (re)recruitment of variables. See details. |
penalty |
A description of the penalty function to be used for
the variable selection part. This can be a character string naming a penalty
function (currently |
tune |
A description of the tuning method to be used for the
variable selection part. This can be
a character string naming a tuning control
function (currently |
The function is a basic implementation of the iterated sure independent screening method described in Gorst-Rasmussen & Scheike (2011). Briefly, the algorithm does the following:
Recruits the nsis
most relevant covariates by ranking them according to the univariate ranking
method described by rank
.
Selects, using ahazpen
with penalty function described
in penalty
, a model among the
top two thirds of the nsis
most relevant covariates. Call the
size of this model .
Recruits 'nsis
minus ' new covariates among the non-selected
covariates by ranking their relevance according to the univariate
ranking method described in
rank
, adjusted for the already
selected variables (using an unpenalized semiparametric additive
hazards model).
Steps 2-3 are iterated for maxloop
times, or until nsis
covariates has been recruited, or until the
set of selected covariate is stable between two iterations; whichever
comes first.
The following choices of ranking method exist:
rank="FAST"
corresponds to ranking, in the initial
recruitment step only, by the basic FAST- statistic
described in Gorst-Rasmussen & Scheike (2011). If do.isis=TRUE
then the algorithm sets rank="z"
for subsequent rankings.
rank="coef"
corresponds to ranking by absolute value of
(univariate) regression coefficients, obtained via ahaz
rank="z"
corresponds to ranking by the -statistic of
the (univariate) regression coefficients, obtained via
ahaz
rank="crit"
corresponds to ranking by the size
of the decrease in
the (univariate) natural loss function used for estimation by ahaz
.
An object with S3 class "ahazisis"
.
call |
The call that produced this object. |
initRANKorder |
The initial ranking order. |
detail.pickind |
List (of length at most |
detail.ISISind |
List (of length at most |
detail.ISIScoef |
List (of length at most |
SISind |
Indices of covariates selected in the initial recruitment step. |
ISISind |
Indices of the final set of covariates selected by the iterated algorithm. |
ISIScoef |
Vector of the penalized regression coefficients of the
covariates in |
nsis |
The argument |
do.isis |
The argument |
maxloop |
The argument |
Gorst-Rasmussen, A. & Scheike, T. H. (2011). Independent screening for single-index hazard rate models with ultra-high dimensional features. Technical report R-2011-06, Department of Mathematical Sciences, Aalborg University.
print.ahazisis
, ahazpen
, ahaz.adjust
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Basic ISIS/SIS with a single step set.seed(10101) m1 <- ahazisis(surv,X,maxloop=1,rank="coef") m1 # Indices of the variables from the initial recruitment step m1$SISind # Indices of selected variables m1$ISISind # Check fit score <- X[,m1$ISISind]%*%m1$ISIScoef plot(survfit(surv~I(score>median(score))))
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Basic ISIS/SIS with a single step set.seed(10101) m1 <- ahazisis(surv,X,maxloop=1,rank="coef") m1 # Indices of the variables from the initial recruitment step m1$SISind # Indices of selected variables m1$ISISind # Check fit score <- X[,m1$ISISind]%*%m1$ISIScoef plot(survfit(surv~I(score>median(score))))
Fit a semiparametric additive hazards model via penalized estimating equations using, for example, the lasso penalty. The complete regularization path is computed at a grid of values for the penalty parameter lambda via the method of cyclic coordinate descent.
ahazpen(surv, X, weights, standardize=TRUE, penalty=lasso.control(), nlambda=100, dfmax=nvars, pmax=min(nvars, 2*dfmax), lambda.minf=ifelse(nobs < nvars,0.05, 1e-4), lambda, penalty.wgt=NULL, keep=NULL, control=list())
ahazpen(surv, X, weights, standardize=TRUE, penalty=lasso.control(), nlambda=100, dfmax=nvars, pmax=min(nvars, 2*dfmax), lambda.minf=ifelse(nobs < nvars,0.05, 1e-4), lambda, penalty.wgt=NULL, keep=NULL, control=list())
surv |
Response in the form of a survival object, as returned by the
function |
X |
Design matrix. Missing values are not supported. |
weights |
Optional vector of observation weights. Default is 1 for each observation. |
standardize |
Logical flag for variable standardization, prior to
model fitting. Estimates are always returned on
the original scale. Default is |
penalty |
A description of the penalty function to be used for
model fitting. This can be a character string naming a penalty
function (currently |
nlambda |
The number of |
dfmax |
Limit the maximum number of variables in the
model. Unless a complete
regularization path is needed, it is highly
recommended to initially choose a relatively smaller value of
|
pmax |
Limit the maximum number of variables to ever be considered by the coordinate descent algorithm. |
lambda.minf |
Smallest value of |
lambda |
An optional user supplied sequence of penalty parameters. Typical usage
is to have the
program compute its own |
penalty.wgt |
A vector of nonnegative penalty weights for each
regression coefficient. This is a number that multiplies |
keep |
A vector of indices of variables which should always be included in
the model (no penalization). Equivalent to specifying a |
control |
A list of parameters for controlling the
model fitting algorithm. The list is passed to |
Fits the sequence of models implied by the penalty function
penalty
, the sequence of penalty parameters lambda
by
using the very efficient method of cyclic coordinate descent.
For data sets with a very large number of covariates, it is recommended
to only calculate partial paths by specifying a smallish value of
dmax
.
The sequence lambda
is computed automatically by the algorithm
but can also be set (semi)manually by specifying nlambda
or
lambda
. The stability and efficiency of the algorithm is highly
dependent on the grid lambda
values being reasonably dense, and
lambda
(and nlambda
) should be specified accordingly. In
particular, it is not recommended to specify a single or a few lambda
values. Instead, a partial regularization path should be calculated and
the functions predict.ahazpen
or
coef.ahazpen
should be used to extract coefficient
estimates at specific lambda values.
An object with S3 class "ahazpen"
.
call |
The call that produced this object |
beta |
An |
lambda |
The sequence of actual |
df |
The number of nonzero coefficients for each value of
|
nobs |
Number of observations. |
nvars |
Number of covariates. |
surv |
A copy of the argument |
npasses |
Total number of passes by the fitting algorithm over the data, for all lambda values. |
penalty.wgt |
The actually used |
penalty |
An object of class |
dfmax |
A copy of |
penalty |
A copy of |
Gorst-Rasmussen A., Scheike T. H. (2012). Coordinate Descent Methods for the Penalized Semiparametric Additive Hazards Model. Journal of Statistical Software, 47(9):1-17. https://www.jstatsoft.org/v47/i09/
Gorst-Rasmussen, A. & Scheike, T. H. (2011). Independent screening for single-index hazard rate models with ultra-high dimensional features. Technical report R-2011-06, Department of Mathematical Sciences, Aalborg University.
Leng, C. & Ma, S. (2007). Path consistent model selection in additive risk model via Lasso. Statistics in Medicine; 26:3753-3770.
Martinussen, T. & Scheike, T. H. (2008). Covariate selection for the semiparametric additive risk model. Scandinavian Journal of Statistics; 36:602-619.
Zou, H. & Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models, Annals of Statistics; 36:1509-1533.
print.ahazpen
, predict.ahazpen
,
coef.ahazpen
, plot.ahazpen
,
tune.ahazpen
.
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Fit additive hazards regression model fit1 <- ahazpen(surv, X,penalty="lasso", dfmax=30) fit1 plot(fit1) # Extend the grid to contain exactly 100 lambda values lrange <- range(fit1$lambda) fit2 <- ahazpen(surv, X,penalty="lasso", lambda.minf=lrange[1]/lrange[2]) plot(fit2) # User-specified lambda sequence lambda <- exp(seq(log(0.30), log(0.1), length = 100)) fit2 <- ahazpen(surv, X, penalty="lasso", lambda = lambda) plot(fit2) # Advanced usage - specify details of the penalty function fit4 <- ahazpen(surv, X,penalty=sscad.control(nsteps=2)) fit4 fit5 <- ahazpen(surv, X,penalty=lasso.control(alpha=0.1)) plot(fit5)
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Fit additive hazards regression model fit1 <- ahazpen(surv, X,penalty="lasso", dfmax=30) fit1 plot(fit1) # Extend the grid to contain exactly 100 lambda values lrange <- range(fit1$lambda) fit2 <- ahazpen(surv, X,penalty="lasso", lambda.minf=lrange[1]/lrange[2]) plot(fit2) # User-specified lambda sequence lambda <- exp(seq(log(0.30), log(0.1), length = 100)) fit2 <- ahazpen(surv, X, penalty="lasso", lambda = lambda) plot(fit2) # Advanced usage - specify details of the penalty function fit4 <- ahazpen(surv, X,penalty=sscad.control(nsteps=2)) fit4 fit5 <- ahazpen(surv, X,penalty=lasso.control(alpha=0.1)) plot(fit5)
Controls the numerical algorithm for fitting
the penalized semiparametric additive hazards model. This is typically
only used in a call to ahazpen
.
ahazpen.fit.control(thresh=1e-5, maxit=100000, ...)
ahazpen.fit.control(thresh=1e-5, maxit=100000, ...)
thresh |
Declare convergence when the maximal relative change from the
last iteration is less than |
maxit |
Maximal number passes by the algorithm over the data for
all values of the regularization parameter lambda. Default is |
... |
For future methods. |
A list with elements named as the arguments.
Describe the penalty function to be used in the penalized
semiparametric additive hazards model. Typically only used in a call
to ahazpen
or tune.ahazpen
.
# (Adaptive) lasso/elasticnet lasso.control(alpha=1, ada.wgt=NULL) # Stepwise SCAD sscad.control(a=3.7, nsteps=1, init.sol=NULL, c=NULL)
# (Adaptive) lasso/elasticnet lasso.control(alpha=1, ada.wgt=NULL) # Stepwise SCAD sscad.control(a=3.7, nsteps=1, init.sol=NULL, c=NULL)
alpha |
Elasticnet penalty parameter with default
|
ada.wgt |
Optional covariate weights used for fitting the adaptive
lasso. Default is not to use weights, i.e. fit the standard lasso. A
user-specified |
a |
Parameter of the stepwise SCAD penalty, see details. Default
is |
nsteps |
Number of steps in stepwise SCAD. Default is
|
init.sol |
Optional initial solution for stepwise SCAD
consisting of a numerical vector of length corresponding to the number of covariates in the
model. Default
is a vector of regression coefficients obtained from
|
c |
Optional scaling factor for stepwise SCAD. Usually it is not necessary to change supply this; see the details. |
The lasso/elasticnet penalty function takes the form
where . Choosing
encourages joint selection of correlated covariates and may
improve the fit when there are substantial correlations among
covariates.
The stepwise SCAD penalty function takes the form
where is some
initial estimate,
is a scaling factor, and for
the indicator function
The scaling factor controls how ‘different’ the
stepwise SCAD penalty is from the standard lasso penalty (and is
also used to remove dependency of the penalty on the scaling of the time
axis).
The one-step SCAD method of Zou & Li (2008) corresponds to taking
equal to the estimator derived from
ahaz
. See
Gorst-Rasmussen & Scheike (2011) for details. By iterating such
one-step SCAD and updating the initial solution accordingly,
the algorithm approximates the solution obtained using full SCAD. Note
that calculating the full SCAD solution generally leads to a nonconvex
optimization problem: multiple solutions and erratic behavior of
solution paths can be an issue.
The arguments ada.wgt
and init.sol
can be specified as
functions of the observations. This is convenient, for example, when
using cross-validation for tuning parameter selection. Such a function
must be specified precisely with the arguments surv
,
X
and weights
and must output a numeric vector
of length corresponding to the number of covariates. ahazpen
will take care of scaling so the function should produce output on the
original scale. See the examples here as well as the examples for
tune.ahazpen
for usage of this feature in practice.
An object with S3 class "ahaz.pen.control"
.
type |
Type of penalty. |
init.sol |
Function specifying the initial solution, if applicable. |
alpha |
Value of |
nsteps |
Number of steps for stepwise SCAD penalty, if applicable. |
a |
Parameter for stepwise SCAD penalty, if applicable. |
c |
Scaling factor for stepwise SCAD penalty, if applicable. |
ada.wgt |
Function specifying the weights for the adaptive lasso penalty, if applicable. |
Gorst-Rasmussen, A. & Scheike, T. H. (2011). Independent screening for single-index hazard rate models with ultra-high dimensional features. Technical report R-2011-06, Department of Mathematical Sciences, Aalborg University.
Leng, C. & Ma, S. (2007). Path consistent model selection in additive risk model via Lasso. Statistics in Medicine; 26:3753-3770.
Martinussen, T. & Scheike, T. H. (2008). Covariate selection for the semiparametric additive risk model. Scandinavian Journal of Statistics; 36:602-619.
Zou, H. & Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models, Annals of Statistics; 36:1509-1533.
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Fit additive hazards regression model with elasticnet penalty model <- ahazpen(surv,X,penalty=lasso.control(alpha=0.1),dfmax=30) plot(model) # Adaptive lasso with weights 1/|beta_i|^0.5. Note that, although # we do not use 'weights', it MUST be included as an argument adafun <- function(surv,X,weights) return(1/abs(coef(ahaz(surv,X)))^.5) model <- ahazpen(surv,X[,1:50],penalty=lasso.control(ada.wgt=adafun)) plot(model) # One-step SCAD with initial solution derived from univariate regressions scadfun <- function(surv,X,weights){ fit <- ahaz(surv,X,univariate=TRUE) return(coef(fit)) } set.seed(10101) model.ssc <- tune.ahazpen(surv,X,dfmax=30,penalty=sscad.control(init.sol=scadfun)) plot(model.ssc)
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Fit additive hazards regression model with elasticnet penalty model <- ahazpen(surv,X,penalty=lasso.control(alpha=0.1),dfmax=30) plot(model) # Adaptive lasso with weights 1/|beta_i|^0.5. Note that, although # we do not use 'weights', it MUST be included as an argument adafun <- function(surv,X,weights) return(1/abs(coef(ahaz(surv,X)))^.5) model <- ahazpen(surv,X[,1:50],penalty=lasso.control(ada.wgt=adafun)) plot(model) # One-step SCAD with initial solution derived from univariate regressions scadfun <- function(surv,X,weights){ fit <- ahaz(surv,X,univariate=TRUE) return(coef(fit)) } set.seed(10101) model.ssc <- tune.ahazpen(surv,X,dfmax=30,penalty=sscad.control(init.sol=scadfun)) plot(model.ssc)
Plot method for a fitted semiparametric additive hazards model; plots the Breslow estimate of underlying cumulative hazard function.
## S3 method for class 'ahaz' plot(x, ...)
## S3 method for class 'ahaz' plot(x, ...)
x |
The result of an |
... |
Additional graphical arguments passed to the |
Calling plot.ahaz
is equivalent to first calling ahaz
, then
calling predict
with type="cumhaz"
, and finally
calling plot
.
ahaz
, predict.ahaz
, plot.cumahaz
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,15:24]) # Fit additive hazards model fit <- ahaz(surv, X) plot(fit)
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,15:24]) # Fit additive hazards model fit <- ahaz(surv, X) plot(fit)
Plots regularization paths for fitted penalized semiparametric additive hazards model.
## S3 method for class 'ahazpen' plot(x, xvar=c("norm","lambda"), labels=FALSE, df=TRUE, ylab="Regression coefficients", xlab=xname,...)
## S3 method for class 'ahazpen' plot(x, xvar=c("norm","lambda"), labels=FALSE, df=TRUE, ylab="Regression coefficients", xlab=xname,...)
x |
The result of an |
xvar |
Scaling for first axis. Options are the |
labels |
Try to display indices for the regression coefficients in the right-hand
margin. Default is |
df |
Display number of nonzero parameters in top margin. Default
is |
ylab |
Label for y-axis. |
xlab |
Label for x-axis. The default is either "L1 norm" or
|
... |
Additional graphical arguments passed to the |
ahazpen
, print.ahazpen
, predict.ahazpen
, coef.ahazpen
.
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Fit additive hazards regression model fit <- ahazpen(surv, X, dfmax=50) par(mfrow=c(1,2)); plot(fit); plot(fit,xvar="lambda") # With labels only plot(fit,labels=TRUE,df=FALSE)
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Fit additive hazards regression model fit <- ahazpen(surv, X, dfmax=50) par(mfrow=c(1,2)); plot(fit); plot(fit,xvar="lambda") # With labels only plot(fit,labels=TRUE,df=FALSE)
Plots the Breslow estimate of cumulative hazard function,
as obtained from the predict.ahaz
## S3 method for class 'cumahaz' plot(x, ...)
## S3 method for class 'cumahaz' plot(x, ...)
x |
Result of a call to the |
... |
Additional graphical arguments passed to the |
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,15:24]) # Fit additive hazards regression model fit <- ahaz(surv, X) # Cumulative hazard cumhaz <- predict(fit, type="cumhaz") plot(cumhaz)
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,15:24]) # Fit additive hazards regression model fit <- ahaz(surv, X) # Cumulative hazard cumhaz <- predict(fit, type="cumhaz") plot(cumhaz)
Plot, as a function of the penalty parameter, the curve of tuning scores produced when tuning a penalized semiparametric additive hazards model.
## S3 method for class 'tune.ahazpen' plot(x, df = TRUE, ...)
## S3 method for class 'tune.ahazpen' plot(x, df = TRUE, ...)
x |
The result of a call to |
df |
Display number of nonzero parameters in top margin. Default
is |
... |
Additional graphical arguments passed to the |
A plot is produced displaying the tuning score for each value
of penalty parameter
(alongside upper and lower standard deviation curves, if cross-validation
has been used). The value of
lambda
which minimizes the estimated tuning score
is indicated with a dashed vertical line.
ahazpen
, tune.ahazpen
, print.tune.ahazpen
.
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Do 10 fold cross-validation set.seed(10101) tune.fit <- tune.ahazpen(surv, X, penalty="lasso", dfmax=50, tune = cv.control(nfolds=10)) plot(tune.fit)
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Do 10 fold cross-validation set.seed(10101) tune.fit <- tune.ahazpen(surv, X, penalty="lasso", dfmax=50, tune = cv.control(nfolds=10)) plot(tune.fit)
Compute regression coefficients, linear predictor, cumulative hazard function, or integrated martingale residuals for a fitted semiparametric additive hazards model.
## S3 method for class 'ahaz' predict(object, newX, type=c("coef", "lp", "residuals", "cumhaz"), beta=NULL, ...) ## S3 method for class 'ahaz' coef(object, ...) ## S3 method for class 'ahaz' vcov(object, ...) ## S3 method for class 'ahaz' residuals(object, ...)
## S3 method for class 'ahaz' predict(object, newX, type=c("coef", "lp", "residuals", "cumhaz"), beta=NULL, ...) ## S3 method for class 'ahaz' coef(object, ...) ## S3 method for class 'ahaz' vcov(object, ...) ## S3 method for class 'ahaz' residuals(object, ...)
object |
The result of an |
newX |
Optional new matrix of covariates at which to do
predictions. Currently only supported for |
type |
Type of prediction. Options are the regression coefficients
(" |
beta |
Optional vector of regression coefficients. If unspecified,
the regression coefficients derived from |
... |
For future methods. |
The Breslow estimator of the baseline cumulative hazard is described in Lin & Ying (1994).
The regression coefficients in the semiparametric additive hazards
model are obtained as the
solution
to a quadratic system of linear equations
. The
(integrated) martingale residuals
for
are vectors, of length
corresponding to the number of covariates, so that
The residuals estimate integrated
martingales and are
asymptotically distributed as mean-zero IID multivariate Gaussian. They can be used to derive a sandwich-type variance
estimator for regression coefficients (implemented in
summary.ahaz
when robust=TRUE
is specified). They can moreover be used for implementing consistent standard error
estimation under clustering; or for implementing resampling-based
inferential methods.
See Martinussen & Scheike (2006), Chapter 5.4 for details.
For type="coef"
and type="lp"
, a vector of
predictions.
For type="coef"
, a matrix of (integrated) martingale
residuals, with number of columns corresponding to the number of
covariates.
For type="cumhaz"
, an object with S3 class "cumahaz"
consisting of:
time |
Jump times for the cumulative hazard estimate. |
cumhaz |
The cumulative hazard estimate. |
event |
Status at jump times (1 corresponds to death, 0 corresponds to entry/exit). |
Martinussen, T. & Scheike, T. H. & (2006). Dynamic Regression Models for Survival Data. Springer.
ahaz
, summary.ahaz
, plot.cumahaz
.
data(sorlie) set.seed(10101) # Break ties time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,15:24]) # Fit additive hazards regression model fit <- ahaz(surv, X) # Parameter estimates coef(fit) # Linear predictor, equivalent to X%*%coef(fit) predict(fit,type="lp") # Cumulative baseline hazard cumahaz <- predict(fit, type="cumhaz") # Residuals - model fit resid <- predict(fit, type = "residuals") # Decorrelate, standardize, and check QQ-plots stdres <- apply(princomp(resid)$scores,2,function(x){x/sd(x)}) par(mfrow = c(2,2)) for(i in 1:4){ qqnorm(stdres[,i]) abline(c(0,1)) } # Residuals - alternative variance estimation resid <- residuals(fit) cov1 <- summary(fit)$coef[,2] invD <- solve(fit$D) Best<-t(resid)%*%resid cov2 <- invD %*% Best %*% invD # Compare with (nonrobust) SEs from 'summary.ahaz' plot(cov1, sqrt(diag(cov2)),xlab="Nonrobust",ylab="Robust") abline(c(0,1))
data(sorlie) set.seed(10101) # Break ties time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,15:24]) # Fit additive hazards regression model fit <- ahaz(surv, X) # Parameter estimates coef(fit) # Linear predictor, equivalent to X%*%coef(fit) predict(fit,type="lp") # Cumulative baseline hazard cumahaz <- predict(fit, type="cumhaz") # Residuals - model fit resid <- predict(fit, type = "residuals") # Decorrelate, standardize, and check QQ-plots stdres <- apply(princomp(resid)$scores,2,function(x){x/sd(x)}) par(mfrow = c(2,2)) for(i in 1:4){ qqnorm(stdres[,i]) abline(c(0,1)) } # Residuals - alternative variance estimation resid <- residuals(fit) cov1 <- summary(fit)$coef[,2] invD <- solve(fit$D) Best<-t(resid)%*%resid cov2 <- invD %*% Best %*% invD # Compare with (nonrobust) SEs from 'summary.ahaz' plot(cov1, sqrt(diag(cov2)),xlab="Nonrobust",ylab="Robust") abline(c(0,1))
Compute regression coefficient estimates, linear predictor, cumulative hazard function, or integrated martingale residuals for a fitted penalized semiparametric additive hazards model.
## S3 method for class 'ahazpen' predict(object, newX, type=c("coef","lp","residuals","cumhaz"), lambda=NULL, ...) ## S3 method for class 'ahazpen' coef(object, ...)
## S3 method for class 'ahazpen' predict(object, newX, type=c("coef","lp","residuals","cumhaz"), lambda=NULL, ...) ## S3 method for class 'ahazpen' coef(object, ...)
object |
The result of an |
newX |
New matrix of covariates at which to do
predictions. |
lambda |
Value of lambda for at which predictions are
to be made. This argument is required for |
type |
The type of prediction. Options are the regression coefficients
(" |
... |
For future methods. |
See the details in predict.ahaz
for information on
the different types of predictions.
For type="coef"
and type="lp"
, a
matrix of regression coefficients, respectively linear predictions for
each value of the penalty parameter.
For type="residuals"
, a matrix of (integrated) martingale residuals
associated with the nonzero penalized regression coefficients for a
regularization parameter equal to .
For type="cumhaz"
, an object with S3 class "cumahaz"
based on the regression coefficients estimated for a
regularization parameter equal to , the object containing:
time |
Jump times for the cumulative hazard estimate. |
cumhaz |
The cumulative hazard estimate. |
event |
Status at jump times (1 corresponds to death, 0 corresponds to entry/exit). |
ahazpen
, print.ahazpen
,
plot.ahazpen
, predict.ahaz
, plot.cumahaz
.
data(sorlie) set.seed(10101) # Break ties time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Fit additive hazards regression model w/lasso penalty fit <- ahazpen(surv, X, dfmax=100) # Coefficients beta <- predict(fit,X,lambda=0.08,type="coef") barplot(as.numeric(beta)) # Linear predictions linpred <- predict(fit,X,lambda=0.1,type="lp") riskgrp <- factor(linpred < median(linpred)) plot(survfit(surv~riskgrp)) # Residuals resid <- predict(fit, X, lambda=0.1, type = "residuals") par(mfrow = c(1,2)) hist(resid[,1],main=colnames(resid)[1]) hist(resid[,2],main=colnames(resid)[2]) # Cumulative hazard cumhaz <- predict(fit,X,lambda=0.1,type="cumhaz") plot(cumhaz)
data(sorlie) set.seed(10101) # Break ties time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Fit additive hazards regression model w/lasso penalty fit <- ahazpen(surv, X, dfmax=100) # Coefficients beta <- predict(fit,X,lambda=0.08,type="coef") barplot(as.numeric(beta)) # Linear predictions linpred <- predict(fit,X,lambda=0.1,type="lp") riskgrp <- factor(linpred < median(linpred)) plot(survfit(surv~riskgrp)) # Residuals resid <- predict(fit, X, lambda=0.1, type = "residuals") par(mfrow = c(1,2)) hist(resid[,1],main=colnames(resid)[1]) hist(resid[,2],main=colnames(resid)[2]) # Cumulative hazard cumhaz <- predict(fit,X,lambda=0.1,type="cumhaz") plot(cumhaz)
Compute regression coefficient estimates, linear predictor, cumulative hazard function, or integrated martingale residuals for a fitted and tuned penalized semiparametric additive hazards model.
## S3 method for class 'tune.ahazpen' predict(object, newX, lambda="lambda.min", ...) ## S3 method for class 'tune.ahazpen' coef(object, ...)
## S3 method for class 'tune.ahazpen' predict(object, newX, lambda="lambda.min", ...) ## S3 method for class 'tune.ahazpen' coef(object, ...)
object |
The result of an |
newX |
New matrix of covariates at which to do
predictions. Required for some types of predictions, see |
lambda |
Value of lambda at which predictions are
to be made. Required for some types of predictions, see
|
... |
Additional arguments to be passed to
|
See the details in predict.ahazpen
for information on
the available types of predictions.
The obejct returned depends on the details in the argument ...
passed
to predict.ahazpen
.
predict.ahazpen
, ahazpen
, print.ahazpen
,
plot.ahazpen
, predict.ahaz
, plot.cumahaz
.
data(sorlie) set.seed(10101) # Break ties time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Fit additive hazards regression model w/lasso penalty cv.fit <- tune.ahazpen(surv, X, dfmax=100, tune="cv") # Predict coefficients at cv.fit$lambda.min coef(cv.fit) # Predict risk score at cv.fit$lambda.min predict(cv.fit,newX=X,type="lp")
data(sorlie) set.seed(10101) # Break ties time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Fit additive hazards regression model w/lasso penalty cv.fit <- tune.ahazpen(surv, X, dfmax=100, tune="cv") # Predict coefficients at cv.fit$lambda.min coef(cv.fit) # Predict risk score at cv.fit$lambda.min predict(cv.fit,newX=X,type="lp")
Print method for sure independence screening based on the semiparametric additive hazards model.
## S3 method for class 'ahazisis' print(x, digits=max(3, getOption("digits") - 3), ...)
## S3 method for class 'ahazisis' print(x, digits=max(3, getOption("digits") - 3), ...)
x |
Fitted |
digits |
Significant digits to print. |
... |
For future methods. |
The call that produced x
is printed, alongside the
number of covariates initially recruited, the number of covariates
finally recruited (if applicable) and the number of iterations (if applicable).
Print method for fitted penalized semiparametric additive hazards model.
## S3 method for class 'ahazpen' print(x, digits=max(3, getOption("digits") - 3), ...)
## S3 method for class 'ahazpen' print(x, digits=max(3, getOption("digits") - 3), ...)
x |
Fitted |
digits |
Significant digits to print. |
... |
For future methods. |
The call that produced x
is printed, alongside the
number of observations, the number of covariates, and details on the
sequence of penalty parameters.
ahazpen
, predict.ahazpen
, coef.ahazpen
.
Produces a printed summary of a fitted semiparametric additive hazards model.
## S3 method for class 'summary.ahaz' print(x, digits=max(getOption("digits") - 3, 3), signif.stars=getOption("show.signif.stars"), ...)
## S3 method for class 'summary.ahaz' print(x, digits=max(getOption("digits") - 3, 3), signif.stars=getOption("show.signif.stars"), ...)
x |
The result of a call to |
digits |
Significant digits to print. |
signif.stars |
Show stars to highlight small p-values. |
... |
For future methods. |
summary.ahaz
, ahaz
, plot.ahaz
.
Print method for tune.ahazpen
objects.
## S3 method for class 'tune.ahazpen' print(x, digits=max(3, getOption("digits") - 3), ...)
## S3 method for class 'tune.ahazpen' print(x, digits=max(3, getOption("digits") - 3), ...)
x |
The result of a call to |
digits |
Significant digits in printout. |
... |
Additional print arguments. |
The call that produced x
is printed, alongside the number of
penalty parameters used, the value of the
optimal penalty and the number of non-zero regression coefficients at
the optimal penalty parameter.
ahazpen
, tune.ahazpen
, plot.tune.ahazpen
.
Dataset containing 549 gene expression measurement, exit time and exit status in a study of breast cancer among 115 women.
data(sorlie)
data(sorlie)
Time to exit.
Status at exit (censoring = 0, event = 1).
Gene expression measurements.
Soerlie T., et al. (2003). Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci 100:8418-8423
data(sorlie)
data(sorlie)
Produces a summary of a fitted semiparametric additive hazards model.
## S3 method for class 'ahaz' summary(object, ...)
## S3 method for class 'ahaz' summary(object, ...)
object |
The result of an |
... |
For future methods. |
An object with S3 class "summary.ahaz"
.
call |
The call that produced this object. |
coefficients |
Vector of regression coefficients. |
cov |
Estimated covariance matrix of regression coefficients. |
nobs |
Number of observations. |
nvars |
Number of covariates |
waldtest |
Vector of quantities from a Wald test. |
univar |
Logical: summarizing univariate
regressions (option |
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,15:25]) # Fit additive hazards model fit1 <- ahaz(surv, X) summary(fit1)
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,15:25]) # Fit additive hazards model fit1 <- ahaz(surv, X) summary(fit1)
Tuning of penalty parameters for the penalized semiparametric additive hazards model via cross-validation - or via non-stochastic procedures, akin to BIC for likelihood-based models.
tune.ahazpen(surv, X, weights, standardize=TRUE, penalty=lasso.control(), tune=cv.control(), dfmax=nvars, lambda, ...)
tune.ahazpen(surv, X, weights, standardize=TRUE, penalty=lasso.control(), tune=cv.control(), dfmax=nvars, lambda, ...)
surv |
Response in the form of a survival object, as returned by the
function |
X |
Design matrix. Missing values are not supported. |
weights |
Optional vector of observation weights. Default is 1 for each observation. |
standardize |
Logical flag for variable standardization, prior to
model fitting. Parameter estimates are always returned on
the original scale. Default is |
penalty |
A description of the penalty function to be used for
model fitting. This can be a character string naming a penalty
function (currently |
dfmax |
Limit the maximum number of covariates included in the
model. Default is |
lambda |
An optional user supplied sequence of penalty parameters. Typical usage
is to have the
program compute its own |
tune |
A description of the tuning method to be used. This can be
a character string naming a tuning control
function (currently |
... |
Additional arguments to be passed to |
The function performs an initial penalized fit based on the
penalty supplied in penalty
to obtain a sequence of
penalty parameters. Subsequently, it selects among these an optimal penalty parameter based on
the tuning control function described in tune
, see ahaz.tune.control
.
An object with S3 class "tune.ahazpen"
.
call |
The call that produced this object. |
lambda |
The actual sequence of |
tunem |
The tuning score for each value of |
tunesd |
Estimate of the cross-validated standard error, if |
tunelo |
Lower curve = |
tuneup |
Upper curve = |
lambda.min |
Value of |
df |
Number of non-zero coefficients at each value of
|
tune |
The selected |
penalty |
The selected |
foldsused |
Folds actually used, if |
Gorst-Rasmussen, A. & Scheike, T. H. (2011). Independent screening for single-index hazard rate models with ultra-high dimensional features. Technical report R-2011-06, Department of Mathematical Sciences, Aalborg University.
ahaz.tune.control
, plot.tune.ahazpen
, ahazpen
.
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Training/test data set.seed(20202) train <- sample(1:nrow(sorlie),76) test <- setdiff(1:nrow(sorlie),train) # Run cross validation on training data set.seed(10101) cv.las <- tune.ahazpen(surv[train,], X[train,],dfmax=30) plot(cv.las) # Check fit on the test data testrisk <- predict(cv.las,X[test,],type="lp") plot(survfit(surv[test,]~I(testrisk<median(testrisk))),main="Low versus high risk") # Advanced example, cross-validation of one-step SCAD # with initial solution derived from univariate models. # Since init.sol is specified as a function, it is # automatically cross-validated as well scadfun<-function(surv,X,weights){coef(ahaz(surv,X,univariate=TRUE))} set.seed(10101) cv.ssc<-tune.ahazpen(surv[train,],X[train,], penalty=sscad.control(init.sol=scadfun), tune=cv.control(rep=5),dfmax=30) # Check fit on test data testrisk <- predict(cv.ssc,X[test,],type="lp") plot(survfit(surv[test,]~I(testrisk<median(testrisk))),main="Low versus high risk")
data(sorlie) # Break ties set.seed(10101) time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Survival data + covariates surv <- Surv(time,sorlie$status) X <- as.matrix(sorlie[,3:ncol(sorlie)]) # Training/test data set.seed(20202) train <- sample(1:nrow(sorlie),76) test <- setdiff(1:nrow(sorlie),train) # Run cross validation on training data set.seed(10101) cv.las <- tune.ahazpen(surv[train,], X[train,],dfmax=30) plot(cv.las) # Check fit on the test data testrisk <- predict(cv.las,X[test,],type="lp") plot(survfit(surv[test,]~I(testrisk<median(testrisk))),main="Low versus high risk") # Advanced example, cross-validation of one-step SCAD # with initial solution derived from univariate models. # Since init.sol is specified as a function, it is # automatically cross-validated as well scadfun<-function(surv,X,weights){coef(ahaz(surv,X,univariate=TRUE))} set.seed(10101) cv.ssc<-tune.ahazpen(surv[train,],X[train,], penalty=sscad.control(init.sol=scadfun), tune=cv.control(rep=5),dfmax=30) # Check fit on test data testrisk <- predict(cv.ssc,X[test,],type="lp") plot(survfit(surv[test,]~I(testrisk<median(testrisk))),main="Low versus high risk")