**************************************************************************************************************
Nonparametric double additive cure survival models: an application to the estimation of the nonlinear 
effect of age at first parenthood on fertility progression.
 
Statistical Modelling - 2018

Vincent Bremhorst - Michaela Kreyenfeld - Philippe Lambert
Corresponding author : vincent.bremhorst@uclouvain.be 

Readme file - Help for the use of the R code. 

**************************************************************************************************************

REQUIREMENT : 
-----------

Please cite the paper when publishing analyses based on this code. 


REMARKS : 
-------

For this version of the code : 

1) At least one categorical covariate (defined as factor in R) is needed in each regression part
2) At least one continuous covariate needs to be modelled in a flexible way in each regression part

==> If either of these two conditions is not met, the main function will return an error message. 
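
Remarks 1) and 2) can be checked before launching the main function. The sketch below is a 
hypothetical helper (check_regression_part is not part of the distributed code). It assumes that each 
regression part is a list(Categorical = list(...), Continuous = list(...)) whose elements are plain 
vectors, which this readme does not state explicitly. 

```r
# Hypothetical helper (NOT part of the distributed code): returns TRUE when a
# regression part contains at least one factor and at least one numeric covariate.
check_regression_part <- function(part) {
  has_factor <- length(part$Categorical) >= 1 &&
    all(vapply(part$Categorical, is.factor, logical(1)))
  has_continuous <- length(part$Continuous) >= 1 &&
    all(vapply(part$Continuous, is.numeric, logical(1)))
  has_factor && has_continuous
}

# Toy inputs (illustrative values only).
ok_part <- list(Categorical = list(factor(c("F", "M", "F"))),
                Continuous  = list(c(23.5, 31.0, 27.2)))
bad_part <- list(Categorical = list(c("F", "M", "F")),  # character, not a factor
                 Continuous  = list(c(23.5, 31.0, 27.2)))

check_regression_part(ok_part)   # TRUE
check_regression_part(bad_part)  # FALSE
```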

3) Depending on the sample size, the computation can take a long time. 

4) The code can only be run on Windows or on Linux.

 
The data : 
--------

The data used in this paper are available (only for licensed SOEP users) at : 
https://www.diw.de/de/diw_01.c.361286.de/archiv.html

A simulated dataset is available with this code (ExampleData.Rdata). 
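
The names of the objects stored in ExampleData.Rdata are not listed in this readme. A minimal way to 
discover them, assuming the file sits in the current working directory, is : 

```r
# Load the simulated dataset if it is present in the working directory.
data_file <- "ExampleData.Rdata"
loaded_objects <- character(0)
if (file.exists(data_file)) {
  # load() returns (invisibly) the names of the objects it created.
  loaded_objects <- load(data_file)
}
print(loaded_objects)
```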

***************************************************************************************************************

SM_BremhorstetAl2017_mainFunction
----------------------------------

INPUT : 

* study_name : character variable with the name of the study. 

* obstime : vector of the observed survival times. 

* Event : vector of the event indicator (1 if the event is observed ; 0 if the subject is right censored). 

* Cure : Covariates influencing the probability of the event. 
MUST BE DEFINED AS : list(Categorical = list(...), Continuous = list(...)) 

* labels_Cure : Labels of the covariates influencing the probability of the event. 
MUST BE DEFINED AS : list(Categorical = list(...), Continuous = list(...)) 

* Cox : Covariates influencing the timing of the event. 
MUST BE DEFINED AS : list(Categorical = list(...), Continuous = list(...)) 

* labels_Cox : Labels of the covariates influencing the timing of the event. 
MUST BE DEFINED AS : list(Categorical = list(...), Continuous = list(...)) 

* iteration : length of the posterior chain (MUST BE an integer)

* burnin : burn-in length of the posterior chain (MUST BE an integer smaller than iteration)

!!!! DO NOT MODIFY THE DEFAULT VALUES !!!!
* nknots_BD = 17
* degree_BD = 3
* order_BD = 3 : order of the roughness penalty

* nknots_Cov = 7
* degree_Cov = 3
* order_Cov = 2 : order of the roughness penalty
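
The input arguments listed above can be assembled as in the following sketch. The argument names come 
from this readme; everything else is an assumption made for illustration : the toy data, the variable 
names, the chain lengths, and the idea that the elements of each list are plain vectors and that the 
labels mirror the structure of the covariate lists. The call itself is only executed if the main 
function has already been sourced. 

```r
set.seed(1)
n <- 200

# Simulated toy data; names and distributions are illustrative only.
toy <- data.frame(
  time   = rexp(n, rate = 0.1),
  status = rbinom(n, size = 1, prob = 0.6),
  educ   = factor(sample(c("low", "high"), n, replace = TRUE)),
  age    = runif(n, min = 18, max = 40)
)

# Covariates and labels for the probability (Cure) and timing (Cox) parts.
Cure        <- list(Categorical = list(toy$educ), Continuous = list(toy$age))
labels_Cure <- list(Categorical = list("Education"), Continuous = list("Age"))
Cox         <- list(Categorical = list(toy$educ), Continuous = list(toy$age))
labels_Cox  <- list(Categorical = list("Education"), Continuous = list("Age"))

# Chain settings (illustrative values): whole numbers, with burnin < iteration.
iteration <- 20000
burnin    <- 5000
stopifnot(iteration == round(iteration), burnin == round(burnin),
          burnin < iteration)

# Run the model only when the main function is available in the session.
if (exists("SM_BremhorstetAl2017_mainFunction")) {
  Res <- SM_BremhorstetAl2017_mainFunction(
    study_name = "ToyStudy",
    obstime = toy$time, Event = toy$status,
    Cure = Cure, labels_Cure = labels_Cure,
    Cox = Cox, labels_Cox = labels_Cox,
    iteration = iteration, burnin = burnin
  )
}
```

The spline arguments (nknots_BD, degree_BD, ...) are deliberately left out so that the default values are used. 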

OUTPUT : 

* Create a folder named "Study_name-Convergence checks" containing all the information for checking the convergence of the MCMC algorithm. 

* Return the posterior sample of all the model parameters.
(use names(Res$PostChains) to obtain the name of each component of the list.)

* Return the estimate (defined as the posterior median), the posterior standard deviation and the HPD intervals (90%-95%-99%) of each regression parameter.  
(use Res$Nresults)

* Create the graphs illustrating the non-linear estimates of the effects of the continuous covariates.
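
To make the reported summaries concrete, the sketch below computes a posterior median, a posterior 
standard deviation and HPD intervals from a single chain. It is a generic base-R illustration, not the 
code used by the main function : the chain is simulated, and hpd_interval is a hypothetical helper that 
takes the shortest interval containing the requested share of the sorted draws. 

```r
set.seed(42)
# Simulated stand-in for the posterior sample of one regression parameter.
chain <- rnorm(5000, mean = 0.5, sd = 0.2)

# Hypothetical helper: shortest interval containing a fraction `level` of the draws.
hpd_interval <- function(draws, level = 0.95) {
  sorted <- sort(draws)
  n <- length(sorted)
  m <- ceiling(level * n)              # number of draws inside the interval
  starts <- seq_len(n - m + 1)         # candidate left end points
  widths <- sorted[starts + m - 1] - sorted[starts]
  best <- which.min(widths)            # first shortest candidate
  c(lower = sorted[best], upper = sorted[best + m - 1])
}

post_summary <- c(
  median = median(chain),
  sd     = sd(chain),
  hpd90  = hpd_interval(chain, 0.90),
  hpd95  = hpd_interval(chain, 0.95),
  hpd99  = hpd_interval(chain, 0.99)
)
print(round(post_summary, 3))
```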

 
******************************************************************************************************************************************************************

Post.proba
-----------

INPUT : 

* X : Given value for the categorical variables in the probability model. 
(!! The first component of X must be equal to 1 --> it is the intercept.)

* Continuous_variables : list of the continuous covariates in the probability model. 
!! The first one must be the variable for which the evolution of the probability is plotted.

* cont.ref : reference values for the other continuous covariates in the probability model.
!! To be specified only if there is more than one continuous covariate in the probability model. 
!! Continuous covariates must be in the same order as in Continuous_variables (counting from component 2). 

* alpha = Res$PostChains[["alphaPost"]] # DO NOT MODIFY (Posterior sample of the regression parameters in the probability model.)

* phi : Posterior sample of the spline parameters related to the continuous variable for which the evolution of the probability is plotted.    

* phi_others : list of the posterior samples of the spline parameters related to the other continuous covariates in the probability model.
!! To be specified only if there is more than one continuous covariate in the probability model. 
!! Must be in the same order as in Continuous_variables (counting from component 2). 

* conf.int : level of the global confidence interval (0.95 by default)

!!!! DO NOT MODIFY THE DEFAULT VALUES !!!!
nknots_Cov = 7 
degree_Cov = 3 

* ylab : label of the y-axis 

* xlab : label of the x-axis

* main : Title of the graph 


OUTPUT : 
Create the graph of the evolution of the probability of experiencing the event as a function of a given continuous covariate, for the given values of the other covariates. 
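
The constraints on the Post.proba arguments (intercept in first position of X, matching lengths for 
cont.ref and phi_others) can be checked beforehand. The sketch below is a hypothetical helper 
(post_proba_problems is not part of the distributed code); the toy values and the shape of the 
phi_others element are placeholders, not the real posterior samples. 

```r
# Hypothetical helper (NOT part of the distributed code): returns a character
# vector describing the violated constraints (empty when none is found).
post_proba_problems <- function(X, Continuous_variables, cont.ref = NULL,
                                phi_others = NULL, conf.int = 0.95) {
  problems <- character(0)
  n_other <- length(Continuous_variables) - 1  # covariates other than the plotted one
  if (length(X) < 1 || X[1] != 1)
    problems <- c(problems, "first component of X must be 1 (intercept)")
  if (n_other < 0)
    problems <- c(problems, "at least one continuous covariate is needed")
  if (n_other > 0 && length(cont.ref) != n_other)
    problems <- c(problems, "cont.ref needs one value per other continuous covariate")
  if (n_other > 0 && length(phi_others) != n_other)
    problems <- c(problems, "phi_others needs one element per other continuous covariate")
  if (conf.int <= 0 || conf.int >= 1)
    problems <- c(problems, "conf.int must lie strictly between 0 and 1")
  problems
}

# Toy values (placeholders only).
age      <- c(20, 25, 30)
duration <- c(1, 2, 3)

# Two continuous covariates: age is plotted, duration is fixed at cont.ref.
post_proba_problems(X = c(1, 0, 1),
                    Continuous_variables = list(age, duration),
                    cont.ref = 2,
                    phi_others = list(matrix(0, nrow = 10, ncol = 10)))

# Intercept missing from X: one problem is reported.
post_proba_problems(X = c(0, 1), Continuous_variables = list(age))
```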

