Abstract
Purpose Individual-level simulation models often require sampling times to events; however, efficient parametric distributions do not exist for many processes. For example, time to death from life tables cannot be accurately sampled from existing parametric distributions. We propose an efficient nonparametric method to sample times to events that does not require any parametric assumption on the hazards.
Methods We developed a nonparametric sampling (NPS) approach that simultaneously draws multiple time-to-event samples from a categorical distribution. This approach can be applied to univariate and multivariate processes. We discretized the entire period into equal-length time intervals and then derived the interval-specific probabilities. The sampled times to events can then be used directly in individual-level simulation models. We evaluated the accuracy of our approach in sampling times to events from common parametric distributions, including exponential, gamma, and Gompertz. In addition, we evaluated the method’s performance in sampling age to death from US life tables and sampling times to events from parametric baseline hazards with time-dependent covariates.
Results The NPS method estimated similar expected times to events from 1 million draws for the three parametric distributions, 100,000 draws for the homogeneous cohort, 200,000 draws for the heterogeneous cohort, and 1 million draws for the parametric distributions with time-varying covariates, all in less than a second.
Conclusion Our method produces accurate and computationally efficient samples of times to events from hazards without requiring parametric assumptions.
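The discretize-then-draw idea described in the Methods can be illustrated with a minimal Python sketch. It assumes the cumulative hazard H(t) is available as a function, takes the probability of an event in interval k as S(t_k) - S(t_{k+1}) with S(t) = exp(-H(t)), and draws all samples in one categorical draw. The function name `nps_sample`, its parameters, the uniform placement of the event within the sampled interval, and the treatment of events beyond `t_max` as censored are illustrative assumptions, not the authors' published implementation.

```python
import math
import random


def nps_sample(cum_hazard, t_max, n_intervals, n_draws, rng):
    """Draw n_draws event times on [0, t_max] given a cumulative hazard H(t).

    Hypothetical helper for illustration only.
    """
    width = t_max / n_intervals
    # Survival at each interval boundary: S(t) = exp(-H(t)).
    surv = [math.exp(-cum_hazard(k * width)) for k in range(n_intervals + 1)]
    # Probability that the event falls in interval k: S(t_k) - S(t_{k+1}).
    probs = [surv[k] - surv[k + 1] for k in range(n_intervals)]
    # Remaining mass: the event occurs after t_max (treated as censored here).
    probs.append(surv[-1])
    # A single categorical draw produces the interval index for every sample.
    idx = rng.choices(range(n_intervals + 1), weights=probs, k=n_draws)
    # Assumption: place the event uniformly within its interval;
    # censored draws are returned as math.inf.
    return [
        (k + rng.random()) * width if k < n_intervals else math.inf
        for k in idx
    ]


# Example: exponential process with constant hazard 0.1, so H(t) = 0.1 * t
# and the theoretical mean time to event is 1 / 0.1 = 10.
rng = random.Random(2024)
rate = 0.1
times = nps_sample(lambda t: rate * t, t_max=200.0, n_intervals=2000,
                   n_draws=100_000, rng=rng)
observed = [t for t in times if math.isfinite(t)]
mean_time = sum(observed) / len(observed)
print(round(mean_time, 2))
```

Because only H(t) evaluated at the interval boundaries is needed, the same sketch would apply to a life table or any other hazard without a closed-form sampler; narrower intervals reduce the discretization error at the cost of a longer probability vector.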
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
Dr. Alarid-Escudero is supported by the grant U01CA253913, Dr. Jalal is supported by a Canada Research Chair, and Drs. Alarid-Escudero and Jalal are supported by the grant U01CA265750 from the National Cancer Institute (NCI) as part of the Cancer Intervention and Surveillance Modeling Network (CISNET).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
The affiliation of the first author was updated. We also provided a more detailed and formal description of the use of the cumulative hazard function, H(t), in the proposed nonparametric sampling (NPS) method for times to events. We also included a more extensive benchmarking of the NPS method against other parametric sampling techniques. For example, we included the expected value and the average execution time of status quo parametric sampling methods for each parametric distribution mentioned in the manuscript (i.e., rexp, rgamma, and rlnorm) to sample times to events in R. To do so, we extended Table 1 to incorporate the results and mean execution time of status quo parametric sampling methods in R. In addition, we updated the Supplementary Materials. These contain the code to implement the proposed nonparametric sampling approach and to replicate all the examples using R and the first three examples using Python. The code to replicate these examples and to generate the Supplementary Materials is publicly available in a GitHub repository (https://github.com/DARTH-git/NPS_time_to_event).
Data Availability
The code and data used to generate all the examples are available online using the data availability link.