Using Full Cohort Information to Improve the Estimation Efficiency of Marginal Hazard Model for Multivariate Failure Times in Case-Cohort Studies

Hongtao Zhang; Haibo Zhou; David Couper; Jianwen Cai

doi:10.11648/j.ajam.20210906.11

Published in American Journal of Applied Mathematics (Volume 9, Issue 6)

Received: 8 June 2021 Accepted: 2 December 2021 Published: 24 December 2021

Abstract

The case-cohort design is widely used in large cohort studies when it is prohibitively costly to measure some exposures for all subjects in the full cohort, especially in studies where the disease rate is low. To investigate the effect of a risk factor on different diseases, multiple case-cohort studies using the same subcohort are usually conducted. To compare the effect of a risk factor on different types of diseases, the times to the different disease events need to be modeled simultaneously. Existing case-cohort estimators for multiple disease outcomes use only the relevant covariate information from cases and subcohort controls, even though many covariates are measured for everyone in the full cohort. Intuitively, making full use of the relevant covariate information can improve efficiency. To this end, we consider a class of doubly-weighted estimators for both regular and generalized case-cohort studies with multiple disease outcomes. The asymptotic properties of the proposed estimators are derived, and our simulation studies show that a gain in efficiency can be achieved with a properly chosen weight function. We apply the proposed method to re-analyze a data set from the Atherosclerosis Risk in Communities (ARIC) study to showcase the gain in efficiency. Concluding remarks and directions for future research are also discussed.
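
As a rough illustration of the sampling scheme the abstract describes, the sketch below simulates a cohort, keeps all cases plus a random subcohort of non-cases, and attaches inverse-probability weights. It is not the doubly-weighted estimator proposed in the article; the function names, the exposure model, the disease risks, and the 10% subcohort fraction are all assumptions made up for this example.

```python
import random


def case_cohort_sample(cohort, subcohort_prob, rng):
    """Return (subject, weight) pairs: every case, plus sampled non-cases.

    Cases are always kept (weight 1). A non-case is kept only if it falls
    in the random subcohort, and then stands in for 1 / subcohort_prob
    non-cases of the full cohort.
    """
    sample = []
    for subject in cohort:
        if subject["case"]:
            sample.append((subject, 1.0))
        elif rng.random() < subcohort_prob:
            sample.append((subject, 1.0 / subcohort_prob))
    return sample


def weighted_mean(sample, key):
    """Weighted average of subject[key] over the (subject, weight) pairs."""
    total_weight = sum(w for _, w in sample)
    return sum(w * s[key] for s, w in sample) / total_weight


rng = random.Random(2021)
subcohort_prob = 0.1  # hypothetical subcohort sampling fraction

# Hypothetical cohort: a rare disease whose risk is higher when exposure > 0.
cohort = []
for _ in range(20000):
    exposure = rng.gauss(0.0, 1.0)
    risk = 0.03 if exposure > 0 else 0.01
    cohort.append({"exposure": exposure, "case": rng.random() < risk})

sample = case_cohort_sample(cohort, subcohort_prob, rng)

full_mean = sum(s["exposure"] for s in cohort) / len(cohort)
naive_mean = sum(s["exposure"] for s, _ in sample) / len(sample)
ipw_mean = weighted_mean(sample, "exposure")

print(f"subjects with exposure measured: {len(sample)} of {len(cohort)}")
print(f"full-cohort mean exposure:       {full_mean:.3f}")
print(f"unweighted case-cohort mean:     {naive_mean:.3f}")
print(f"inverse-probability weighted:    {ipw_mean:.3f}")
```

Because cases are over-represented in the sample, the unweighted mean is expected to drift toward the exposure level of the cases, while the weighted mean targets the full-cohort value. The estimators discussed in the article go further by also using covariates that are measured on everyone in the cohort.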

Page(s)	192-210
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2021. Published by Science Publishing Group

Keywords

Case-cohort Study, Multiple Disease Outcomes, Survival Analysis

Cite This Article

APA Style

Hongtao Zhang, Haibo Zhou, David Couper, Jianwen Cai. (2021). Using Full Cohort Information to Improve the Estimation Efficiency of Marginal Hazard Model for Multivariate Failure Times in Case-Cohort Studies. American Journal of Applied Mathematics, 9(6), 192-210. https://doi.org/10.11648/j.ajam.20210906.11

ACS Style

Hongtao Zhang; Haibo Zhou; David Couper; Jianwen Cai. Using Full Cohort Information to Improve the Estimation Efficiency of Marginal Hazard Model for Multivariate Failure Times in Case-Cohort Studies. Am. J. Appl. Math. 2021, 9(6), 192-210. doi: 10.11648/j.ajam.20210906.11

AMA Style

Hongtao Zhang, Haibo Zhou, David Couper, Jianwen Cai. Using Full Cohort Information to Improve the Estimation Efficiency of Marginal Hazard Model for Multivariate Failure Times in Case-Cohort Studies. Am J Appl Math. 2021;9(6):192-210. doi: 10.11648/j.ajam.20210906.11

@article{10.11648/j.ajam.20210906.11,
  author = {Hongtao Zhang and Haibo Zhou and David Couper and Jianwen Cai},
  title = {Using Full Cohort Information to Improve the Estimation Efficiency of Marginal Hazard Model for Multivariate Failure Times in Case-Cohort Studies},
  journal = {American Journal of Applied Mathematics},
  volume = {9},
  number = {6},
  pages = {192-210},
  doi = {10.11648/j.ajam.20210906.11},
  url = {https://doi.org/10.11648/j.ajam.20210906.11},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajam.20210906.11},
  abstract = {The case-cohort design is widely used in large cohort studies when it is prohibitively costly to measure some exposures for all subjects in the full cohort, especially in studies where the disease rate is low. To investigate the effect of a risk factor on different diseases, multiple case-cohort studies using the same subcohort are usually conducted. To compare the effect of a risk factor on different types of diseases, times to different disease events need to be modeled simultaneously. Existing case-cohort estimators for multiple disease outcomes utilize only the relevant covariate information in cases and subcohort controls, though many covariates are measured for everyone in the full cohort. Intuitively, making full use of the relevant covariate information can improve efficiency. To this end, we consider a class of doubly-weighted estimators for both regular and generalized case-cohort studies with multiple disease outcomes. The asymptotic properties of the proposed estimators are derived and our simulation studies show that a gain in efficiency can be achieved with a properly chosen weight function. We apply the proposed method to re-analyze a data set from Atherosclerosis Risk in Communities (ARIC) study to showcase the gain in efficiency. Concluding remarks and future researches are also discussed.},
  year = {2021}
}

TY  - JOUR
T1  - Using Full Cohort Information to Improve the Estimation Efficiency of Marginal Hazard Model for Multivariate Failure Times in Case-Cohort Studies
AU  - Hongtao Zhang
AU  - Haibo Zhou
AU  - David Couper
AU  - Jianwen Cai
Y1  - 2021/12/24
PY  - 2021
N1  - https://doi.org/10.11648/j.ajam.20210906.11
DO  - 10.11648/j.ajam.20210906.11
T2  - American Journal of Applied Mathematics
JF  - American Journal of Applied Mathematics
JO  - American Journal of Applied Mathematics
SP  - 192
EP  - 210
PB  - Science Publishing Group
SN  - 2330-006X
UR  - https://doi.org/10.11648/j.ajam.20210906.11
AB  - The case-cohort design is widely used in large cohort studies when it is prohibitively costly to measure some exposures for all subjects in the full cohort, especially in studies where the disease rate is low. To investigate the effect of a risk factor on different diseases, multiple case-cohort studies using the same subcohort are usually conducted. To compare the effect of a risk factor on different types of diseases, times to different disease events need to be modeled simultaneously. Existing case-cohort estimators for multiple disease outcomes utilize only the relevant covariate information in cases and subcohort controls, though many covariates are measured for everyone in the full cohort. Intuitively, making full use of the relevant covariate information can improve efficiency. To this end, we consider a class of doubly-weighted estimators for both regular and generalized case-cohort studies with multiple disease outcomes. The asymptotic properties of the proposed estimators are derived and our simulation studies show that a gain in efficiency can be achieved with a properly chosen weight function. We apply the proposed method to re-analyze a data set from Atherosclerosis Risk in Communities (ARIC) study to showcase the gain in efficiency. Concluding remarks and future researches are also discussed.
VL  - 9
IS  - 6
ER  -

Author Information

Hongtao Zhang
Bristol Myers Squibb, Berkeley Heights, New Jersey, USA

Haibo Zhou
Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, USA

David Couper
Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, USA

Jianwen Cai
Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, USA
