Joint feature screening method for ultrahigh dimensional censored data

PAN Jing, CHAI Hongfeng, SUN Quan, ZHOU Yong

Systems Engineering - Theory & Practice, 2023, Vol. 43, Issue 1: 169-190.

DOI: 10.12011/SETP2021-0500

Abstract

For ultrahigh-dimensional censored data, feature screening can first remove noise variables, after which classical statistical analysis can be applied. This paper proposes a robust partial correlation coefficient for feature screening and introduces an inverse probability weighting method to handle censoring. On this basis, a new joint feature screening method is developed. By incorporating information from the entire conditional distribution of the failure time, the method describes the relationship between the response and the covariates comprehensively. Compared with the traditional Pearson partial correlation coefficient, the proposed measure is robust to outliers, heavy-tailed distributions, and heteroscedasticity. Moreover, the joint screening method built on this measure uses projection to eliminate the interference caused by correlation among the covariates, thereby reducing false negative and false positive errors and addressing collinearity among the covariates. We establish the sure screening property of the method and give the details of the iterative algorithm. Its performance is further confirmed through comprehensive simulation studies and a real data example.
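
As an illustration of the inverse probability weighting idea mentioned above, the following is a minimal Python sketch and not the authors' implementation. It assumes the censoring time is independent of the failure time and the covariates, estimates the censoring survival function G(t) = P(C > t) by Kaplan-Meier, and gives each uncensored observation the weight 1 / G(Y-). The function names `censoring_survival_left`, `ipw_weights`, and `ipw_cdf` are hypothetical, and ties between failure and censoring times are not treated specially.

```python
import numpy as np

def censoring_survival_left(times, events):
    """Kaplan-Meier estimate of G(t-) = P(C >= t) for the censoring time C.

    `events` is 1 for an observed failure and 0 for a censored record, so
    censored records play the role of "events" here (an illustrative
    convention, not taken from the paper).
    """
    times = np.asarray(times, dtype=float)
    censored = 1 - np.asarray(events, dtype=int)
    uniq = np.unique(times)  # sorted distinct observed times
    at_risk = np.array([(times >= u).sum() for u in uniq])
    n_cens = np.array([((times == u) & (censored == 1)).sum() for u in uniq])
    surv = np.cumprod(1.0 - n_cens / at_risk)  # G(u) at each distinct time

    def g_left(t):
        # Index of the last distinct time strictly below t; -1 means none.
        idx = np.searchsorted(uniq, t, side="left") - 1
        return np.where(idx >= 0, surv[np.clip(idx, 0, None)], 1.0)

    return g_left

def ipw_weights(times, events):
    """Weights delta_i / G(Y_i-): zero for censored records."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=float)
    g = censoring_survival_left(times, events)(times)
    return events / g

def ipw_cdf(times, events, t):
    """Weighted estimate of P(T <= t) from right-censored data."""
    times = np.asarray(times, dtype=float)
    w = ipw_weights(times, events)
    return float(np.mean(w * (times <= t)))

if __name__ == "__main__":
    y = [1.0, 2.0, 3.0, 4.0]   # observed times (toy data)
    delta = [1, 0, 1, 1]       # 1 = failure observed, 0 = censored
    print(ipw_weights(y, delta))
    print(ipw_cdf(y, delta, 3.0))
```

Any screening statistic computed from indicator or rank transforms of the response could be reweighted in the same way; how the paper combines these weights with its robust partial correlation coefficient is specified in the full text, not in this sketch.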

Key words

ultrahigh dimensional censored data / feature screening / partial correlation coefficient / inverse probability weighting estimation / robustness

Cite this article

PAN Jing, CHAI Hongfeng, SUN Quan, ZHOU Yong. Joint feature screening method for ultrahigh dimensional censored data. Systems Engineering - Theory & Practice, 2023, 43(1): 169-190. https://doi.org/10.12011/SETP2021-0500

Funding

National Key Research and Development Program (2021YFA1000101, 2021YFA1000102, 2021YFA1000104); The Key Program of National Natural Science Foundation of China (71931004); National Natural Science Foundation of China (92046005); The Key Support Project of National Natural Science Foundation of China (92046024); The Integration Project of National Natural Science Foundation of China (92146002)