Least squares after model selection in high-dimensional sparse models. We consider high-dimensional binary classification by sparse logistic regression. Asymptotic analysis of high-dimensional LAD regression with lasso, by Xiaoli Gao and Jian Huang (Oakland University and University of Iowa). In this article, we deal with sparse high-dimensional multivariate regression models. Sequential model averaging for high-dimensional linear.
Finally, the ultrahigh-dimensional assumption includes the high-dimensional setting, say \(p = O(n^b)\) for some \(b > 0\), as a special case. Much insight can be gained from this work for understanding high-dimensional or sparse regression, and it comes as no surprise that Donoho and Johnstone made the first contributions on this topic in the early nineties. In such models, the overall number of regressors p is very large, possibly much larger than the sample size n. This paper considers the task of building efficient regression models for sparse multivariate analysis of high-dimensional data sets; in particular, it focuses on cases where the numbers q of responses y y k,1. In high-dimensional statistical modeling, it is a fundamental problem to identify important explanatory variables. Bayesian models for sparse regression analysis of high. We consider variable selection in high-dimensional sparse multiresponse linear regression models, in which a q-dimensional response vector has a linear relationship with a p-dimensional covariate vector through a sparse coefficient matrix \(B \in \mathbb{R}^{p \times q}\). We study the asymptotic properties of the adaptive lasso estimators in sparse, high-dimensional, linear regression models when the number of covariates may increase with the sample size. Adaptive lasso for sparse high-dimensional regression. Asymptotic properties of bridge estimators in sparse high-dimensional regression models, by Jian Huang (University of Iowa), Joel L. Horowitz (Northwestern University), and Shuangge Ma (Yale University). High-dimensional sparse econometric models, 2010, Advances in Economics and Econometrics, 10th World Congress.
Partial correlation estimation by joint sparse regression models. Boosting methods for variable selection in high dimensional. We show that, if a reasonable initial estimator is available, under ap. Hence, SMA is directly applicable to the high-dimensional model. Such an assumption is crucial in ensuring the identifiability of the true underlying sparse model especially. We consider variable selection using the adaptive lasso, where the L1 norms in the penalty are reweighted by data-dependent weights. It is observed that our approach has better prediction performance for highly sparse high-dimensional linear regression models. Inference for high-dimensional sparse econometric models. High-dimensional time series, stochastic regression, vector autoregression. A variable screening procedure via correlation learning was proposed by Fan and Lv (2008) to reduce dimensionality in sparse ultrahigh-dimensional models. Regularized estimation in sparse high-dimensional time series models. Estimation of regression functions via penalization and selection.
To address this issue, we further extend the correlation learning to marginal nonparametric learning. The HDS regression model has a large number of regressors p, possibly much larger than the sample size n, but only a relatively small number s high. Minjing Tao, Asymptotic properties of bridge estimators. We consider linear, high-dimensional sparse (HDS) regression models in econometrics. Sparse high-dimensional regression. AMS 2000 subject classi. The limits of dimensionality that regularization methods can handle, the role of penalty functions, and their statistical properties are detailed. Penalized regression, high-dimensional data, variable selection, asymptotic normality, oracle property. Adaptive lasso for sparse high-dimensional regression models: of appropriate dimension with all components zero. We consider variable selection using the adaptive lasso, where the L1 norms in the penalty are reweighted by data-dependent weights.
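The reweighting idea behind the adaptive lasso can be sketched in a few lines. The following is a minimal illustration, not any paper's implementation: it assumes a plain lasso fit as the initial estimator, weights of the form 1/|initial coefficient|, and a simple coordinate-descent solver. The function names `weighted_lasso` and `adaptive_lasso` and the tuning values are hypothetical choices made for this example.

```python
import numpy as np


def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Coordinate descent for (1/(2n))||y - Xb||^2 + lam * sum_j w_j |b_j|.

    Assumes lam > 0; a weight of inf forces the coefficient to zero.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # (1/n) * ||x_j||^2 per column
    resid = y.astype(float).copy()         # residual y - X @ beta at beta = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            thresh = lam * weights[j]
            # Soft-thresholding update for coordinate j.
            new = np.sign(rho) * max(abs(rho) - thresh, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def adaptive_lasso(X, y, lam_init, lam_adapt):
    """Two-step fit: plain lasso first, then a lasso with data-dependent weights."""
    p = X.shape[1]
    init = weighted_lasso(X, y, lam_init, np.ones(p))   # initial estimator (assumption)
    with np.errstate(divide="ignore"):
        weights = 1.0 / np.abs(init)                    # inf where init is zero
    return init, weighted_lasso(X, y, lam_adapt, weights)


# Small synthetic example with three nonzero coefficients.
rng = np.random.default_rng(0)
n, p = 80, 40
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.5 * rng.standard_normal(n)

init, beta_hat = adaptive_lasso(X, y, lam_init=0.1, lam_adapt=0.05)
print("selected variables:", np.flatnonzero(beta_hat))
```

Because variables dropped by the initial fit receive an infinite weight, the second step can only select among variables the first step kept. Other initial estimators (for example marginal regression) would fit the same template.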
It generates a sequence of solutions iteratively, based on support detection using primal and dual information and root. Asymptotic properties of bridge estimators in sparse high-dimensional regression models, by Jian Huang, Joel Horowitz, and Shuangge Ma (presentation). Sparse model identification and learning for ultrahigh. This work proposes new inference methods for the estimation of a regression coe. We present a new class of models for high-dimensional nonparametric regression and classi. Scaled sparse linear regression jointly estimates the regression coefficients and noise level in a linear model. Our proposal estimates the central subspace directly and performs variable selection simultaneously. L1-penalized quantile regression in high dimensional. Even when the true model is linear, the marginal regression can be highly nonlinear.
For linear regression models, many penalization methods have been proposed to conduct variable selection and estimation, and much e. We show that, if a reasonable initial estimator is available, then under appropriate. High-dimensional sparse models (HDSM): models, motivating examples. Noise accumulation is a common phenomenon in high-dimensional prediction. We provide a novel and, to the best of our knowledge, the first algorithm for high-dimensional sparse regression with corruptions in explanatory and/or response variables.
In a standard linear model, we have at our disposal \((x_i, y_i)\) supposed to be linked with. The underlying model is the same as in equation (1), but we impose a sparsity constraint on the index set J. Generalized ridge regression estimator in high dimensional. This paper focuses on simultaneous sparse model identification and learning for ultrahigh-dimensional APLMs, which strike a delicate balance between the simplicity of the standard linear regression models and the flexibility of the additive regression models. The lasso is an attractive approach to variable selection in sparse, high-dimensional regression models. Nonparametric independence screening in sparse ultrahigh. Nonasymptotic analysis of semiparametric regression models with high-dimensional parametric coefficients, Zhu, Ying, Annals of Statistics, 2017.
Asymptotic properties of bridge estimators in sparse high-dimensional regression models, by Jian Huang, Joel L. Horowitz, and Shuangge Ma (University of Iowa, Northwestern University, and Yale University). We study the asymptotic properties of bridge estimators in sparse, high-dimensional, linear regression models when the number of covariates may. We are particularly interested in the use of bridge estimators to distinguish between covariates whose coefficients are zero and covariates whose coefficients. Zhao and Yu [27] have shown that the lasso is variable-selection consistent for nonrandom high-dimensional regressors under an irrepresentable condition (IC) on the sample covariance matrix and regression coef. Adaptive lasso for sparse high-dimensional regression models. Variable selection in high-dimensional sparse multiresponse. The main assumption is that the p-dimensional parameter vector is sparse with many components being exactly zero or negligibly small, and each nonzero component stands for the contribution of an important predictor. In this paper, we propose a convex formulation for sparse sliced inverse regression in the high-dimensional setting by adapting techniques from sparse canonical correlation analysis (Vu et al. Partial correlation estimation by joint sparse regression models, by Jie Peng, Pei Wang, Nengfeng Zhou, and Ji Zhu. In this article, we propose a computationally efficient approach, space (sparse partial correlation estimation), for selecting nonzero partial correlations under the. We consider variable selection using the adaptive lasso, where. Lasso-type sparse regression and high-dimensional Gaussian graphical models, by Xiaohui Chen, M. High-dimensional classification by sparse logistic regression.
Big data, lecture 2: high-dimensional regression with the lasso. High-dimensional structured quantile regression, by Vidyashankar Sivakumar and Arindam Banerjee. Quantile regression aims at modeling the conditional median and quantiles of a response variable given certain predictor variables. Estimation and inference with outline for econometric theory of big data, part I. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the nonasymptotic bounds for the resulting misclassification excess risk. Robust inference in high-dimensional approximately sparse. The HDS regression model allows for a large number of regressors, p, which is possibly much larger. We consider high-dimensional models where the number of. These methods typically model the data using the sum of a small number of univariate or very low-dimensional functions.
In this work we consider the problem of linear quantile regression in high dimensions where the num. Massachusetts Institute of Technology, Department of Economics, Working Paper Series: penalized quantile regression in high. Bayesian adaptive elastic-net for high-dimensional sparse quantile regression models. Partial correlation estimation by joint sparse regression models, by Jie Peng, Pei Wang, Nengfeng Zhou, and Ji Zhu. In this article, we propose a computationally efficient approach, space (sparse partial correlation estimation), for selecting nonzero partial correlations under the high-dimension-low-sample-size setting. Scaled sparse linear regression, Biometrika, Oxford Academic. Estimation of regression functions via penalization and selection (the framework, two examples). We study the asymptotic properties of bridge estimators in sparse, high-dimensional, linear regression models when the number of covariates may increase to infinity with the sample size.
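For orientation, the bridge estimator referred to in these snippets is usually written as a penalized least-squares problem. The display below uses the standard textbook form; the symbols \(\lambda_n\), \(\gamma\), and \(p_n\) are notation assumed here rather than taken from the text above.

```latex
\hat{\beta}_n
  = \arg\min_{\beta \in \mathbb{R}^{p_n}}
    \sum_{i=1}^{n} \bigl( y_i - x_i^{\top} \beta \bigr)^2
    + \lambda_n \sum_{j=1}^{p_n} \lvert \beta_j \rvert^{\gamma},
  \qquad \gamma > 0 .
```

Setting \(\gamma = 1\) gives the lasso and \(\gamma = 2\) gives ridge regression; the range \(0 < \gamma < 1\) is the one typically studied when the goal is to separate zero from nonzero coefficients.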
Recent developments in theory, methods, and implementations in penalized least squares and penalized likelihood methods are highlighted. It chooses an equilibrium with a sparse regression method by iteratively estimating the noise level via the mean residual square and scaling the penalty in proportion to the estimated noise level. We propose a consistent procedure for the purpose of identifying the nonzeros in B. Although it is well known in regression problems, explicit theoretical quanti. The models distinguish themselves from ordinary multivariate regression models in two aspects. High-dimensional sparse models arise in situations where many regressors or series terms are available and the regression function is well approximated by a parsimonious, yet unknown set of regressors. With such data, the dimension of the covariate vector can be much larger than the sample size. For example, in a linear regression model with noise variance.
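The alternating scheme described for scaled sparse linear regression (estimate the noise level from the mean residual square, then rescale the penalty in proportion to it) can be sketched as follows. This is a rough illustration under stated assumptions: the inner lasso is solved by a simple coordinate descent, the base penalty `lam0 = sqrt(2 log(p) / n)` is one common choice rather than a prescribed one, and the names `lasso_cd` and `scaled_lasso` are hypothetical.

```python
import numpy as np


def lasso_cd(X, y, lam, beta0, n_iter=100):
    """Coordinate descent for (1/(2n))||y - Xb||^2 + lam * ||b||_1, warm-started."""
    n, p = X.shape
    beta = beta0.copy()
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def scaled_lasso(X, y, lam0=None, max_iter=50, tol=1e-6):
    """Alternate between a noise-level update and a lasso with penalty lam0 * sigma."""
    n, p = X.shape
    if lam0 is None:
        lam0 = np.sqrt(2.0 * np.log(p) / n)      # assumed default, not prescribed
    beta = np.zeros(p)
    sigma = np.linalg.norm(y) / np.sqrt(n)       # noise estimate at beta = 0
    for _ in range(max_iter):
        if sigma == 0.0:                         # perfect fit: nothing left to scale
            break
        beta = lasso_cd(X, y, lam0 * sigma, beta)
        # Root of the mean residual square is the new noise-level estimate.
        new_sigma = np.linalg.norm(y - X @ beta) / np.sqrt(n)
        converged = abs(new_sigma - sigma) < tol
        sigma = new_sigma
        if converged:
            break
    return beta, sigma


# Synthetic example: three active variables, unit noise.
rng = np.random.default_rng(1)
n, p = 100, 30
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + rng.standard_normal(n)

beta_hat, sigma_hat = scaled_lasso(X, y)
print("estimated noise level:", sigma_hat)
```

The design point is that the penalty level never has to be tuned against an unknown noise variance: only the scale-free constant `lam0` is chosen in advance.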
Sparse high-dimensional models in economics, Princeton ORFE. Journal of the Royal Statistical Society, Series B, Statistical Methodology. The performance of our new estimators is compared with commonly used estimators in terms of predictive accuracy and errors in variable selection. Envelope models for parsimonious and efficient multivariate linear regression. Sparse modeling has been widely used to deal with high dimensionality. Our motivation comes from studies that try to correlate a certain phenotype with high-dimensional genomic data. Horowitz (2), and Shuangge Ma (3); (1) Department of Statistics and Actuarial Science, University of Iowa; (2) Department of Economics, Northwestern University; (3) Department of Biostatistics, University of Washington. March 2006, The University of Iowa, Department of Statistics. These are applied in sequence to accommodate high dimensionality and sparseness and facilitate managerial interpretation.
Earlier work first analyzed high-dimensional sparse regression with arbitrary corruptions in covariates. They show that, by replacing the standard inner product in matching pursuit with a trimmed version, one can recover from an.
A two-stage sequential conditional selection approach to. For instance, in GWAS, our primary problem of interest is to. In this paper, we consider quantile regression in high-dimensional sparse models (HDSMs). A variable screening procedure via correlation learning was proposed by Fan and Lv (2008) to reduce dimensionality in sparse ultrahigh-dimensional models. A road to classification in high-dimensional space. Generalized ridge regression estimator in high-dimensional sparse regression models, August 2018. In this paper we investigate sparse additive models (SpAMs), which extend the advantages of sparse linear models to the additive nonparametric setting. Our methods combine ideas from sparse linear modeling and additive nonparametric regression. This article is about estimation and inference methods for high-dimensional sparse (HDS) regression models in econometrics.
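The correlation-learning screening step attributed to Fan and Lv (2008) above can be illustrated with a short sketch: rank the covariates by absolute marginal correlation with the response and keep the top few. The function name `correlation_screen` is hypothetical, and the default cutoff of n / log(n) retained variables is one commonly suggested choice that is assumed here, not something stated in the text.

```python
import numpy as np


def correlation_screen(X, y, d=None):
    """Return indices of the d covariates with largest |marginal correlation| with y.

    Assumes no column of X is constant (otherwise its standard deviation is zero).
    """
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))                 # assumed default cutoff
    d = min(d, p)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each column
    ys = (y - y.mean()) / y.std()              # standardize the response
    omega = np.abs(Xs.T @ ys) / n              # absolute marginal correlations
    keep = np.argsort(omega)[::-1][:d]         # indices of the d largest
    return np.sort(keep)


# Ultrahigh-dimensional toy example: p is ten times n.
rng = np.random.default_rng(2)
n, p = 50, 500
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 7] - 2.0 * X[:, 42] + 0.5 * rng.standard_normal(n)

kept = correlation_screen(X, y)
print("number kept:", kept.size)
print("kept indices:", kept)
```

A sparse regression method such as the lasso would then typically be run on the retained columns only, which is what makes the screening step useful when p is far larger than n.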