Adaptive Minimax Estimation over Sparse $\ell_q$-Hulls
Given a dictionary of $M_n$ initial estimates of the unknown true regression function, we aim to construct linearly aggregated estimators that target the best performance among all the linear combinations under a sparse $\ell_q$-norm ($0 \le q \le 1$) constraint on the linear coefficients. Besides identifying the optimal rates of aggregation for these $\ell_q$-aggregation problems, our multi-directional (or adaptive) aggregation strategies by model mixing or model selection achieve the optimal rates simultaneously over the full range of $0 \le q \le 1$ for general $M_n$ and upper bound $t_n$ of the $\ell_q$-norm. Both random and fixed designs, with known or unknown error variance, are handled, and the $\ell_q$-aggregations examined in this work cover major types of aggregation problems previously studied in the literature. Consequences on minimax-rate adaptive regression under $\ell_q$-constrained coefficients ($0 \le q \le 1$) are also provided. Our results show that the minimax rate of $\ell_q$-aggregation ($0 \le q \le 1$) is basically determined by an effective model size that depends on $q$, $t_n$, $M_n$, and the sample size $n$ in an easily interpretable way based on classical model selection theory that deals with a large number of models. In addition, in the fixed design case, the model selection approach is seen to yield the optimal rate of convergence not only in expectation but also in probability. In contrast, the model mixing approach can have leading constant one in front of the target risk in the oracle inequality while not offering optimality in probability.
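For concreteness, the following is a schematic sketch, in assumed notation, of the central object described above: the sparse $\ell_q$-hull spanned by the dictionary $f_1, \dots, f_{M_n}$ with coefficient bound $t_n$, and the kind of oracle inequality that an $\ell_q$-aggregated estimator is asked to satisfy. The symbols $\mathcal{F}_q(t_n)$, $\hat{f}$, $\|\cdot\|$, and $\mathrm{rate}_n$ are placeholders for illustration and need not match the paper's exact definitions or statements.

% Sparse \ell_q-hull of the dictionary (notation assumed, not quoted from the paper):
\[
  \mathcal{F}_q(t_n) \;=\;
  \Bigl\{\, f_\theta = \sum_{j=1}^{M_n} \theta_j f_j \;:\; \|\theta\|_q \le t_n \,\Bigr\},
  \qquad
  \|\theta\|_q =
  \begin{cases}
    \sum_{j} \mathbf{1}\{\theta_j \neq 0\}, & q = 0,\\[2pt]
    \bigl(\sum_{j} |\theta_j|^q\bigr)^{1/q}, & 0 < q \le 1.
  \end{cases}
\]
% Schematic oracle inequality for an aggregated estimator \hat{f} of the true regression
% function f: its risk should come within an additive remainder of the best risk over the hull,
\[
  \mathbb{E}\,\|\hat{f} - f\|^2
  \;\le\; C \inf_{f_\theta \in \mathcal{F}_q(t_n)} \|f_\theta - f\|^2
  \;+\; C'\,\mathrm{rate}_n(q, t_n, M_n),
\]
% where rate_n denotes the minimax rate of \ell_q-aggregation, which the abstract states is
% governed by an effective model size depending on q, t_n, M_n, and n; per the abstract, the
% model mixing strategies can achieve leading constant C = 1 in front of the target risk.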