Journal
TEST
Volume 30, Issue 4, Pages 861-883
Publisher
SPRINGER
DOI: 10.1007/s11749-020-00749-5
Keywords
Bandwidth selection; Big data; Kernel density estimation; Large sample size; Sampling bias
Funding
- MINECO through the European Regional Development Fund (ERDF) [MTM2017-82724-R]
- Xunta de Galicia (Grupos de Referencia Competitiva) through the European Regional Development Fund (ERDF) [ED431C-2016-015, ED431C-2020-14]
- Xunta de Galicia (Centro Singular de Investigacion de Galicia) through the European Regional Development Fund (ERDF) [ED431G/01]
- Xunta de Galicia (Centro de Investigacion del Sistema Universitario de Galicia) through the European Regional Development Fund (ERDF) [ED431G 2019/01]
- European Regional Development Fund (ERDF)
This paper investigates nonparametric estimation for a large-sized sample subject to sampling bias, proposing a new method that integrates kernel density estimation and outperforms classical methods in mean estimation. Simulation results show that the new method performs well with suitable choices of smoothing parameters, and illustrate how these parameters influence the final estimator.
Nonparametric estimation for a large-sized sample subject to sampling bias is studied in this paper. The general parameter considered is the mean of a transformation of the random variable of interest. When the biasing weight function is unknown, a small-sized simple random sample of the real population is assumed to be additionally observed. A new nonparametric estimator that incorporates kernel density estimation is proposed. Asymptotic properties for this estimator are obtained under suitable limit conditions on the small and the large sample sizes and standard and non-standard asymptotic conditions on the two bandwidths. Explicit formulas are shown for the particular case of mean estimation. Simulation results show that the new mean estimator outperforms two classical ones for suitable choices of the two smoothing parameters involved. The influence of the two smoothing parameters on the performance of the final estimator is also studied, exhibiting a striking limit behavior of their optimal values. The new method is applied to a real data set from the Telco company Vodafone ES, where a bootstrap algorithm is used to select the smoothing parameter.
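The abstract does not state the estimator's formula, so the following is only an illustrative sketch of one way a large biased sample and a small simple random sample might be combined through two kernel density estimates. It assumes that the weight function is estimated as the ratio of the two density estimates and that the mean is then a reweighted average over the large sample. The function names, the Gaussian kernel, the rule-of-thumb bandwidths, and the synthetic data are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def gaussian_kde_at(points, sample, bandwidth, chunk=512):
    """Gaussian kernel density estimate built from `sample`, evaluated at `points`."""
    points = np.asarray(points, dtype=float)
    sample = np.asarray(sample, dtype=float)
    out = np.empty(points.shape[0])
    norm = sample.shape[0] * bandwidth * np.sqrt(2.0 * np.pi)
    # Process evaluation points in chunks to keep memory use modest.
    for start in range(0, points.shape[0], chunk):
        z = (points[start:start + chunk, None] - sample[None, :]) / bandwidth
        out[start:start + chunk] = np.exp(-0.5 * z ** 2).sum(axis=1) / norm
    return out


def debiased_mean(biased, srs, h, b, u=lambda y: y):
    """Reweighted estimate of E[u(X)] from a biased sample and a small SRS.

    Assumed form (hypothetical, not quoted from the paper):
    w_hat = g_hat_b / f_hat_h, and the estimate is the average of
    u(Y_i) / w_hat(Y_i) over the large biased sample.
    """
    biased = np.asarray(biased, dtype=float)
    g_hat = gaussian_kde_at(biased, biased, b)  # biased density, bandwidth b
    f_hat = gaussian_kde_at(biased, srs, h)     # target density, bandwidth h
    return float(np.mean(u(biased) * f_hat / g_hat))


# Synthetic illustration: X ~ N(0, 1) with weight proportional to exp(x),
# so the biased observations follow N(1, 1). The true mean of X is 0.
rng = np.random.default_rng(0)
biased_sample = rng.normal(loc=1.0, scale=1.0, size=3000)  # large, biased
small_srs = rng.normal(loc=0.0, scale=1.0, size=200)       # small, unbiased

# Rule-of-thumb bandwidths, chosen here only for convenience.
b = 1.06 * biased_sample.std() * biased_sample.size ** (-0.2)
h = 1.06 * small_srs.std() * small_srs.size ** (-0.2)

naive = float(biased_sample.mean())
estimate = debiased_mean(biased_sample, small_srs, h, b)
print(f"naive mean of biased sample: {naive:.3f}")
print(f"reweighted estimate:         {estimate:.3f}")
```

In this toy setting the naive mean of the biased sample sits near 1, while the reweighted estimate should land much closer to the true value 0. The bandwidths here follow a standard rule of thumb purely for simplicity; the abstract indicates that the choice of the two smoothing parameters matters considerably and that a bootstrap algorithm is used for selection in the real-data application.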