Article

Linear regression with an observation distribution model

Journal

JOURNAL OF GEODESY
Volume 95, Issue 2, Pages -

Publisher

SPRINGER
DOI: 10.1007/s00190-021-01484-x

Keywords

Regression; Least-squares; Estimation; Observation distribution; Normal equations

Funding

  1. Natural Sciences and Engineering Research Council of Canada [RGPIN-2018-03775]

Despite the complexity of the real world, linear regression is still important for estimating parameters in modeling physical relationships between variables. This paper introduces a new methodology to model the distribution of observations for linear regression in order to predict parameter precision before data collection. The proposed methodology shows good agreement between empirical and predicted precisions for both simulated and real datasets, with discrepancies mainly attributed to finite sample size.
Despite the high complexity of the real world, linear regression still plays an important role in estimating parameters to model a physical relationship between at least two variables. The precision of the estimated parameters, which can usually be considered an indicator of solution quality, is conventionally obtained from the inverse of the normal equations matrix, which requires intensive computation when the number of observations is large. In addition, the impact of the distribution of the observations on parameter precision is rarely reported in the literature. In this paper, we propose a new methodology to model the distribution of observations for linear regression in order to predict the parameter precision before the data are collected and the regression is performed. The precision analysis can be readily performed given a hypothesized data distribution. The methodology has been verified with several simulated and real datasets. The results show that the empirical and model-predicted precisions match very well, with discrepancies of up to 6% and 3.4% for simulated and real datasets, respectively. Simulations demonstrate that these differences are simply due to finite sample size. Simulation also demonstrates the relative insensitivity of the method to noise in the independent regression variables that causes deviations from the data distribution function. The proposed methodology allows straightforward prediction of the parameter precision based on the distribution of the observations related to their numerical limits and geometry, which greatly simplifies design procedures for various experimental setups commonly involved in geodetic surveying, such as LiDAR data collection.
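As a rough illustration of the idea in the abstract, the sketch below predicts the precision of a straight-line fit y = a + b*x from a hypothesized distribution of x alone and compares it with the precision obtained from the inverse of the actual normal equations matrix. It is not the paper's formulation. It assumes x is uniformly distributed on [lo, hi], that the noise on y is independent with a known standard deviation sigma, and that the normal matrix can be approximated by n times the matrix of expected moments of x. The function names and parameters are hypothetical.

```python
import math
import random


def predicted_sigmas(lo, hi, n, sigma):
    """Predicted std devs of (intercept, slope) for y = a + b*x.

    Assumption: x ~ Uniform(lo, hi), so the normal matrix A^T A is
    approximated by n * [[1, E[x]], [E[x], E[x^2]]].
    """
    m1 = (lo + hi) / 2.0                      # E[x] for a uniform distribution
    m2 = (lo * lo + lo * hi + hi * hi) / 3.0  # E[x^2] for a uniform distribution
    det = m2 - m1 * m1                        # equals Var[x] = (hi - lo)^2 / 12
    var_a = sigma ** 2 / n * m2 / det         # intercept variance
    var_b = sigma ** 2 / n / det              # slope variance
    return math.sqrt(var_a), math.sqrt(var_b)


def empirical_sigmas(xs, sigma):
    """Std devs from the inverse of the actual 2x2 normal matrix."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    det = n * sxx - sx * sx                   # determinant of [[n, sx], [sx, sxx]]
    var_a = sigma ** 2 * sxx / det
    var_b = sigma ** 2 * n / det
    return math.sqrt(var_a), math.sqrt(var_b)


if __name__ == "__main__":
    random.seed(0)
    n, sigma = 5000, 0.05
    xs = [random.uniform(0.0, 10.0) for _ in range(n)]
    print("predicted:", predicted_sigmas(0.0, 10.0, n, sigma))
    print("empirical:", empirical_sigmas(xs, sigma))
```

The prediction needs only the limits of x, the number of observations, and the noise level, so it can be evaluated before any data exist. Under these assumptions, the gap between the two results comes from the finite sample.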
