Article

Statistical Analysis of Fixed Mini-Batch Gradient Descent Estimator

Journal

Journal of Computational and Graphical Statistics

Publisher

TAYLOR & FRANCIS INC
DOI: 10.1080/10618600.2023.2204130

Keywords

Fixed mini-batch; Gradient descent; Learning rate scheduling; Random shuffling; Stochastic gradient descent


This study presents a fixed mini-batch gradient descent (FMGD) algorithm for solving optimization problems with massive datasets. FMGD divides the sample into non-overlapping partitions that are kept fixed throughout the algorithm. By calculating the gradient on each fixed mini-batch sequentially, the computation cost per iteration is greatly reduced, which makes FMGD computationally efficient and practically feasible.
We study here a fixed mini-batch gradient descent (FMGD) algorithm to solve optimization problems with massive datasets. In FMGD, the whole sample is split into multiple non-overlapping partitions. Once the partitions are formed, they are fixed throughout the rest of the algorithm; for convenience, we refer to them as fixed mini-batches. In each iteration, the gradients are then calculated sequentially on the fixed mini-batches. Because the size of a fixed mini-batch is typically much smaller than the whole sample size, its gradient can be computed easily, which greatly reduces the cost of each iteration and makes FMGD computationally efficient and practically feasible. To demonstrate the theoretical properties of FMGD, we start with a linear regression model under a constant learning rate and study its numerical convergence and statistical efficiency. We find that a sufficiently small learning rate is required for both numerical convergence and statistical efficiency. Nevertheless, an extremely small learning rate might lead to painfully slow numerical convergence. To address this problem, a diminishing learning rate scheduling strategy can be used, which yields an FMGD estimator with faster numerical convergence and better statistical efficiency. Finally, FMGD algorithms with random shuffling and with a general loss function are also studied.
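
As a rough illustration of the procedure described in the abstract, below is a minimal sketch of FMGD for least-squares linear regression in Python with NumPy. The function name fmgd_linear_regression and its parameters (n_batches, lr0, decay, shuffle_once) are hypothetical and not taken from the paper, and the schedule lr_t = lr0 / (1 + decay * t) is just one simple way to obtain a diminishing learning rate; it is not necessarily the scheduling rule analyzed by the authors.

```python
import numpy as np

def fmgd_linear_regression(X, y, n_batches=10, lr0=0.1, decay=0.0,
                           n_epochs=100, shuffle_once=True, seed=0):
    """Fixed mini-batch gradient descent for least-squares linear regression.

    The sample is split once into `n_batches` non-overlapping partitions
    ("fixed mini-batches"); the same partitions are reused in every epoch.
    The learning rate follows lr_t = lr0 / (1 + decay * t), so decay = 0
    gives a constant learning rate and decay > 0 a diminishing schedule.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    idx = rng.permutation(n) if shuffle_once else np.arange(n)
    batches = np.array_split(idx, n_batches)          # fixed for all epochs

    beta = np.zeros(p)
    t = 0
    for _ in range(n_epochs):
        for b in batches:                             # sweep the fixed mini-batches
            lr = lr0 / (1.0 + decay * t)
            Xb, yb = X[b], y[b]
            grad = Xb.T @ (Xb @ beta - yb) / len(b)   # mini-batch least-squares gradient
            beta -= lr * grad
            t += 1
    return beta

if __name__ == "__main__":
    # Simulated example: n = 10,000 observations, p = 5 covariates.
    rng = np.random.default_rng(1)
    n, p = 10_000, 5
    X = rng.standard_normal((n, p))
    beta_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
    y = X @ beta_true + rng.standard_normal(n)

    beta_const = fmgd_linear_regression(X, y, lr0=0.05, decay=0.0)
    beta_decay = fmgd_linear_regression(X, y, lr0=0.05, decay=1e-3)
    print("constant lr :", np.round(beta_const, 3))
    print("diminishing :", np.round(beta_decay, 3))
```

In this sketch, decay=0 corresponds to the constant-learning-rate case discussed first in the abstract, while decay > 0 gives a diminishing schedule; shuffle_once only randomizes how the sample is partitioned once, after which the mini-batches stay fixed across all epochs.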

