Modern machine learning, fueled by large datasets and complex models, faces a critical tension: the statistical principles underpinning learning (generalization, efficiency, robustness) often clash with the computational realities of optimization, especially in resource-constrained environments or when data exhibits inherent geometric structure. This work addresses the theme "Statistics Meets Optimization" with an optimization framework explicitly designed to leverage statistical properties of the data, particularly the group invariances and equivariances common in real-world data (e.g., spatial rotations in satellite imagery, temporal shifts in sensor data), to achieve significant gains in sample efficiency and convergence speed. We theoretically derive generalization bounds linking the exploitation of data geometry to reduced sample complexity. Empirically, we demonstrate the efficacy of our method on a challenging real-world case study: predicting crop yield anomalies in Delta State, Nigeria, from limited, noisy, and spatially heterogeneous satellite and meteorological data. Our optimizer achieved strong predictive performance with 40% less data than adaptive baselines (Adam, RMSProp), highlighting the practical impact of statistically informed optimization, especially for regions facing data scarcity. This work provides a concrete bridge between statistical theory (data structure, efficiency) and optimization practice (algorithm design, scalability), demonstrating that geometry-aware algorithms can democratize effective ML for resource-limited applications.
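The page does not spell out GeoAda's update rule, so the following is a minimal, hypothetical sketch of one standard way an optimizer can exploit a known group invariance: averaging the loss gradient over the orbit of each batch under a finite symmetry group (here, the four 90° rotations of an image patch, a plausible symmetry for overhead satellite imagery). When labels are invariant under the group, each labeled sample effectively contributes one gradient term per group element, which is the intuition behind the reduced sample complexity claimed above. The helper names (`group_orbit`, `geo_sgd_step`) and the toy model are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: a geometry-aware SGD step that averages the gradient
# over a finite symmetry group G (the four 90-degree rotations, C4).
# This is NOT the paper's GeoAda algorithm -- only an illustration of the
# general group-averaging mechanism.
import torch

def group_orbit(x):
    """Orbit of a batch of image patches (N, C, H, W) under C4 rotations."""
    return [torch.rot90(x, k, dims=(-2, -1)) for k in range(4)]

def geo_sgd_step(model, loss_fn, x, y, lr=1e-2):
    """One SGD step on the group-averaged loss (1/|G|) * sum_g L(f(g.x), y)."""
    model.zero_grad()
    loss = sum(loss_fn(model(gx), y) for gx in group_orbit(x)) / 4.0
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            p -= lr * p.grad  # plain SGD; an adaptive (Adam-style) step would also work
    return loss.item()

# Toy usage: 16x16 single-band "satellite" patches -> scalar yield anomaly.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(16 * 16, 1))
x, y = torch.randn(8, 1, 16, 16), torch.randn(8, 1)
print(geo_sgd_step(model, torch.nn.functional.mse_loss, x, y))
```

Averaging over the orbit is equivalent to training on group-augmented data; equivariant architectures achieve the same effect structurally, by constraining the hypothesis class rather than the update rule.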
| Published in | International Journal of Systems Science and Applied Mathematics (Volume 10, Issue 3) |
| DOI | 10.11648/j.ijssam.20251003.12 |
| Page(s) | 46-57 |
| Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
| Copyright | Copyright © The Author(s), 2025. Published by Science Publishing Group |
| Keywords | Geometry-Aware Optimization, Statistical Learning, GeoAda, Crop Yield Prediction, Data Scarcity, Resource-constrained Machine Learning, Nigeria |
APA Style
Joseph, O. A., Godspower, O. O., Olawale, A. A., Uyovwieyovwe, E. S., Gabriel, E. F., et al. (2025). Statistically Aware Optimization for Resource-constrained and Geometrically-rich Data: Nigerian Agricultural Case Study. International Journal of Systems Science and Applied Mathematics, 10(3), 46-57. https://doi.org/10.11648/j.ijssam.20251003.12
ACS Style
Joseph, O. A.; Godspower, O. O.; Olawale, A. A.; Uyovwieyovwe, E. S.; Gabriel, E. F., et al. Statistically Aware Optimization for Resource-constrained and Geometrically-rich Data: Nigerian Agricultural Case Study. Int. J. Syst. Sci. Appl. Math. 2025, 10(3), 46-57. doi: 10.11648/j.ijssam.20251003.12
@article{10.11648/j.ijssam.20251003.12,
author = {Ogethakpo Arhonefe Joseph and Ovbije Oghenekevwe Godspower and Adelakun Abdul Olawale and Ejakpovi Simeon Uyovwieyovwe and Emunefe Friday Gabriel and Akworigbe Hope Avwerosuoghene and Akporavware Oforhoraye Sunday},
title = {Statistically Aware Optimization for Resource-constrained and Geometrically-rich Data: Nigerian Agricultural Case Study},
journal = {International Journal of Systems Science and Applied Mathematics},
volume = {10},
number = {3},
pages = {46-57},
doi = {10.11648/j.ijssam.20251003.12},
url = {https://doi.org/10.11648/j.ijssam.20251003.12},
eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijssam.20251003.12},
year = {2025}
}
TY  - JOUR
T1  - Statistically Aware Optimization for Resource-constrained and Geometrically-rich Data: Nigerian Agricultural Case Study
AU  - Ogethakpo Arhonefe Joseph
AU  - Ovbije Oghenekevwe Godspower
AU  - Adelakun Abdul Olawale
AU  - Ejakpovi Simeon Uyovwieyovwe
AU  - Emunefe Friday Gabriel
AU  - Akworigbe Hope Avwerosuoghene
AU  - Akporavware Oforhoraye Sunday
Y1  - 2025/11/26
PY  - 2025
N1  - https://doi.org/10.11648/j.ijssam.20251003.12
DO  - 10.11648/j.ijssam.20251003.12
T2  - International Journal of Systems Science and Applied Mathematics
JF  - International Journal of Systems Science and Applied Mathematics
JO  - International Journal of Systems Science and Applied Mathematics
SP  - 46
EP  - 57
PB  - Science Publishing Group
SN  - 2575-5803
UR  - https://doi.org/10.11648/j.ijssam.20251003.12
VL  - 10
IS  - 3
ER  -