Intelligent manufacturing relies heavily on industrial vision, and vision algorithms are being adopted rapidly across the industry. However, industrial controllers are designed primarily for logic control with deterministic execution cycles, whereas the execution time of vision code is strongly input-dependent; this uncertainty undermines controller stability. To adjust the system's scan cycle in time and keep the system stable, an algorithm is needed that predicts how long the vision code will take to process a target image. In this paper, we analyze the weaknesses of traditional convolutional neural networks (CNNs) and propose a multi-level and multi-scale CNN model (MLMS-CNN) for vision code execution time prediction. Instead of typical convolutional layers, we design an architecture that collects multi-scale features from the input feature maps. Moreover, a hierarchical structure fuses features from different abstraction levels, reducing the loss of intermediate features. We extract image features from the input images and runtime features from the vision code blocks, then compare MLMS-CNN against six standard regression models, all trained with the extracted features as input and the measured execution times of the vision code as output. The experimental results show that our model achieves better accuracy and stability.
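The abstract describes two structural ideas: parallel convolutions at several kernel sizes (multi-scale) and fusion of features pooled from every level of the network (multi-level), with runtime features concatenated before a regression head. The sketch below is a minimal, untrained NumPy illustration of that layout, not the authors' implementation; all function names, kernel sizes, and the two-block depth are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, k):
    """'Same' single-channel 2-D convolution via zero padding."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def multi_scale_block(x, kernel_sizes=(1, 3, 5)):
    """Multi-scale: parallel convolutions at several kernel sizes, ReLU, stacked."""
    branches = [conv2d(x, rng.standard_normal((k, k)) / k) for k in kernel_sizes]
    return np.stack([np.maximum(b, 0.0) for b in branches])

def predict_time(image, runtime_feats, n_blocks=2):
    """Multi-level: pool features from every block and fuse them with the
    runtime features before a (here untrained) linear regression head."""
    level_feats = []
    x = image
    for _ in range(n_blocks):
        maps = multi_scale_block(x)
        level_feats.append(maps.mean(axis=(1, 2)))  # global average pool per scale
        x = maps.mean(axis=0)                       # merge scales for the next level
    fused = np.concatenate(level_feats + [np.asarray(runtime_feats, dtype=float)])
    w = rng.standard_normal(fused.shape[0])         # stand-in for learned weights
    return float(fused @ w)
```

In a trained model the regression head and kernels would be learned jointly from measured execution times; the point here is only how multi-scale branches and multi-level pooled features combine into one fused vector.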
Published in | Engineering and Applied Sciences (Volume 7, Issue 6) |
DOI | 10.11648/j.eas.20220706.13 |
Page(s) | 93-99 |
Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright | Copyright © The Author(s), 2022. Published by Science Publishing Group |
Keywords | Deep Learning, Performance Prediction, Vision Code |
APA Style
Fule Ji, Yanlong Xi. (2022). Vision Code Execution Time Prediction Based on Multi-level and Multi-scale CNN. Engineering and Applied Sciences, 7(6), 93-99. https://doi.org/10.11648/j.eas.20220706.13