Convolutional Neural Networks (CNNs) are widely used for object detection in computer vision due to their simplicity and efficiency. The effectiveness of CNN-based object detection depends significantly on the choice of loss function, with localization precision being a critical determinant. To improve localization accuracy, we modified the CIoU loss function, producing a new loss function called Area-CIoU (ACIoU). ACIoU takes a comprehensive approach to aligning predicted and ground-truth bounding boxes by combining the relationship between their aspect ratios and their areas. When both boxes share the same aspect ratio, the area of the prediction box is also taken into account, since it still affects localization accuracy. This strengthens the penalty term and improves the network model's localization precision. Experimental results on a custom dataset covering the classes car, person, motorcycle, truck, and bus confirm the efficacy of ACIoU in improving the localization accuracy of network models, demonstrated through its application in the one-stage object detector YOLOv4. The experiments also show that while accuracy improved, FPS dropped because of the additional penalty term in the loss function. We achieved an AP of 88.48% and an average recall of 86.37% at 41 frames per second.
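The abstract does not give the closed-form ACIoU penalty. As a rough sketch, the snippet below implements the standard CIoU loss (IoU term, normalized center distance, and aspect-ratio consistency term from Zheng et al.'s Distance-IoU work) and adds a hypothetical area-mismatch term to illustrate where an area penalty could enter; the `area_term` and its `(1 - v)` weighting are assumptions for illustration, not the paper's formula.

```python
import math

def ciou_loss(pred, gt):
    """CIoU loss for axis-aligned boxes given as (x1, y1, x2, y2),
    with a hypothetical area penalty in the spirit of ACIoU."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # Intersection over Union
    ix1, iy1 = max(px1, gx1), max(py1, gy1)
    ix2, iy2 = min(px2, gx2), min(py2, gy2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (px2 - px1) * (py2 - py1)
    area_g = (gx2 - gx1) * (gy2 - gy1)
    iou = inter / (area_p + area_g - inter)

    # Squared center distance rho^2 over squared diagonal c^2 of the
    # smallest box enclosing both pred and gt
    rho2 = ((px1 + px2 - gx1 - gx2) ** 2
            + (py1 + py2 - gy1 - gy2) ** 2) / 4.0
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw * cw + ch * ch

    # Aspect-ratio consistency term v and its trade-off weight alpha
    v = (4.0 / math.pi ** 2) * (
        math.atan((gx2 - gx1) / (gy2 - gy1))
        - math.atan((px2 - px1) / (py2 - py1))) ** 2
    denom = 1.0 - iou + v
    alpha = v / denom if denom > 0.0 else 0.0

    # Hypothetical area term (assumption): when aspect ratios match
    # (v near 0), still penalize area mismatch between the two boxes
    area_term = abs(area_p - area_g) / max(area_p, area_g)

    return 1.0 - iou + rho2 / c2 + alpha * v + (1.0 - v) * area_term
```

For identical boxes every term vanishes and the loss is 0; for disjoint boxes the IoU term saturates at 1 and the center-distance and area terms keep supplying a gradient, which is the property that motivates CIoU-style losses over plain IoU loss.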
Published in | Mathematics and Computer Science (Volume 8, Issue 5)
Published | 28 December 2023
DOI | 10.11648/j.mcs.20230805.11
Page(s) | 104-111
Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright | Copyright © The Author(s), 2023. Published by Science Publishing Group
Keywords | Object Detection, Loss Function, Real-Time, YOLOv4
APA Style
Saleem, M., Sheikh, N., Rehman, A., Rafiq, M., & Jahan, S. (2023). Real-Time Object Identification Through Convolution Neural Network Based on YOLO Algorithm. Mathematics and Computer Science, 8(5), 104-111. https://doi.org/10.11648/j.mcs.20230805.11