Abstract: Convolutional Neural Networks (CNNs) are widely used for object detection in computer vision because of their simplicity and efficiency. The effectiveness of CNN-based object detection depends significantly on the choice of loss function, and localization precision is a critical determinant. To improve localization accuracy, we modify the CIoU loss function, yielding a new loss function called Area-CIoU (ACIoU). ACIoU accounts for the alignment between predicted and ground-truth bounding boxes by combining the relationship between the aspect ratio and the area of both boxes. When the two boxes have the same aspect ratio, the loss still accounts for how the predicted box affects localization accuracy. This strengthens the penalty function and improves the network model's localization precision. Experiments on a custom vehicle dataset covering the classes car, person, motorcycle, truck, and bus, using the one-stage object detector YOLOv4, confirm that ACIoU improves the localization accuracy of network models. The experiments also show that the accuracy gain comes with a drop in frames per second (FPS), caused by the composition of the new penalty term in the loss function. We achieved an AP of 88.48% and an average recall rate of 86.37% at 41 FPS.
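
For context, the following is a minimal sketch of the standard CIoU loss that ACIoU builds on. It is not the ACIoU formulation, which the abstract does not specify. The function name `ciou_terms`, the corner-format boxes `(x1, y1, x2, y2)`, and the example coordinates are assumptions made for illustration. The sketch shows the case the abstract singles out: when two boxes share an aspect ratio, the CIoU aspect-ratio term `v` is zero even though their areas differ.

```python
import math


def ciou_terms(pred, gt, eps=1e-9):
    """Standard CIoU for two boxes in (x1, y1, x2, y2) format.

    Returns (iou, center_penalty, v, loss). Assumes both boxes have
    positive width and height. Illustrative only, not the ACIoU loss.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the smallest box enclosing both.
    dx = (px1 + px2) / 2 - (gx1 + gx2) / 2
    dy = (py1 + py2) / 2 - (gy1 + gy2) / 2
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    center_penalty = (dx ** 2 + dy ** 2) / (cw ** 2 + ch ** 2 + eps)

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(pw / ph)) ** 2
    alpha = v / ((1 - iou) + v + eps)

    loss = 1 - iou + center_penalty + alpha * v
    return iou, center_penalty, v, loss


# Same center, same 2:1 aspect ratio, but the prediction has a quarter
# of the ground-truth area (hypothetical coordinates).
gt_box = (0.0, 0.0, 4.0, 2.0)
pred_box = (1.0, 0.5, 3.0, 1.5)
iou, center_penalty, v, loss = ciou_terms(pred_box, gt_box)
print(f"IoU={iou:.3f} center={center_penalty:.3f} v={v:.3f} loss={loss:.3f}")
```

In this example the center and aspect-ratio penalties both vanish, so the CIoU loss reduces to `1 - IoU` and the size mismatch is penalized only through the IoU term. An area-aware penalty, as the abstract describes, is intended to add signal in this situation.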