Peer-Reviewed

Predicting Music Popularity with the Hybrid Approach: K-Means + LGBM

Received: 26 September 2022    Accepted: 10 October 2022    Published: 17 October 2022
Abstract

Global revenue from streaming, CD, and digital music sales has exceeded pre-COVID-19 levels since the outbreak. While other stocks have fallen, music-industry stocks have risen. HYBE Entertainment has even introduced integrated platform services. Furthermore, many people make music without an agency and post it on platforms such as SoundCloud. This paper asks whether music that was popular last week can be predicted to remain popular this week using the methods we outline. We obtained the dataset from Spotify, the leading music subscription service. The paper has two objectives: predicting popularity, and examining the relationship between K-means and LGBM, motivated by an earlier paper claiming that the K-means algorithm is effective on Spotify data. In our experiment, K-means was not effective on our dataset, as shown by a lower Silhouette score. However, combining K-means with LGBM achieved higher performance than using LGBM alone. Although this result is positive and could help determine whether a composer's songs will be lucrative, we acknowledge some drawbacks in our methods. For instance, we did not account for the variables introduced when fake streams are used to boost a song's placement on the real-time chart. Additionally, we did not include seasonal hits, such as Christmas-themed music. In the future, we will conduct further research on this topic to overcome these drawbacks.
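
To make the two quantities in the abstract concrete, the following is a minimal sketch, not the author's implementation. It computes the Silhouette score, s = (b - a) / max(a, b) averaged over all points, in plain Python, and shows one possible way to combine clustering with a classifier: appending each song's cluster label as an extra feature. The abstract does not say how K-means and LGBM were combined, so the cluster-label-as-feature step is an assumption, as are the toy points, the labelings, and the helper names `silhouette_score` and `add_cluster_feature`. Fitting the classifier itself (for example with LightGBM) is left out.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points.

    For each point, a is the mean distance to the other members of its
    own cluster and b is the smallest mean distance to any other cluster.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            continue  # singleton cluster contributes 0 by convention
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def add_cluster_feature(points, labels):
    """Assumed hybrid step: append the cluster label to each feature row."""
    return [list(p) + [label] for p, label in zip(points, labels)]


# Hypothetical toy data: two tight groups far apart along the first axis.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]  # clusters match the two groups
bad_labels = [0, 1, 0, 1]   # clusters cut across the groups

good = silhouette_score(points, good_labels)
bad = silhouette_score(points, bad_labels)
augmented = add_cluster_feature(points, good_labels)

print(f"well-separated clustering: {good:.3f}")
print(f"poorly matched clustering: {bad:.3f}")
print(augmented[0])
```

A score near 1 indicates compact, well-separated clusters, while a score near or below 0 indicates clusters that overlap or cut across the data. The augmented rows are what would be passed to a gradient-boosting classifier under the assumption stated above.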

Published in International Journal of Data Science and Analysis (Volume 8, Issue 5)
DOI 10.11648/j.ijdsa.20220805.15
Page(s) 149-156
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Music, Machine Learning, K-means, LGBM

References
[1] BBC News. (2022, February 23). K-pop: BTS agency Hybe grows profits by 31%. Retrieved September 26, 2022, from https://www.bbc.com/news/business-60488623
[2] K-pop’s biggest agencies see rise in profits despite pandemic. (2021, March 30). Retrieved September 26, 2022, from https://koreajoongangdaily.joins.com/2021/03/30/entertainment/kpop/SM-entertainment-JYP-entertainment-YG-entertainment/20210330160916443.html
[3] Music Streaming App Revenue and Usage Statistics (2022). Business of Apps. (2022, September 12). Retrieved September 26, 2022, from https://www.businessofapps.com/data/music-streaming-market/
[4] K-pop's biggest agencies see rise in profits despite pandemic. Korea joongAng Daily. (2021, March 30). Retrieved September 21, 2022, from https://koreajoongangdaily.joins.com/2021/03/30/entertainment/kpop/SM-entertainment-JYP-entertainment-YG-entertainment/20210330160916443.html
[5] Watkins, C. (2022, January 19). How Spotify’s user experience is helping them win the streaming wars. Medium. Retrieved September 26, 2022, from https://uxdesign.cc/ux-ui-analysis-spotify-31f3855a1740
[6] Soares Araujo, C. V., Pinheiro de Cristo, M. A., & Giusti, R. (2019, December). Predicting Music Popularity Using Music Charts. 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA). https://doi.org/10.1109/icmla.2019.00149
[7] Lee, J., & Lee, J. S. (2015). Predicting Music Popularity Patterns based on Musical Complexity and Early Stage Popularity. Proceedings of the Third Edition Workshop on Speech, Language & Audio in Multimedia - SLAM ’15. https://doi.org/10.1145/2802558.2814645
[8] Lee, J., & Lee, J. S. (2018, November). Music Popularity: Metrics, Characteristics, and Audio-Based Prediction. IEEE Transactions on Multimedia, 20 (11), 3173–3182. https://doi.org/10.1109/tmm.2018.2820903
[9] Martin-Gutierrez, D., Hernandez Penaloza, G., Belmonte-Hernandez, A., & Alvarez Garcia, F. (2020). A Multimodal End-to-End Deep Learning Architecture for Music Popularity Prediction. IEEE Access, 8, 39361–39374. https://doi.org/10.1109/access.2020.2976033
[10] Privandhani, N. A. (2022). Clustering Pop Songs Based On Spotify Data Using K-Means And K-Medoids Algorithm. Jurnal Mantik, 6 (2), 1542–1550.
[11] The Spotify Hit Predictor Dataset (1960-2019). (2020, April 26). Kaggle. Retrieved September 26, 2022, from https://www.kaggle.com/datasets/theoverman/the-spotify-hit-predictor-dataset
[12] Yuan, C., & Yang, H. (2019, June 18). Research on K-Value Selection Method of K-Means Clustering Algorithm. J, 2 (2), 226–235. https://doi.org/10.3390/j2020016
Cite This Article
  • APA Style

    Hyeonsoo Oh. (2022). Predicting Music Popularity with the Hybrid Approach: K-Means + LGBM. International Journal of Data Science and Analysis, 8(5), 149-156. https://doi.org/10.11648/j.ijdsa.20220805.15


    ACS Style

    Hyeonsoo Oh. Predicting Music Popularity with the Hybrid Approach: K-Means + LGBM. Int. J. Data Sci. Anal. 2022, 8(5), 149-156. doi: 10.11648/j.ijdsa.20220805.15


    AMA Style

    Hyeonsoo Oh. Predicting Music Popularity with the Hybrid Approach: K-Means + LGBM. Int J Data Sci Anal. 2022;8(5):149-156. doi: 10.11648/j.ijdsa.20220805.15


  • @article{10.11648/j.ijdsa.20220805.15,
      author = {Hyeonsoo Oh},
      title = {Predicting Music Popularity with the Hybrid Approach: K-Means + LGBM},
      journal = {International Journal of Data Science and Analysis},
      volume = {8},
      number = {5},
      pages = {149-156},
      doi = {10.11648/j.ijdsa.20220805.15},
      url = {https://doi.org/10.11648/j.ijdsa.20220805.15},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijdsa.20220805.15},
     year = {2022}
    }
    


  • TY  - JOUR
    T1  - Predicting Music Popularity with the Hybrid Approach: K-Means + LGBM
    AU  - Hyeonsoo Oh
    Y1  - 2022/10/17
    PY  - 2022
    N1  - https://doi.org/10.11648/j.ijdsa.20220805.15
    DO  - 10.11648/j.ijdsa.20220805.15
    T2  - International Journal of Data Science and Analysis
    JF  - International Journal of Data Science and Analysis
    JO  - International Journal of Data Science and Analysis
    SP  - 149
    EP  - 156
    PB  - Science Publishing Group
    SN  - 2575-1891
    UR  - https://doi.org/10.11648/j.ijdsa.20220805.15
    VL  - 8
    IS  - 5
    ER  - 


Author Information
  • Elective Home Education (EHE/Home-schooling), Seoul, Korea
