Research Article

Classification of Fake, Bot, and Real Accounts on Instagram Using Machine Learning

Year 2024, Volume: 27 Issue: 2, 479 - 488, 27.03.2024

Abstract

Instagram is a social media platform that allows users to share content such as photos and videos. Fake and bot accounts are a significant obstacle to trustworthy social networking. Because such accounts are used to inflate follower counts, shape perception through misinformation, and deceive people, detecting them plays an essential role in creating a secure social network. Fake account detection helps protect people from misinformation and malicious profiles on Instagram, keep customers' accounts safe, and prevent fraud. With this motivation, we aim to classify Instagram user profiles as fake, bot, or real using classification algorithms. Additionally, we present a publicly available dataset for detecting fake, bot, and real accounts on Instagram. For data collection, real accounts were selected from our circle of friends, fake accounts were identified by manually scanning Instagram, and bot accounts were obtained by purchasing them from bot account websites and mobile applications. The features of these accounts were collected via web scraping. We use seven classifiers to train models for fake, bot, and real profile detection. Our results show that Random Forest gives the highest prediction accuracy, 90.2%.
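The prediction accuracy reported above is the share of accounts assigned to the correct class among the three classes. The following is a minimal sketch of that computation, not the authors' evaluation code: the function names and the ten example labels are hypothetical and serve only to illustrate the metric, and the confusion matrix is included as a common companion to accuracy rather than something the abstract reports.

```python
from collections import Counter

# The three account classes named in the abstract.
CLASSES = ("fake", "bot", "real")


def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Count (true, predicted) label pairs; rows are true classes, columns predicted."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in classes] for t in classes]


def accuracy(y_true, y_pred):
    """Fraction of accounts whose predicted class matches the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for ten accounts (illustrative only, not data from the paper).
y_true = ["fake", "fake", "fake", "bot", "bot", "bot", "real", "real", "real", "real"]
y_pred = ["fake", "fake", "bot", "bot", "bot", "bot", "real", "real", "real", "fake"]

matrix = confusion_matrix(y_true, y_pred)
acc = accuracy(y_true, y_pred)

print("rows = true class, columns = predicted class:", CLASSES)
for label, row in zip(CLASSES, matrix):
    print(f"{label:>4}: {row}")
print(f"accuracy = {acc:.1%}")
```

In this toy example one fake account is predicted as a bot and one real account as fake, so eight of ten predictions are correct. A per-class breakdown like this shows which classes are confused with each other, which a single accuracy figure cannot.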


Classification of Fake, Bot, and Real Accounts on Instagram with Machine Learning

Abstract (translated from Turkish)

Instagram is a social media platform that lets users share content such as photos and videos. Fake and bot accounts pose a significant obstacle for social networks. Because fake and bot accounts serve purposes such as increasing follower counts, creating a perception by using false information, and deceiving people, detecting them plays an important role in building a secure social network. Fake account detection is useful for protecting people from misinformation and malicious profiles on Instagram, securing customers' accounts, and preventing fraud. With this motivation, this study aims to classify Instagram user profiles as fake, bot, or real accounts using classification algorithms. In addition, a publicly available dataset is presented for fake, bot, and real account detection on Instagram. In the data collection stage, real accounts were chosen from the authors' circle of friends, fake accounts were identified by manually scanning Instagram posts, and bot accounts were obtained by purchasing them from bot account websites and mobile applications. The features of these accounts were collected through web scraping. Seven classifiers were used to train classification models for fake, bot, and real profile detection. The results show that the Random Forest classifier gives the highest prediction accuracy, 90.2%.


Details

Primary Language English
Subjects Engineering
Journal Section Research Article
Authors

Ümmü Tunç 0000-0002-9022-9698

Esra Atalar 0000-0002-1856-9487

Musa Sezer Gargı 0000-0001-7817-3997

Zeliha Ergül Aydın 0000-0002-7108-8930

Publication Date March 27, 2024
Submission Date June 27, 2022
Published in Issue Year 2024 Volume: 27 Issue: 2

Cite

APA Tunç, Ü., Atalar, E., Gargı, M. S., Ergül Aydın, Z. (2024). Classification of Fake, Bot, and Real Accounts on Instagram Using Machine Learning. Politeknik Dergisi, 27(2), 479-488. https://doi.org/10.2339/politeknik.1136226
AMA Tunç Ü, Atalar E, Gargı MS, Ergül Aydın Z. Classification of Fake, Bot, and Real Accounts on Instagram Using Machine Learning. Politeknik Dergisi. March 2024;27(2):479-488. doi:10.2339/politeknik.1136226
Chicago Tunç, Ümmü, Esra Atalar, Musa Sezer Gargı, and Zeliha Ergül Aydın. “Classification of Fake, Bot, and Real Accounts on Instagram Using Machine Learning”. Politeknik Dergisi 27, no. 2 (March 2024): 479-88. https://doi.org/10.2339/politeknik.1136226.
EndNote Tunç Ü, Atalar E, Gargı MS, Ergül Aydın Z (March 1, 2024) Classification of Fake, Bot, and Real Accounts on Instagram Using Machine Learning. Politeknik Dergisi 27 2 479–488.
IEEE Ü. Tunç, E. Atalar, M. S. Gargı, and Z. Ergül Aydın, “Classification of Fake, Bot, and Real Accounts on Instagram Using Machine Learning”, Politeknik Dergisi, vol. 27, no. 2, pp. 479–488, 2024, doi: 10.2339/politeknik.1136226.
ISNAD Tunç, Ümmü et al. “Classification of Fake, Bot, and Real Accounts on Instagram Using Machine Learning”. Politeknik Dergisi 27/2 (March 2024), 479-488. https://doi.org/10.2339/politeknik.1136226.
JAMA Tunç Ü, Atalar E, Gargı MS, Ergül Aydın Z. Classification of Fake, Bot, and Real Accounts on Instagram Using Machine Learning. Politeknik Dergisi. 2024;27:479–488.
MLA Tunç, Ümmü et al. “Classification of Fake, Bot, and Real Accounts on Instagram Using Machine Learning”. Politeknik Dergisi, vol. 27, no. 2, 2024, pp. 479-88, doi:10.2339/politeknik.1136226.
Vancouver Tunç Ü, Atalar E, Gargı MS, Ergül Aydın Z. Classification of Fake, Bot, and Real Accounts on Instagram Using Machine Learning. Politeknik Dergisi. 2024;27(2):479-88.