Application of Machine Learning Methods in the Task of Identifying User Accounts in Two Social Networks
Abstract
The article describes an approach to matching user profiles across different social networks and identifying those that belong to the same person. The proposed method compares the social environment of the accounts and the values of their profile attributes in two different social networks. The results of applying several machine learning models to this problem are compared. The novelty of the approach lies in a new combination of various methods and its application to new social networks. The practical significance of the study is the automation of determining whether profiles in different social networks belong to one user. The results can be applied to constructing a meta-profile of an information system user, from which a profile of that user's vulnerabilities is subsequently built, as well as in other studies devoted to social networks.
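As a rough illustration of comparing profile attributes and social environment, the sketch below builds a feature vector for one candidate pair of accounts. It is not the authors' implementation: the profile keys (name, city, friends), the use of difflib string similarity, and the Jaccard overlap of friend names are all assumptions chosen for the example. Vectors of this kind could then be passed to a classifier that decides whether the two accounts belong to one person.

```python
from difflib import SequenceMatcher


def string_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two attribute values, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard(a: set, b: set) -> float:
    """Share of common elements in two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(profile_a: dict, profile_b: dict) -> list:
    """Feature vector for one candidate pair of accounts.

    The keys "name", "city" and "friends" are hypothetical; matching
    friends across networks by displayed name is also an assumption.
    """
    return [
        string_similarity(profile_a["name"], profile_b["name"]),
        string_similarity(profile_a["city"], profile_b["city"]),
        jaccard(set(profile_a["friends"]), set(profile_b["friends"])),
    ]


# Two invented profiles from two different networks.
account_a = {
    "name": "Ivan Petrov",
    "city": "Saint Petersburg",
    "friends": ["Anna Ivanova", "Oleg Smirnov", "Petr Sidorov"],
}
account_b = {
    "name": "Ivan Petrov",
    "city": "Saint Petersburg",
    "friends": ["Anna Ivanova", "Oleg Smirnov", "Maria Kuznetsova"],
}

features = pair_features(account_a, account_b)
print(features)
```

Here the attribute similarities describe the profiles themselves, while the friend overlap stands in for the social environment; a real system would need a more robust way to match friends between networks.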
This work is licensed under a Creative Commons Attribution 4.0 International License.