Implementation of machine learning algorithms by means of relational database management systems

  • Natalia Grafeeva ITMO University, 49 Kronverksky, bldg. A, 197101, Saint Petersburg, Russia
  • Artem Nazarov ITMO University, 49 Kronverksky, bldg. A, 197101, Saint Petersburg, Russia
Keywords: relational database management systems, data mining, technical solutions

Abstract

The article discusses the problems of integrating machine learning algorithms into relational database management systems. The authors review and compare the current technical capabilities of Oracle, PostgreSQL, SQL Server, DB2, and MySQL as adapted for data mining. Based on the results, they draw conclusions about how ready modern database management systems are to solve data analysis problems.

Author Biographies

Natalia Grafeeva, ITMO University, 49 Kronverksky, bldg. A, 197101, Saint Petersburg, Russia

Candidate of Physical and Mathematical Sciences, Associate Professor, ITMO University, nggrafeeva@corp.ifmo.ru

Artem Nazarov, ITMO University, 49 Kronverksky, bldg. A, 197101, Saint Petersburg, Russia

Master’s degree student, ITMO University, artem.a.nazarov@yandex.ru

Published
2024-08-30
How to Cite
Grafeeva, N., & Nazarov, A. (2024). Implementation of machine learning algorithms by means of relational database management systems. Computer Tools in Education, (2), 58-71. https://doi.org/10.32603/2071-2340-2024-2-58-71
Section
Artificial intelligence and machine learning