Implementation of Machine Learning Algorithms Using Relational DBMS Tools

  • Natalia Genrikhovna Grafeeva, ITMO National Research University, 49 Kronverksky Pr., bldg. A, 197101, Saint Petersburg, Russia
  • Artem Aleksandrovich Nazarov, ITMO National Research University, 49 Kronverksky Pr., bldg. A, 197101, Saint Petersburg, Russia
Keywords: relational database management tools, data mining, technical solutions

Abstract

This article examines the problem of integrating machine learning algorithms into relational DBMSs. The authors survey and compare the current technical capabilities of the relational DBMSs Oracle, PostgreSQL, SQL Server, DB2, and MySQL that are adapted for data mining. Based on the results, conclusions are drawn about how ready modern DBMSs are for data analysis tasks.

Author Biographies

Natalia Genrikhovna Grafeeva, ITMO National Research University, 49 Kronverksky Pr., bldg. A, 197101, Saint Petersburg, Russia

Cand. Sci. (Phys.-Math.), Associate Professor, ITMO University, nggrafeeva@corp.ifmo.ru

Artem Aleksandrovich Nazarov, ITMO National Research University, 49 Kronverksky Pr., bldg. A, 197101, Saint Petersburg, Russia

Master's student, ITMO University, artem.a.nazarov@yandex.ru

Published
2024-08-30
How to Cite
Grafeeva, N. G., & Nazarov, A. A. (2024). Implementation of Machine Learning Algorithms Using Relational DBMS Tools. Computer Tools in Education, (2), 58-71. https://doi.org/10.32603/2071-2340-2024-2-58-71
Issue
Section
Artificial Intelligence and Machine Learning