A numerical experiment on the computational capabilities of modern chatbots in solving problems in mathematical analysis and computational mathematics
Abstract
The paper describes a numerical experiment in which chatbots (Yandex GPT 2, ChatGPT 3.5, Gemini, Copilot) solved mathematical problems in two areas: mathematical analysis (limits, derivatives, integrals), with 693 problems, and computational mathematics (solving nonlinear equations, solving systems of linear equations, function interpolation, numerical integration), with 45 problems. The main characteristics of these virtual assistants are considered, and a review of research on applying artificial intelligence to mathematical problems across various tests and data sets is presented. The paper examines the shortcomings the chatbots exhibit, analyzes their performance on specific data sets, and compares the number of problems each system solved correctly. The main difficulties encountered when solving computational mathematics problems with each of the chatbots are discussed in detail. The study may be of practical interest to researchers, developers, teachers, and other users who rely on these virtual assistants in their work. The experiment allows a better assessment of how effectively the considered systems can be applied in mathematics.
This work is licensed under a Creative Commons Attribution 4.0 International License.