Scientists Present New Solution to Imbalanced Learning Problem

Specialists at the HSE Faculty of Computer Science and Sber AI Lab have developed a geometric oversampling technique known as Simplicial SMOTE. Tests on various datasets have shown that it significantly improves classification performance. This technique is particularly valuable in scenarios where rare cases are crucial, such as fraud detection or the diagnosis of rare diseases. The study's results are available on arXiv.org, an open-access archive, and will be presented at the International Conference on Knowledge Discovery and Data Mining (KDD) in summer 2025 in Toronto, Canada.
The problem of imbalanced learning is becoming increasingly relevant across various fields, including banking and medicine. Conventional methods, such as random oversampling, often generate low-quality samples or fail to accurately model rare class data.
Simplicial SMOTE, a novel extension of SMOTE (Synthetic Minority Oversampling Technique) proposed by scientists from HSE University and Sber AI Lab, addresses these issues by enabling more accurate modelling of complex topological data structures and improving classifier performance on imbalanced datasets.
It generates new examples of the rare class by leveraging information from several close instances at once (a 'simplex'), rather than from just two close points, as in the original SMOTE and its well-known modifications. This captures the structure of the data more faithfully and improves performance. The technique improves training on imbalanced data, where one class (e.g., normal transactions) has many examples, while another class (e.g., fraud) has few.
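The difference between interpolating between two points and sampling from a simplex can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes that a simplex is formed from a minority point and its k nearest minority neighbours and that the weights are drawn from a uniform Dirichlet distribution, whereas the published method may select simplices differently. All function names here are hypothetical.

```python
import numpy as np


def smote_sample(x, neighbor, rng):
    """Classic SMOTE step: a random point on the segment between two close points."""
    lam = rng.uniform(0.0, 1.0)
    return x + lam * (neighbor - x)


def simplex_sample(vertices, rng):
    """Simplex-style step: a random convex combination of several close points."""
    # Dirichlet weights are non-negative and sum to 1, so the result
    # always lies inside the simplex spanned by the vertices.
    weights = rng.dirichlet(np.ones(len(vertices)))
    return weights @ vertices


def k_nearest(points, index, k):
    """Indices of the k points closest to points[index], excluding itself."""
    dists = np.linalg.norm(points - points[index], axis=1)
    order = np.argsort(dists)
    return order[order != index][:k]


def oversample(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority points (illustrative assumption:
    each simplex is a random minority point plus its k nearest neighbours)."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    new_points = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        neighbors = k_nearest(minority, i, k)
        vertices = np.vstack([minority[i], minority[neighbors]])
        new_points.append(simplex_sample(vertices, rng))
    return np.array(new_points)


if __name__ == "__main__":
    # Toy minority class: five points in the unit square.
    minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
    print(oversample(minority, n_new=3, k=2))
```

With two vertices the simplex reduces to a line segment, so the classic SMOTE step can be seen as the one-dimensional special case of this sketch.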
Researchers have shown experimentally on a large number of test datasets that the proposed approach achieves significantly better performance metrics, such as the F1 score and the Matthews correlation coefficient, than both the basic SMOTE and its modifications. In particular, an improvement was observed with gradient boosting, a classifier commonly used in practice.
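For readers unfamiliar with these metrics, the following Python sketch computes both from a confusion matrix using their standard definitions. The toy labels are invented for illustration and are not taken from the study. They show why plain accuracy is misleading on imbalanced data: a classifier that never predicts the rare class is right 95% of the time here, yet scores zero on both metrics.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for the chosen positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Invented toy data: 95 normal cases (0) and 5 rare cases (1).
    y_true = [0] * 95 + [1] * 5
    y_pred = [0] * 100  # a classifier that always predicts the majority class
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    print(accuracy, f1_score(tp, fp, fn), matthews_corrcoef(tp, fp, tn, fn))
```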
'Our technique is particularly effective for tasks involving imbalanced data, where the rare class holds greater significance. Banks can use Simplicial SMOTE to detect fraud more effectively, and medical centres can apply it to diagnose rare diseases,' says Andrey Savchenko, co-author of the article and Leading Research Fellow at the Laboratories for Theoretical Modelling in AI of the HSE AI and Digital Science Institute.
The new technique can be integrated into existing oversampling algorithms (such as Borderline-SMOTE, Safe-level-SMOTE, and ADASYN), enabling better accuracy without significantly increasing computational complexity. According to the researchers, the developed approach could contribute to the creation of more accurate and reliable machine learning models, thereby improving the quality of analytics.
The study was conducted with support from the HSE Basic Research Programme.


