
Group and Shuffle: Researchers at HSE University and AIRI Accelerate Neural Network Fine-Tuning

Researchers at HSE University and the AIRI Institute have proposed a method for quickly fine-tuning neural networks. Their approach processes parameters in groups and then shuffles these groups so that they interact with one another. The method outperforms alternatives in image generation and analysis, as well as in fine-tuning text models, while requiring less memory and training time. The results were presented at the NeurIPS 2024 Conference.

The larger the neural network, the more challenging it becomes to quickly adapt it to a new task. Retraining a model from scratch is a time-consuming and costly process. Therefore, developers seek cost-effective ways to adapt a model to a specific task while preserving the overall quality of the original.

One such approach is fine-tuning with orthogonal matrices, which, unlike other methods, preserves the essential features of the original model. Popular ways of building such matrices, for example from block-diagonal or butterfly structures, have drawbacks: they are either limited in what they can express or require extensive computations.

Researchers at the HSE Faculty of Computer Science and the AIRI Institute have proposed a new way of constructing matrices, which they call Group-and-Shuffle (GS). Instead of working with all the parameters at once, they divide them into small groups, process each group separately, and then shuffle the groups together. This structure is both flexible and efficient: it enables the model to adapt more precisely to the task while requiring fewer computations and less memory.
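
To make the idea concrete, here is a minimal sketch in plain Python of a group-and-shuffle transform. It is an illustration under assumptions, not the authors' implementation: the helper names (`apply_blocks`, `shuffle`, `gs_apply`), the sizes (9 coordinates in three groups of 3) and the fixed "transpose" shuffle are all chosen for this example. It shows that two cheap grouped steps with a shuffle in between let every input coordinate influence every output coordinate, while storing fewer numbers than a full matrix.

```python
import random

def apply_blocks(blocks, x):
    """Multiply vector x by a block-diagonal matrix stored as a list of square blocks."""
    out, start = [], 0
    for block in blocks:
        size = len(block)
        chunk = x[start:start + size]
        out.extend(sum(w * v for w, v in zip(row, chunk)) for row in block)
        start += size
    return out

def shuffle(x, num_groups):
    """Fixed 'transpose' permutation: entry j of group g moves to position j * num_groups + g."""
    group_size = len(x) // num_groups
    return [x[g * group_size + j] for j in range(group_size) for g in range(num_groups)]

def gs_apply(left_blocks, right_blocks, x):
    """Group (right blocks), shuffle, then group again (left blocks)."""
    y = apply_blocks(right_blocks, x)
    y = shuffle(y, len(right_blocks))
    return apply_blocks(left_blocks, y)

random.seed(0)
n, b = 9, 3  # hypothetical sizes: 9 coordinates, groups of 3

def random_blocks():
    # Strictly positive entries so that no product can cancel to zero.
    return [[[random.uniform(0.5, 1.5) for _ in range(b)] for _ in range(b)]
            for _ in range(n // b)]

left_blocks, right_blocks = random_blocks(), random_blocks()

# Columns of the equivalent full matrix: apply the transform to each basis vector.
columns = [gs_apply(left_blocks, right_blocks,
                    [1.0 if i == k else 0.0 for i in range(n)])
           for k in range(n)]

is_dense = all(v != 0.0 for col in columns for v in col)
gs_params = 2 * (n // b) * b * b  # two block-diagonal factors
dense_params = n * n              # one unstructured matrix
print(is_dense, gs_params, dense_params)
```

Without the shuffle, the two grouped steps would collapse into a single block-diagonal matrix and coordinates in different groups would never mix.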

Building on GS matrices, the researchers developed GSOFT, a new method for orthogonal fine-tuning of neural networks. Unlike previous approaches, GSOFT uses fewer parameters while maintaining training stability and quality, even with limited data. The team also introduced a two-sided version of the method—Double GSOFT—which allows simultaneous adjustment of parameters from both sides, enhancing the model’s flexibility and accuracy.
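
The property that makes orthogonal fine-tuning attractive can be checked on a toy example. The sketch below is an assumption-laden illustration rather than the GSOFT code: it uses 2x2 rotations as the orthogonal blocks, a 4-dimensional toy weight matrix, and an extra inverse shuffle at the end so that zero angles leave the weights unchanged. The names `rotation`, `gs_rotate` and the angle values are hypothetical.

```python
import math

def rotation(theta):
    """2x2 rotation matrix, the simplest orthogonal block."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply_blocks(blocks, x):
    """Multiply vector x by a block-diagonal matrix stored as a list of square blocks."""
    out, start = [], 0
    for block in blocks:
        size = len(block)
        chunk = x[start:start + size]
        out.extend(sum(w * v for w, v in zip(row, chunk)) for row in block)
        start += size
    return out

def shuffle(x, num_groups):
    """Fixed 'transpose' permutation: entry j of group g moves to position j * num_groups + g."""
    group_size = len(x) // num_groups
    return [x[g * group_size + j] for j in range(group_size) for g in range(num_groups)]

def gs_rotate(left_angles, right_angles, x):
    """Rotate within groups, shuffle, rotate again, then undo the shuffle."""
    y = apply_blocks([rotation(t) for t in right_angles], x)
    y = shuffle(y, len(right_angles))
    y = apply_blocks([rotation(t) for t in left_angles], y)
    return shuffle(y, len(y) // len(left_angles))  # inverse of the first shuffle

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Toy "pretrained" weights, stored as two columns of length 4 (made-up numbers).
columns = [[1.0, 2.0, 0.0, -1.0], [0.5, -1.0, 3.0, 2.0]]

# Four trainable angles instead of a full 4x4 update (values chosen arbitrarily).
left_angles, right_angles = [0.3, -0.7], [1.1, 0.2]
tuned = [gs_rotate(left_angles, right_angles, col) for col in columns]

# Orthogonal updates keep lengths and angles between weight vectors intact.
print(dot(columns[0], columns[1]), dot(tuned[0], tuned[1]))
print(dot(columns[0], columns[0]), dot(tuned[0], tuned[0]))

# With all angles at zero the transform is the identity, so training can
# start exactly from the original model.
untouched = [gs_rotate([0.0, 0.0], [0.0, 0.0], col) for col in columns]
```

A two-sided variant in the spirit of Double GSOFT would, under the same assumptions, apply a second transform of this kind along the other dimension of the weight matrix.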

'We discovered how to construct orthogonal matrices using only two special types of matrices, instead of five or six as required by previous methods. This saves computational resources and training time,' explains Nikolay Yudin, Research Assistant at the HSE Laboratory for Matrix and Tensor Methods in Machine Learning.

The researchers tested the approach on three types of tasks. When fine-tuning the RoBERTa language model, the method outperformed others while using a comparable number of parameters. In image generation, where the model needed to preserve the original features while adapting to the user’s request, GSOFT and Double GSOFT outperformed popular methods like LoRA and BOFT, all while using less memory and training time.

Subject-driven generation visual results on 3,000 training iterations
© Gorbunov, M., Yudin, N., Soboleva, V., Alanov, A., Naumov, A., Rakhuba, M. (2024). Group and shuffle: Efficient structured orthogonal parametrization. arXiv preprint arXiv:2406.10019.

The authors also tested their approach on convolutional neural networks, which are commonly used for image and video analysis, such as in face recognition. The team adapted the GS matrices even for cases where the model required strong resistance to interference and distortion.

'We tested the method across various scenarios—from language and generative models to robust convolutional networks. In every case, it performed reliably while using fewer resources. This confirms that the method can be applied effectively to a variety of purposes,' comments Aibek Alanov, Senior Research Fellow at the Centre of Deep Learning and Bayesian Methods, AI and Digital Science Institute, HSE FCS, and leader of the Controllable Generative AI team at FusionBrain, AIRI.
