Figure 1. Learn one for all, and learn all for one. Image credits to Marvin Meyer.
In many technological fields, from engineering to computer science, there is an effort to move from centralized solutions, where a single system owns the data, reasons, and acts to solve a problem, towards decentralized solutions, where the task is split across multiple systems. Machine learning is no exception. The term Distributed Machine Learning refers to the idea of distributing the process of learning from data across multiple nodes (physical or virtual). There are many nuances to this idea: the nature of the nodes, their location, whether they are heterogeneous, whether they serve the same function, where the data resides and whether it can be shared, the goal of the learning procedure (a single shared model or personalized models), whether the system aims to speed up training, and so on.
In Vandal there are a few researchers working on this topic, and I have been collaborating with them on a few projects.
Federated learning
Federated learning (also known as collaborative learning) is a sub-field of the wider field of distributed machine learning, focusing on settings in which multiple entities (often referred to as clients) collaboratively train a model while ensuring that their data remains decentralized (for example, due to privacy constraints). The clients may also have limited computational power and limited availability to participate in the training. One of the primary defining characteristics of federated learning is data heterogeneity: due to the decentralized nature of the clients' data, there is no guarantee that the data samples held by each client are independently and identically distributed.
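To make the setting concrete, below is a minimal sketch of one round of FedAvg, the canonical federated learning algorithm: each client trains locally on data that never leaves the device, and the server only averages the resulting weights. The helper names are illustrative, and the plain unweighted mean is a simplification (FedAvg weights clients by their dataset size).

```python
# Minimal sketch of one FedAvg round; illustrative simplification.
import copy
import torch

def local_update(global_model, loader, epochs=1, lr=0.01):
    """Client-side training on private data (the data never leaves the client)."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict()  # only the weights are sent to the server

def fedavg_round(global_model, client_loaders):
    """Server side: collect client models and average their parameters."""
    states = [local_update(global_model, dl) for dl in client_loaders]
    avg = {k: torch.stack([s[k].float() for s in states]).mean(0)
           for k in states[0]}
    global_model.load_state_dict(avg)
    return global_model
```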
We are working on solutions that address the problem of data heterogeneity, improving model convergence and final model quality while remaining efficient in terms of communication with the aggregation server.
Figure 2. Results obtained with our FedHBM and GHBM methods.
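For intuition on where momentum enters the picture: a common baseline treats the averaged client update as a pseudo-gradient and applies classical heavy-ball momentum on the server. The sketch below shows that baseline only; it is not the GHBM/FedHBM formulation from our work (which generalizes the momentum term, see the 2023 preprint below), and beta and server_lr are hypothetical hyperparameters.

```python
# Sketch of server-side heavy-ball momentum over federated rounds.
# Classical baseline for illustration; GHBM/FedHBM generalize this idea.
import torch

def server_step(global_state, client_states, momentum, beta=0.9, server_lr=1.0):
    """One aggregation step: momentum applied to the averaged pseudo-gradient."""
    new_state, new_momentum = {}, {}
    for k in global_state:
        avg = torch.stack([s[k].float() for s in client_states]).mean(0)
        delta = avg - global_state[k].float()  # pseudo-gradient of this round
        new_momentum[k] = beta * momentum.get(k, torch.zeros_like(delta)) + delta
        new_state[k] = global_state[k].float() + server_lr * new_momentum[k]
    return new_state, new_momentum
```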
We have also worked on new applications for federated learning. Many solutions developed for federated learning are tested and validated only on image classification tasks, with small datasets and models. We have proposed the first (to the best of our knowledge) application of federated learning to image retrieval, and more specifically to a visual geo-localization problem (where there could be multiple robots/vehicles around the world collecting navigation data).
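At inference time, visual geo-localization is an image retrieval problem: the query's position is estimated from the geo-tags of its nearest neighbors in descriptor space. The sketch below shows that retrieval step with faiss; the descriptor-extraction network is omitted, and db_coords is a hypothetical list of geo-tags aligned with the database descriptors.

```python
# Sketch of the kNN retrieval step in visual geo-localization.
# Descriptors come from any trained embedding network (details omitted).
import numpy as np
import faiss

def build_index(db_descriptors: np.ndarray) -> faiss.IndexFlatL2:
    """Exact L2 index over the database of geo-tagged image descriptors."""
    index = faiss.IndexFlatL2(db_descriptors.shape[1])
    index.add(db_descriptors.astype(np.float32))
    return index

def localize(query_descriptor: np.ndarray, index, db_coords, k=5):
    """Estimate the query location from the geo-tags of its k nearest neighbors."""
    _, idx = index.search(query_descriptor[None].astype(np.float32), k)
    return [db_coords[i] for i in idx[0]]
```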
Recent breakthroughs enabled by deep learning rely on increasingly complex architectures: the advent of Large Language Models (LLMs) has marked a new chapter for conversational agents like ChatGPT, but it has also introduced real challenges in training and serving such solutions for real-world applications. Scaling data and computing power is crucial for successful training, but scaling alone will not cope forever with the limits of current algorithms and architectures.
We have been working on projects related to the parallelization and distribution of the training of these algorithms, in order to allow better scalability. One aspect we have worked on, for the specific application of visual place recognition, is how to leverage a smarter data partitioning to improve and speed up training (see Figure 4 and the sketch below).
Figure 4. Sketch of the Distributed CosPlace algorithm for visual place recognition.
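As a rough illustration of the data-partitioning idea (not the exact Distributed CosPlace implementation), the sketch below greedily assigns CosPlace-style class groups to workers so that each process trains on a disjoint, load-balanced shard; the grouping itself (e.g., by geographic cell and orientation) is assumed to be given.

```python
# Illustrative greedy partition of CosPlace-style class groups across workers,
# so each worker trains on a disjoint shard in parallel.
def partition_groups(groups, num_workers):
    """groups: list of class groups, each a list of training-sample indices."""
    shards = [[] for _ in range(num_workers)]
    # Assign the largest groups first, always to the currently lightest shard.
    for g in sorted(groups, key=len, reverse=True):
        min(shards, key=lambda s: sum(len(x) for x in s)).append(g)
    return shards

# Worker `rank` then trains only on shards[rank], while model weights are
# kept in sync across workers (e.g., with torch DistributedDataParallel).
```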
Related Publications
2024
Journal
Distributed training of CosPlace for large-scale visual place recognition
Riccardo Zaccone, Gabriele Berton, and Carlo Masone
Visual place recognition (VPR) is a popular computer vision task aimed at recognizing the geographic location of a visual query, usually within a tolerance of a few meters. Modern approaches address VPR from an image retrieval standpoint using a kNN on top of embeddings extracted by a deep neural network from both the query and images in a database. Although most of these approaches rely on contrastive learning, which limits their ability to be trained on large-scale datasets (due to mining), the recently reported CosPlace proposes an alternative training paradigm using a classification task as the proxy. This has been shown to be effective in expanding the potential of VPR models to learn from large-scale and fine-grained datasets. In this work, we experimentally analyze CosPlace from a continual learning perspective and show that its sequential training procedure leads to suboptimal results. As a solution, we propose a different formulation that not only solves the pitfalls of the original training strategy effectively but also enables faster and more efficient distributed training. Finally, we discuss the open challenges in further speeding up large-scale image retrieval for VPR.
Workshop
Collaborative Visual Place Recognition through Federated Learning
Mattia Dutto, Gabriele Berton, Debora Caldarola, Eros Fanì, Gabriele Trivigno, and Carlo Masone
In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Jun 2024
Visual Place Recognition (VPR) aims to estimate the location of an image by treating it as a retrieval problem. VPR uses a database of geo-tagged images and leverages deep neural networks to extract a global representation, called descriptor, from each image. While the training data for VPR models often originates from diverse, geographically scattered sources (geo-tagged images), the training process itself is typically assumed to be centralized. This research revisits the task of VPR through the lens of Federated Learning (FL), addressing several key challenges associated with this adaptation. VPR data inherently lacks well-defined classes, and models are typically trained using contrastive learning, which necessitates a data mining step on a centralized database. Additionally, client devices in federated systems can be highly heterogeneous in terms of their processing capabilities. The proposed FedVPR framework not only presents a novel approach for VPR but also introduces a new, challenging, and realistic task for FL research. This has the potential to spur the application of FL to other image retrieval tasks.
2023
Preprint
Communication-Efficient Heterogeneous Federated Learning with Generalized Heavy-Ball Momentum
Federated Learning (FL) has emerged as the state-of-the-art approach for learning from decentralized data in privacy-constrained scenarios. However, system and statistical challenges hinder real-world applications, which demand efficient learning from edge devices and robustness to heterogeneity. Despite significant research efforts, existing approaches (i) are not sufficiently robust, (ii) do not perform well in large-scale scenarios, and (iii) are not communication efficient. In this work, we propose a novel Generalized Heavy-Ball Momentum (GHBM), motivating its principled application to counteract the effects of statistical heterogeneity in FL. Then, we present FedHBM as an adaptive, communication-efficient by-design instance of GHBM. Extensive experimentation on vision and language tasks, in both controlled and realistic large-scale scenarios, provides compelling evidence of substantial and consistent performance gains over the state of the art.