Distributed learning

(Learn) One for all, and all for one.

Figure 1. Learn one for all, and learn all for one. Image credits to Marvin Meyer.

In many technological fields, from engineering to computer science, there is an effort to move away from centralized solutions, where a single system owns the data, reasons, and acts to solve a problem, towards decentralized solutions, where the task is split across multiple systems. Machine learning is no exception. The term Distributed Machine Learning refers to the idea of distributing the process of learning from data across multiple nodes (physical or virtual). There are many nuances to this idea: the nature of the nodes, their location, whether they are heterogeneous, whether they all serve the same function, where the data resides and whether it can be shared, the goal of the learning procedure (a single global model or personalized models), whether the system aims to speed up training, and so on.

Several researchers at Vandal work on this topic, and I have been collaborating with them on a few projects.

Federated learning

Federated learning (also known as collaborative learning) is a sub-field of the wider field of distributed machine learning that focuses on settings in which multiple entities (often referred to as clients) collaboratively train a model while ensuring that their data remains decentralized (for example, due to privacy constraints). The clients may also have limited computational power and limited availability to participate in the training. One of the primary defining characteristics of federated learning is data heterogeneity: because the clients' data is decentralized, there is no guarantee that the data samples held by each client are independently and identically distributed.
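To make the setting concrete, the canonical algorithm in this space is FedAvg (McMahan et al., 2017): in each round, the server sends the current model to the clients, each client trains locally on its own data, and the server averages the returned models weighted by dataset size. Below is a minimal NumPy sketch of one such round; the least-squares local objective, the function names, and the hyperparameters are illustrative stand-ins, not part of any specific system.

    import numpy as np

    def local_update(w_global, data, lr=0.1, epochs=5):
        # One client's local training: plain gradient descent on a
        # least-squares objective (a stand-in for any local model/optimizer).
        X, y = data
        w = w_global.copy()
        for _ in range(epochs):
            grad = X.T @ (X @ w - y) / len(y)
            w -= lr * grad
        return w

    def fedavg_round(w_global, client_datasets):
        # One communication round of FedAvg: every client trains locally,
        # then the server averages the models weighted by dataset size.
        # Raw data never leaves the clients; only models are exchanged.
        local_models, sizes = [], []
        for data in client_datasets:
            local_models.append(local_update(w_global, data))
            sizes.append(len(data[1]))
        weights = np.array(sizes, dtype=float)
        return np.average(local_models, axis=0, weights=weights / weights.sum())

Under non-IID data the locally trained models can pull in conflicting directions, which is exactly why plain averaging struggles and motivates the heterogeneity-aware methods discussed next.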

We are working on solutions that address the problem of data heterogeneity, improving model convergence and final model quality while remaining efficient in terms of communication with the aggregation server.

Figure 2. Results obtained with our FedHBM and GHBM methods.
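To give a flavor of how momentum can counteract the client drift caused by heterogeneous data, the sketch below applies heavy-ball momentum on the server side, treating the averaged client update as a pseudo-gradient. This is a generic illustration of the family of techniques FedHBM and GHBM belong to, not the exact update rules from our papers; the function name and the beta and lr parameters are assumptions made for the example.

    import numpy as np

    def server_round_with_momentum(w_global, momentum, client_updates,
                                   beta=0.9, lr=1.0):
        # Average the per-client model deltas (local model minus global
        # model) and treat the result as a pseudo-gradient for the server.
        avg_update = np.mean(client_updates, axis=0)
        # Heavy-ball accumulation: past update directions are retained,
        # damping the oscillations produced by conflicting non-IID updates.
        momentum = beta * momentum + avg_update
        w_global = w_global + lr * momentum
        return w_global, momentum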

We have also worked on new applications of federated learning. Many solutions developed for federated learning are tested and validated only on image classification tasks, with small datasets and models. We proposed the first (to the best of our knowledge) application of federated learning to image retrieval, and more specifically to a visual geo-localization problem, where multiple robots/vehicles around the world could be collecting navigation data.

Figure 3. Sketch of the training procedure for Federated Visual Place Recognition.

Training parallelization

Recent breakthroughs enabled by deep learning rely on increasingly complex architectures: the advent of Large Language Models (LLMs) has marked a new chapter for conversational agents like ChatGPT, but it has also introduced real challenges in training and serving such solutions for real-world applications. Scaling data and computing power is crucial for successful training, but speed-up scaling laws will not cope forever with the limits of current algorithms and architectures.

We have been working on projects related to the parallelization and distribution of the training of these algorithms, in order to achieve better scalability. One aspect we have worked on, for the specific application of visual place recognition, is how to leverage a smarter data partitioning to improve and speed up training (see Figure 4, and the sketch after it).

Figure 4. Sketch of the Distributed CosPlace algorithm for visual place recognition.
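To illustrate what a spatially-aware partitioning might look like, the snippet below buckets geo-tagged images into square cells by their UTM coordinates and deals the cells out to workers, so that each worker trains on a disjoint set of places. It is a simplified sketch of the general idea, not the exact partitioning used in Distributed CosPlace; the function name, the cell_size parameter, and the round-robin assignment are illustrative assumptions.

    import numpy as np

    def partition_by_location(utm_coords, n_workers, cell_size=50.0):
        # Discretize (east, north) UTM coordinates into square cell indices.
        cells = np.floor(np.asarray(utm_coords) / cell_size).astype(int)
        # Group image indices by the cell they fall into.
        unique_cells, inverse = np.unique(cells, axis=0, return_inverse=True)
        assignments = [[] for _ in range(n_workers)]
        for cell_id in range(len(unique_cells)):
            # Round-robin over cells: each worker gets whole, disjoint places.
            worker = cell_id % n_workers
            assignments[worker].extend(np.where(inverse == cell_id)[0].tolist())
        return assignments  # assignments[w] = dataset indices for worker w

Each worker can then wrap its index list in a standard dataset and train in parallel, synchronizing model weights periodically.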

Related Publications

2024

  1. Journal
    Distributed Training of CosPlace for Large-Scale Visual Place Recognition
    Riccardo Zaccone, Gabriele Berton, and Carlo Masone
    Frontiers in Robotics and AI, 2024
  2. Workshop
    Collaborative Visual Place Recognition through Federated Learning
    Mattia Dutto, Gabriele Berton, Debora Caldarola, Eros Fanì, Gabriele Trivigno, and Carlo Masone
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Jun 2024

2023

  1. Preprint
    Communication-Efficient Heterogeneous Federated Learning with Generalized Heavy-Ball Momentum
    Riccardo Zaccone, Carlo Masone, and Marco Ciccone
    Jun 2023