In December 2022, Hivenet and Inria (the French Institute for Research in Computer Science and Automation) announced a four-year partnership to create a peer-to-peer cloud that is sustainable and accessible to all. This represented a significant investment from Hivenet, covering the recruitment of 8 PhD students, post-doctoral fellows, and engineers across 4 Inria project teams. At the halfway point, join us as we recap the work done so far and give you a preview of what’s to come.
It’s important not to overthink things when choosing a name, and with that in mind, the “Challenge” motivating our partnership with Inria was named Alvearium, Latin for beehive. When announcing the Challenge in early 2023, Inria defined it in terms of four problem spaces (see Table 1 below) and set the overall objective as “to offer … a sovereign, high-performance Cloud capable of meeting users' storage needs as effectively as existing providers, while respecting the confidentiality and security of their data”. If you’ve been following Hivenet and what we do, you’ll see this aligns with our company mission.
Table 1: the four problem spaces underpinning Inria and Hivenet’s Challenge
Much has happened in the two years since the themes in Table 1 were defined. In May 2024, the team attended Per3S in Paris to present an update on work quantifying the performance of Erasure Codes in peer-to-peer storage systems. More recently, in December 2024, Hivenet’s R&D team presented at the PEPR Cloud Forum in Grenoble, showcasing the amazing work done to realize our distributed storage and compute services. Later in December, the team shared findings on a new secure messaging protocol for distributed environments: “DiSCreet: Distributed Delivery Service with Context-Aware Cooperation.” The paper: 1) demonstrated a new, efficient approach to encrypted communications in a highly scalable, dynamic environment like Hivenet’s; 2) summarized follow-up research to evaluate the performance and security of the proposed mechanism in depth and to formalize the proposals.
On Erasure Codes: we won’t go into detail in this article, but just know that Erasure Codes are a data protection technique that breaks data into pieces and encodes them with added redundancy, so that the original can be reassembled even when some of the pieces are unavailable. Specifically, Hivenet uses Reed-Solomon Encoding.
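For the curious, here is a deliberately simplified sketch (in Python, purely for illustration) of the idea: three data pieces plus a single XOR parity piece, which lets any one lost piece be rebuilt from the others. Reed-Solomon, the scheme Hivenet actually uses, generalizes this to arbitrary numbers of data and parity pieces.

```python
# Toy erasure code: 3 data pieces + 1 XOR parity piece.
# Any single missing piece can be rebuilt from the remaining three.
# Reed-Solomon generalizes this idea to k data + m parity pieces,
# tolerating up to m losses; this sketch is illustrative only.

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes, k: int = 3) -> list:
    """Split data into k equal pieces and append one parity piece."""
    data = data.ljust(-(-len(data) // k) * k, b"\0")  # pad to a multiple of k
    size = len(data) // k
    pieces = [data[i * size:(i + 1) * size] for i in range(k)]
    parity = pieces[0]
    for piece in pieces[1:]:
        parity = xor_bytes(parity, piece)
    return pieces + [parity]

def reconstruct(pieces: list) -> list:
    """Rebuild at most one missing piece (marked as None) by XOR-ing the rest."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    assert len(missing) <= 1, "this toy code tolerates only one loss"
    if missing:
        present = [p for p in pieces if p is not None]
        rebuilt = present[0]
        for p in present[1:]:
            rebuilt = xor_bytes(rebuilt, p)
        pieces[missing[0]] = rebuilt
    return pieces

pieces = encode(b"hello, distributed storage!")
pieces[1] = None                                   # simulate losing one piece
recovered = reconstruct(pieces)
print(b"".join(recovered[:3]).rstrip(b"\0"))       # original data is back
```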
In 2025, the hard work on Alvearium continues, and on February 14 at the World Artificial Intelligence Cannes Festival (WAICF), Hivenet and Inria formalized a new Challenge, Cupseli. The name Cupseli derives from Κυψέλη (Ancient Greek for beehive) and is also an acronym for Collaborative Unified Platform for a Scalable and Efficient Learning Infrastructure. More details to come, but for now we can share that the focus will be on distributed computing, AI inference, and encryption. Aside from these cutting-edge research partnerships, Hivenet continues to invest time and energy in its own innovation practices, and below we share one of our favorites, which you can expect to hear a lot more about very soon.
Figure 1: front row seats at WAICF for the Cupseli signing ceremony [source: Inria]
You’re probably already well aware of the impact ChatGPT and similar tools have had in shaping customer service, support, marketing, and other key business interactions. Well, Hivenet being Hivenet, although we recognize AI’s vast potential, we maintain a healthy skepticism towards such technology being centralized under the control of large monopolies, which are ever more aligned with state-level objectives. Unsurprisingly, we took a different approach: we built an in-house AI chat, a Retrieval-Augmented Generation (RAG) system, to power Hivenet’s internal knowledge management. RAG in Generative AI is a way to ensure that responses from LLMs are contextually grounded in up-to-date, relevant information, reducing inappropriate or hallucinated results. We ingest our documents, knowledge, chats, and so on into the system and combine these with the output of LLMs (e.g., Llama 3.3-70B, Qwen2-VL-7B, and Mistral) running on Hivenet’s distributed compute service. Please refer to Figure 2 below for a high-level overview of Chat with Hivenet’s architecture.
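To make the retrieval side of RAG concrete, here is a minimal, self-contained sketch. It uses a TF-IDF index and a handful of invented snippets purely for illustration; a production pipeline like ours relies on learned embeddings and a proper document store rather than this toy setup.

```python
# Minimal sketch of RAG retrieval: index snippets, find the ones most relevant
# to a query, and prepend them to the prompt sent to the LLM.
# TF-IDF keeps the example self-contained; real systems use learned embeddings.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [  # invented stand-ins for ingested knowledge-base snippets
    "Hivenet's storage service protects data with Reed-Solomon erasure codes.",
    "Chat with Hivenet runs inference on a distributed compute platform.",
    "The Alvearium Challenge is a research partnership between Hivenet and Inria.",
]

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(documents)

def build_prompt(query: str, top_k: int = 2) -> str:
    """Retrieve the top_k most similar snippets and ground the prompt in them."""
    scores = cosine_similarity(vectorizer.transform([query]), doc_matrix)[0]
    best = scores.argsort()[::-1][:top_k]
    context = "\n".join(documents[i] for i in best)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How does Hivenet protect stored data?"))
```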
On model naming conventions: when you see (e.g.) “70B” or “7B” in a model name, this describes the number of parameters (or weights) that the model possesses, with “B” representing “billion.”
In simple terms, parameters are the numerical values a model learns during training and then uses to produce its outputs. For example, the Llama 3.3-70B model contains 70 billion parameters. It’s important to note that more parameters don’t always mean “better.” Instead, the number of parameters can be read as a measure of model “complexity,” which must be balanced against the computational cost of training and serving it. A smaller model trained on high-quality data may perform more effectively than a larger one.
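To see why parameter counts matter in practice, here is some back-of-the-envelope arithmetic for holding a model’s weights in memory, assuming 16-bit weights (2 bytes per parameter); quantized deployments need less.

```python
# Rough memory needed to hold model weights alone, assuming 2 bytes per
# parameter (16-bit precision). Quantization lowers this; activations,
# KV caches, and batching all add more on top.
def weight_memory_gb(params_billions: float, bytes_per_param: float = 2) -> float:
    """Billions of parameters x bytes per parameter = gigabytes of weights."""
    return params_billions * bytes_per_param

for name, size in [("Qwen2-VL-7B", 7), ("Llama 3.3-70B", 70), ("Llama-3.1-405B", 405)]:
    print(f"{name}: ~{weight_memory_gb(size):.0f} GB of weights")
# ~14 GB, ~140 GB, and ~810 GB respectively - the larger models cannot fit on
# a single consumer GPU, which is exactly why distributed inference matters.
```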
Figure 2: high-level architecture for Chat with Hivenet
Now, running AI inference in a distributed environment is not new. For example, frameworks such as Petals and vLLM exist to achieve this, and Chat with Hivenet leverages vLLM - a choice made because of its high throughput and its OpenAI-compatible API. However, we want to be clear that it was not built “off the shelf,” and our Research team is addressing some tough technical challenges. Firstly, Hivenet is a distributed platform where resources come in various shapes and sizes; for example, GPUs come with different amounts of RAM. vLLM, meanwhile, assumes the same amount of memory on every card, which our solution must work around. Secondly, not all distributed systems are equally distributed: unlike a cluster confined to a single datacenter, Hivenet’s resources can sit in geographical locations spanning multiple continents. In practice, this challenges network performance and latency, which are critical for many inference use cases.
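One nice consequence of the vLLM choice is that talking to a model served this way looks like any other OpenAI-style API call. The sketch below illustrates this; the endpoint URL, API key, and model name are placeholders, not Hivenet’s actual deployment details.

```python
# Querying a vLLM server through its OpenAI-compatible API.
# base_url, api_key, and the model name below are illustrative placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",   # wherever the vLLM server listens
    api_key="not-needed-for-a-local-vllm-server",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": "How does Hivenet protect stored data?"},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```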
While the current generation offers acceptable performance for Hivenet’s needs, we are clear on the work to be done - the upcoming Cupseli Challenge with Inria will include a focus on optimizing network distribution and performance through a range of approaches, including compression and bandwidth optimization. In addition, while running Llama 3.3-70B on Hivenet’s distributed compute platform is an amazing achievement, we will go further - our next challenge is to deploy a 405-billion-parameter model (e.g., Llama-3.1-405B) to test the limits of our solution. It’s unproven whether a model of this size can perform in a distributed environment without undue impact on speed and throughput … but we’ll find out. Finally, while the original intent was to implement a RAG-based system, we want the ability to fall back to the underlying model for “generic” queries. We expect significant challenges in handling the injection of external context and in understanding when it is appropriate to leverage the underlying model for a given query.
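One naive way to frame that fallback decision - purely illustrative, and not how we have settled the question - is to compare the best retrieval score against a threshold: if nothing in the knowledge base looks relevant, skip the injected context and let the underlying model answer directly.

```python
# Illustrative fallback heuristic (not Hivenet's design): inject retrieved
# context only when the best similarity score clears a threshold; otherwise
# treat the query as "generic" and send it to the base model unchanged.
def choose_prompt(query: str, best_snippet: str, best_score: float,
                  threshold: float = 0.25) -> str:
    if best_score >= threshold:
        return f"Answer using only this context:\n{best_snippet}\n\nQuestion: {query}"
    return query  # fall back to the underlying model, no injected context

print(choose_prompt("Tell me a joke", "Hivenet uses Reed-Solomon codes.", 0.03))
```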
Our Chat remains something we’re incubating at Hivenet, but … well, it’s already running on our distributed computing platform, and we already have a secure, distributed storage service where inputs could be sourced for a RAG-based system. On top of that, Hivenet's solution can run everywhere (including a desktop machine) and interact with models deployed anywhere via the model router, without additional network or firewall configuration. We won’t say more yet, but you can glimpse where this may lead.
We hope you enjoyed this overview of the cutting-edge work underway at Hivenet in partnership with Inria. As ever, we will continue to share more as the story unfolds - in the meantime, you can always get in touch with any questions.