Hugging Face, an AI and machine learning platform, has acquired XetHub, a Seattle-based company that specializes in scaling Git for large datasets and AI models.
The acquisition aims to improve how Hugging Face manages and versions large datasets and models.
This move addresses the increasing demand within the AI community for better management of larger datasets and models.
“This is the real Strawberry—welcome to @xetdata. We’re just getting started!” posted Hugging Face Chief Clement Delangue on X, referring to OpenAI’s project Strawberry.
“Big models are here to stay,” said Delangue. “What we want is to make the development of AI closer to what software engineering is — make it drastically faster,” he added.
XetHub, founded in 2021 by Yucheng Low, Ajit Banerjee, and Rajat Arya, created technology for Git to manage very large repositories, helping teams work better with changing datasets and models.
Hugging Face acquired XetHub to improve storage and versioning for AI development and to overcome the limitations of Git LFS in handling very large AI files.
According to Hugging Face CTO Julien Chaumond, the XetHub team will help support the platform's growth over the next five years by upgrading its storage system to a better version of LFS.
This upgrade will improve how datasets and models are managed, making it easier for users to upload smaller changes instead of entire files.
The change is significant because AI models keep getting larger, with trillion-parameter models such as BigLlama-3.1-1T becoming possible soon.
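The idea of uploading only the changed parts of a file, rather than the whole file, can be illustrated with a minimal Python sketch. It assumes fixed-size chunks identified by SHA-256 hashes; the function names and the tiny chunk size are hypothetical and chosen for illustration, and this is not a description of how XetHub or Hugging Face actually implement their storage.

```python
import hashlib

# Illustrative assumption: a tiny fixed chunk size so the demo is easy to follow.
CHUNK_SIZE = 4


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split data into fixed-size chunks and return the SHA-256 hash of each."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def chunks_to_upload(old: bytes, new: bytes, chunk_size: int = CHUNK_SIZE) -> list[int]:
    """Return indices of chunks in `new` whose content is not already stored for `old`."""
    known = set(chunk_hashes(old, chunk_size))
    return [
        index
        for index, digest in enumerate(chunk_hashes(new, chunk_size))
        if digest not in known
    ]


old_version = b"aaaabbbbccccdddd"
new_version = b"aaaabbbbXXXXdddd"  # only the third chunk differs
changed = chunks_to_upload(old_version, new_version)
print(changed)
```

In this sketch only one of the four chunks would need to be sent. Fixed-size chunking is the simplest variant; it handles in-place edits and appends well, but an insertion in the middle of a file shifts every later chunk boundary, which is why production systems often choose chunk boundaries based on content instead.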
Backed by Madrona and angel investors, XetHub was built by a team experienced in scaling AI systems, including work on Apple’s machine learning infrastructure.
The team plans to merge XetHub’s technology into Hugging Face, a platform used widely for AI development, to simplify collaboration among its large user community.
Yucheng Low, co-founder of XetHub, emphasized the role of data in AI advancement and expressed enthusiasm about joining Hugging Face to further their goal of enhancing large-scale AI collaboration.
Hugging Face manages a large amount of data: 1.3 million model repositories, 450,000 datasets, and 680,000 spaces, totaling 12 petabytes stored in LFS.
Hugging Face acquired XetHub to handle these growing data needs more efficiently. The company is also expanding its infrastructure team and actively hiring to support platform development.
Earlier, Hugging Face acquired Argilla, a Spanish startup, for $10 million. Argilla specializes in collaborative software for data annotation and in improving NLP through human-machine collaboration.
The Argilla acquisition strengthens Hugging Face’s data annotation capabilities and helps incorporate human feedback into AI model training.