
DeepSeek, the Chinese artificial intelligence company, has released a new tool for processing large volumes of data: Fire-Flyer File System (3FS), a parallel file system designed to improve efficiency in AI model training and inference tasks.
Data storage and access are crucial in artificial intelligence environments, especially when workloads handle large data sets and need fast data transfer. Traditional systems do not always meet these demands, and DeepSeek has developed 3FS as a scalable, high-performance alternative.
Key Features of Fire-Flyer File System
3FS is a Linux-based distributed file system optimized for use in high-performance computing (HPC) and artificial intelligence environments. Its design enables efficient storage management, minimizing latency and improving data access.
- Optimization for modern hardware: 3FS takes full advantage of the performance of SSDs and RDMA networks, enabling read speeds of up to 6.6 TiB/s in 180-node cluster configurations.
- Parallel architecture: Its distributed design facilitates system expansion without compromising stability or access speed.
- Based on FUSE: This allows the system to run in user space without having to modify the Linux kernel, facilitating its implementation and compatibility with various distributions.
- Focus on read speed: 3FS prioritizes random read performance over caching, which is crucial for AI models that require immediate access to large volumes of data.
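Because a FUSE-mounted file system appears to applications as ordinary files, the random-read access pattern described above can be illustrated with standard POSIX calls. The following is a minimal sketch, not 3FS-specific code: the function name `random_reads`, the block size, and the temporary file standing in for a file on a mounted file system are all assumptions made for illustration.

```python
import os
import random
import tempfile


def random_reads(path, block_size=4096, count=8, seed=0):
    """Read `count` blocks of `block_size` bytes at random aligned offsets.

    Hypothetical helper for illustration; it uses only standard POSIX file
    calls and works on any mounted file system.
    """
    rng = random.Random(seed)
    size = os.path.getsize(path)
    n_blocks = max(size // block_size, 1)
    fd = os.open(path, os.O_RDONLY)
    try:
        blocks = []
        for _ in range(count):
            offset = rng.randrange(n_blocks) * block_size
            # os.pread reads at an explicit offset without moving the file position
            blocks.append((offset, os.pread(fd, block_size, offset)))
        return blocks
    finally:
        os.close(fd)


if __name__ == "__main__":
    # A temporary file stands in for a data file on a mounted file system.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(64 * 4096))
    try:
        for offset, data in random_reads(tmp.name):
            print(offset, len(data))
    finally:
        os.remove(tmp.name)
```

Training data loaders typically issue many such reads in parallel, each touching a different sample, which is why a read cache tends to help less than raw random-read throughput.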
A system tested in real environments
DeepSeek has been using 3FS on its own servers since 2019, which has allowed it to refine the system's performance under real-world conditions. In recent tests, the system reached 3.66 TiB/min in data sorting benchmarks and over 40 GiB/s per node for KVCache lookup tasks.
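The figures quoted in this article use different units (TiB/s aggregate, TiB/min, GiB/s per node), which makes them hard to compare at a glance. The short sketch below converts them to GiB/s. It assumes 1 TiB = 1024 GiB and, for the per-node figure, that the 6.6 TiB/s aggregate is spread evenly across the 180 nodes, a simplification the source does not state. The function names are illustrative.

```python
GIB_PER_TIB = 1024  # binary units: 1 TiB = 1024 GiB


def tib_per_min_to_gib_per_s(tib_per_min):
    """Convert a throughput in TiB/min to GiB/s."""
    return tib_per_min * GIB_PER_TIB / 60


def per_node_gib_per_s(aggregate_tib_per_s, nodes):
    """Average per-node throughput in GiB/s, assuming an even spread."""
    return aggregate_tib_per_s * GIB_PER_TIB / nodes


# Sorting benchmark: 3.66 TiB/min
print(round(tib_per_min_to_gib_per_s(3.66), 2))  # 62.46

# Aggregate read throughput: 6.6 TiB/s over 180 nodes
print(round(per_node_gib_per_s(6.6, 180), 2))  # 37.55
```

Under these assumptions, the aggregate read figure works out to roughly 37.5 GiB/s per node, the same order of magnitude as the 40 GiB/s per node reported for KVCache lookups.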
In addition, the system has been used in the company's Fire-Flyer 2 cluster, where it has helped deliver performance close to that of high-end servers such as the NVIDIA DGX-A100 at a significantly lower cost. According to data presented by the company, the cluster achieved 80% of the performance of a DGX-A100 at 50% of its cost and 60% of its energy consumption.
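The percentages above imply an efficiency gain that is easy to work out. The sketch below treats the DGX-A100 as a baseline of 1.0 on every axis and computes relative performance per unit of cost and per unit of energy. It is simple arithmetic on the company's reported figures, not an independent measurement, and the variable names are illustrative.

```python
# Reported figures, relative to a DGX-A100 baseline of 1.0
relative_performance = 0.80
relative_cost = 0.50
relative_energy = 0.60

# Performance delivered per unit of cost and per unit of energy
perf_per_cost = relative_performance / relative_cost
perf_per_energy = relative_performance / relative_energy

print(round(perf_per_cost, 2))    # 1.6
print(round(perf_per_energy, 2))  # 1.33
```

In other words, by the company's own numbers the cluster delivers about 1.6 times the performance per unit of cost and about 1.33 times the performance per unit of energy.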
A boost to the open source ecosystem
One of the most striking aspects of this release is that DeepSeek has published the 3FS code under the MIT license, which allows the developer community to access, modify and adapt the system to their own needs. The move is part of the company's Open Source Week initiative, during which it has released other AI-related projects.
The Fire-Flyer File System code is available on GitHub, making it easier to adopt for researchers and companies looking to optimize their workflows in artificial intelligence and high-performance computing.
The arrival of 3FS in the distributed file system landscape provides an alternative to existing solutions such as Ceph, which in benchmark tests achieved just 1.1 TiB/s of read throughput on smaller configurations.
With this launch, DeepSeek demonstrates its commitment to technological innovation applied to artificial intelligence. By offering an efficient and accessible storage system, the company strengthens its position in the sector and provides key tools for the development of new machine learning and advanced computing models.