Contrary to many MongoDB deployments, we primarily use it for storing files in GridFS. We switched over to MongoDB after searching for a good distributed file system for years. Prior to MongoDB we used a regular NFS share, sitting on top of a HAST-device. That worked great, but it didn’t allow us to scale horizontally the way a distributed file system allows.
No doubt GridFS is a useful feature of MongoDB, but I’m pretty sure the experts in distributed file systems have better solutions for this. I just hope they’ll share them with us.
Update: Jeff Darcy:
Yes, we do have better solutions for this particular kind of use case. So do object/blob stores like Swift.
Honestly, I don’t think the “searching for a good distributed filesystem” part is even credible. How can someone be that bad at finding readily available information? For example, it’s easier to set up sharding and replication with GlusterFS than with MongoDB and GridFS, plus you’ll get striping and RDMA and generally better performance for this type of workload. On top of all that, you won’t need to use special libraries to interface with it because it’s a regular POSIX filesystem. Lastly, it’s not like there hasn’t been a lot of press about it. Even considering their obvious FreeBSD bias and the fact that FreeBSD is weak in this area, the second item for “FreeBSD distributed filesystem” points to GlusterFS. If they didn’t find it, they just didn’t look very hard before they reached for the New Shiny.
It’s not just GlusterFS, either. MogileFS might not be a real filesystem, but it runs in user space, so it would probably run just fine in their environment, as would the aforementioned Swift. I have more of a problem with the anti-Mongo haters than with Mongo itself. It’s wonderful that these guys found a Mongo-based solution that works for them, but it seems like a bit of an odd choice nonetheless.
Original title and link: MongoDB Replica Sets and Sharding for GridFS as a Distributed File System (©myNoSQL)