The name Gluster comes from the combination of the terms GNU and cluster.[2] Despite the similarity in names, Gluster is not related to the Lustre file system and does not incorporate any Lustre code.
Gluster based its product on GlusterFS, an open-source software-based network-attached filesystem that deploys on commodity hardware.[5] The initial version of GlusterFS was written by Anand Babu Periasamy, Gluster's founder and CTO.[6]
In May 2010 Ben Golub became the president and chief executive officer.[7][8]
Red Hat became the primary author and maintainer of the GlusterFS open-source project after acquiring the Gluster company in October 2011.[4]
The product was first marketed as Red Hat Storage Server, but in early 2015 it was renamed Red Hat Gluster Storage, since Red Hat had also acquired the Ceph file system technology.[9]
Red Hat Gluster Storage is in the retirement phase of its lifecycle, with an end-of-support-life date of December 31, 2024.[10]
Architecture
The GlusterFS architecture aggregates compute, storage, and I/O resources into a global namespace. Each server plus attached commodity storage (configured as direct-attached storage, JBOD, or using a storage area network) is considered to be a node. Capacity is scaled by adding nodes or by adding storage to each node. Performance is increased by deploying storage among more nodes. High availability is achieved by replicating data n-way between nodes.
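The scaling rule described above can be illustrated with a rough capacity model. The following is a minimal sketch, not part of GlusterFS: the `Node` class and `usable_capacity_tb` function are hypothetical names, and the model assumes that every file is stored on `replica_count` nodes and ignores file system overhead.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One server plus its attached storage (hypothetical model)."""
    name: str
    raw_tb: float  # raw capacity attached to this node, in terabytes


def usable_capacity_tb(nodes, replica_count):
    """Usable capacity when every file is stored replica_count times.

    Assumption: overhead is ignored and data is spread evenly.
    """
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    if len(nodes) < replica_count:
        raise ValueError("need at least replica_count nodes")
    return sum(node.raw_tb for node in nodes) / replica_count


# Two nodes with 2-way replication.
cluster = [Node("node1", 10.0), Node("node2", 10.0)]
before = usable_capacity_tb(cluster, replica_count=2)

# Scale out: add two more nodes of the same size.
cluster += [Node("node3", 10.0), Node("node4", 10.0)]
after = usable_capacity_tb(cluster, replica_count=2)

print(before, after)  # 10.0 20.0
```

In this simplified model, doubling the number of nodes doubles usable capacity, while raising the replica count trades capacity for availability.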
Public cloud deployment
For public cloud deployments, GlusterFS offers an Amazon Web Services (AWS) Amazon Machine Image (AMI), which is deployed on Elastic Compute Cloud (EC2) instances rather than physical servers; the underlying storage is Amazon's Elastic Block Store (EBS).[11] In this environment, capacity is scaled by deploying more EBS storage units, performance is scaled by deploying more EC2 instances, and availability is scaled by n-way replication between AWS availability zones.
Private cloud deployment
A typical on-premises or private cloud deployment consists of GlusterFS installed as a virtual appliance on top of multiple commodity servers running hypervisors such as KVM, Xen, or VMware, or installed on bare metal.[12]
GlusterFS is a scale-out network-attached storage file system. It has found applications including cloud computing, streaming media services, and content delivery networks. GlusterFS was developed originally by Gluster, Inc. and then by Red Hat, Inc., as a result of Red Hat acquiring Gluster in 2011.[15]
In June 2012, Red Hat Storage Server was announced as a commercially supported integration of GlusterFS with Red Hat Enterprise Linux.[16] In April 2014, Red Hat bought Inktank Storage, the company behind the Ceph distributed file system, and rebranded the GlusterFS-based Red Hat Storage Server as "Red Hat Gluster Storage".[17]
Design
GlusterFS aggregates various storage servers over Ethernet or InfiniBand RDMA interconnect into one large parallel network file system. It is free software, with some parts licensed under the GNU General Public License (GPL) v3 while others are dual licensed under either GPL v2 or the Lesser General Public License (LGPL) v3. GlusterFS is based on a stackable user space design.
GlusterFS has a client and a server component. Servers are typically deployed as storage bricks, with each server running a glusterfsd daemon to export a local file system as a volume. The glusterfs client process, which connects to servers with a custom protocol over TCP/IP, InfiniBand or Sockets Direct Protocol, creates composite virtual volumes from multiple remote servers using stackable translators. By default, files are stored whole, but striping of files across multiple remote volumes is also possible. The client may mount the composite volume using the GlusterFS native protocol via the FUSE mechanism, mount it using the NFSv3 protocol through a built-in server translator, or access the volume via the gfapi client library. The client may re-export a native-protocol mount, for example via the kernel NFSv4 server, Samba, or the object-based OpenStack Storage (Swift) protocol using the "UFO" (Unified File and Object) translator.
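The idea of stackable translators can be pictured as a chain of layers, each of which receives a file operation, optionally acts on it, and passes it to the layer or layers below. The following is a conceptual sketch only: the class names `Translator`, `Brick`, `Replicate`, and `Trace` are hypothetical and do not correspond to the GlusterFS translator API, which is written in C.

```python
class Translator:
    """Base layer: by default, pass each call to the first child below."""

    def __init__(self, *children):
        self.children = list(children)

    def write(self, path, data):
        return self.children[0].write(path, data)

    def read(self, path):
        return self.children[0].read(path)


class Brick(Translator):
    """Bottom of the stack: stands in for a server's exported directory."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.files = {}

    def write(self, path, data):
        self.files[path] = data

    def read(self, path):
        return self.files[path]  # raises KeyError if the file is absent


class Replicate(Translator):
    """Send every write to all children; read from the first that answers."""

    def write(self, path, data):
        for child in self.children:
            child.write(path, data)

    def read(self, path):
        for child in self.children:
            try:
                return child.read(path)
            except KeyError:
                continue
        raise FileNotFoundError(path)


class Trace(Translator):
    """Record each operation, then hand it to the layer below."""

    def __init__(self, child):
        super().__init__(child)
        self.log = []

    def write(self, path, data):
        self.log.append(("write", path))
        return super().write(path, data)

    def read(self, path):
        self.log.append(("read", path))
        return super().read(path)


# Compose a volume from two bricks, client-side, by stacking layers.
brick_a = Brick("server1:/export")
brick_b = Brick("server2:/export")
volume = Trace(Replicate(brick_a, brick_b))
volume.write("/docs/readme.txt", b"hello")
```

Because each layer exposes the same interface as the one below it, layers can be reordered or added without changing the bricks themselves, which is the property the stackable design relies on.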
The GlusterFS server is intentionally kept simple: it exports an existing directory as-is, leaving it up to client-side translators to structure the store. The clients themselves are stateless, do not communicate with each other, and are expected to have translator configurations consistent with each other. GlusterFS relies on an elastic hashing algorithm, rather than using either a centralized or distributed metadata model. The user can add, delete, or migrate volumes dynamically, which helps to avoid configuration coherency problems. This allows GlusterFS to scale up to several petabytes on commodity hardware by avoiding bottlenecks that normally affect more tightly coupled distributed file systems.
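The general principle behind hash-based placement, as opposed to a metadata server, can be sketched as follows. This is a simplified illustration and not GlusterFS's actual algorithm: the functions `name_hash`, `build_layout`, and `locate` are hypothetical, MD5 is used only as a convenient stand-in hash, and the layout is assumed to be an even split of a 32-bit hash space across bricks.

```python
import hashlib


def name_hash(filename):
    """Map a file name to a 32-bit integer (MD5 is a stand-in here)."""
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_layout(bricks):
    """Split the 32-bit hash space into one contiguous range per brick."""
    span = 2**32 // len(bricks)
    layout = []
    for i, brick in enumerate(bricks):
        start = i * span
        # The last brick absorbs any remainder so the space is fully covered.
        end = 2**32 - 1 if i == len(bricks) - 1 else start + span - 1
        layout.append((start, end, brick))
    return layout


def locate(filename, layout):
    """Return the brick whose range contains the file name's hash."""
    h = name_hash(filename)
    for start, end, brick in layout:
        if start <= h <= end:
            return brick
    raise RuntimeError("layout does not cover the hash space")


bricks = ["server1:/export", "server2:/export", "server3:/export"]
layout = build_layout(bricks)

# Any client holding the same layout computes the same answer,
# without asking a metadata server or another client.
print(locate("/videos/intro.mp4", layout))
```

Since the location is computed from the file name and the shared layout alone, clients need no lookup service and no coordination with each other, which matches the stateless client model described above.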
GlusterFS provides data reliability and availability through two kinds of replication: replicated volumes and geo-replication.[18] Replicated volumes ensure that at least one copy of each file remains available across the bricks, so if one brick fails, the data is still stored and accessible. Geo-replication provides a master-slave model of replication, where volumes are copied across geographically distinct locations. This happens asynchronously and is useful for availability in case of a whole data center failure.
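The asynchronous nature of geo-replication can be illustrated with a toy model in which writes complete on the primary site immediately and are shipped to the secondary site later. This sketch is an assumption-laden illustration, not the GlusterFS implementation: the class `GeoReplicatedVolume` and its methods are hypothetical, and both sites are modeled as in-memory dictionaries.

```python
from collections import deque


class GeoReplicatedVolume:
    """Toy model: a primary site plus a secondary that lags behind it."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = deque()  # changes not yet shipped to the secondary

    def write(self, path, data):
        """Complete the write on the primary and queue it for shipping."""
        self.primary[path] = data
        self.pending.append((path, data))

    def sync(self):
        """Ship queued changes to the secondary, in order; return the count."""
        applied = 0
        while self.pending:
            path, data = self.pending.popleft()
            self.secondary[path] = data
            applied += 1
        return applied


volume = GeoReplicatedVolume()
volume.write("/reports/q1.csv", b"v1")
volume.write("/reports/q1.csv", b"v2")

# Before a sync, the secondary site has not seen the writes yet.
lag_before_sync = len(volume.pending)
shipped = volume.sync()
```

The window between a write and the next sync is the data that could be lost if the primary site fails, which is the trade-off of asynchronous replication compared with the synchronous copies kept by replicated volumes.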
GlusterFS has been used as the foundation for academic research[19][20] and a survey article.[21]
Red Hat markets the software for three markets: "on-premises", public cloud and "private cloud".[22]