ArangoDB is a graph database system developed by ArangoDB Inc. ArangoDB is a multi-model database system since it supports three data models (graphs, JSON documents, key/value)[1] with one database core and a unified query language AQL (ArangoDB Query Language). AQL is mainly a declarative language[2] and allows the combination of different data access patterns in a single query.[3]
ArangoDB is a NoSQL database system[4] but AQL is similar in many ways to SQL,[5] it uses RocksDB as a storage engine.
History
ArangoDB GmbH was founded in 2014 by Claudius Weinberger and Frank Celler.[6] They originally called the database system “A Versatile Object Container", or AVOC for short, leading them to call the database AvocadoDB.[7][8][9] Later, they changed the name to ArangoDB.[10] The word "arango" refers to a little-known avocado variety grown in Cuba.[11]
In January 2017 ArangoDB raised a seed round investment of 4.2 million Euros led by Target Partners. In March 2019 ArangoDB raised 10 million dollars in series A funding[12] led by Bow Capital. In October 2021 ArangoDB raised 27.8 million dollars in series B funding led by Iris Capital.[13]
Release history
Release
First Release
Latest Minor Version
Latest Release
Feature Notes
Reference
3.11
2023-05-30
3.11.5
2023-11-09
Faster query performance across search and graph.
Data science and analytics operational enhancements.
Improved user experience for database administration and management.
Collections replicated on all cluster nodes can be combined with graphs sharded by document attributes to enable more local execution of graph queries ("Hybrid SmartGraphs", "Hybrid Disjoint SmartGraphs").
Language-agnostic tokenization of text ("Segmentation Analyzer").
Graph traversal algorithms to enumerate all paths between two vertices ("k Paths") and to emit paths in order of increasing edge weights ("Weighted Traversals").
Support for sliding window queries to aggregate adjacent documents, value ranges and time intervals.
Geo-spatial queries can be combined with full-text search.
Flexible data field pre-processing with custom queries ("AQL Analyzer") and the ability to chain built-in and custom analyzers ("Pipeline Analyzer").
Option to store all collections of a database on a single cluster node, to combine the performance of a single server and ACID semantics with a fault-tolerant cluster setup ("OneShard").
Parallel execution of queries on several cluster nodes.
Late document materialization to only fetch the relevant documents from SORT/LIMIT queries and early pruning of non-matching documents in full collection scans.
Inlining of certain subqueries to improve execution time.
JSON: ArangoDB uses JSON as a default storage format,[14] but internally it uses ArangoDB VelocyPack – a fast and compact binary format for serialization and storage.[15] ArangoDB can natively store a nested JSON object as a data entry inside a collection. Therefore, there is no need to disassemble the resulting JSON objects. Thus, the stored data would simply inherit the tree structure of the JSON data.
Predictable performance: ArangoDB is written mainly in C++[16] and manages its own memory to avoid unpredictable performance arising from garbage collection.
Scaling: ArangoDB provides scaling through clustering.[17]
Kubernetes: ArangoDB runs on Kubernetes, including cloud-based Kubernetes services Amazon Elastic Kubernetes Service (EKS), Google Kubernetes Engine (GKE), and Microsoft Azure Kubernetes Service (AKS).[19]
Microservices: ArangoDB provides integration with native JavaScript microservices directly on top of the DBMS using the Foxx framework.[20]
Multiple query languages: The database has its own query language, AQL (ArangoDB Query Language), and also provides GraphQL to write flexible native web services directly on top of the DBMS.[21]
Search: ArangoDB's search engine combines boolean retrieval capabilities with generalized ranking components allowing for data retrieval based on a precise vector space model.[22]
Pregel algorithm: Pregel is a system for large scale graph processing.[23] Pregel is implemented in ArangoDB and can be used with predefined algorithms, e.g. PageRank, Single-Source Shortest Path and Connected components.[24]
Transactions: ArangoDB supports user-definable transactions. Transactions in ArangoDB are atomic, consistent, isolated, and durable (ACID), but only if data is not sharded.[25]
AQL (ArangoDB Query Language) is the SQL-like query language[26] used in ArangoDB. It supports CRUD operations for both documents (nodes) and edges, but it is not a data definition language (DDL). AQL does support geospatial queries.
// Return every document in a collectionFORdocINcollectionRETURNdoc// Count the number of documents in a collectionFORdocINcollectionCOLLECTWITHCOUNTINTOlengthRETURNlength// Add a new document into our collectionINSERT{_key:"john",name:"John",age:45}INTOcollection// Update document with key of “john” to have age 46.UPDATE{_key:"john",age:46}INcollection// Add an attribute numberOfLogins for all users with status active:FORuINusersFILTERu.active==trueUPDATEuWITH{numberOfLogins:0}INusers
Editions
Community Edition: ArangoDB Community Edition is a graph database with native multi-model database capabilities written mainly in C++ and was available under an open-source license (Apache 2) until October 2023. It was then changed to "ArangoDB Community License, which limits its use for commercial purposes and imposes a 100GB limit on dataset size within a single cluster" [27]
Commercial self-managed: ArangoDB Enterprise is a paid subscription that includes graph-aware sharding (called “SmartGraphs”)[28] and collection replication (called “Satellite Collections”) to reduce query times,[29] and increased security.[30]
Cloud: ArangoDB is offered as a cloud service called Oasis, providing ArangoDB databases as a Service (DBaaS). ArangoDB Oasis provides the functionality of an ArangoDB cluster deployment while minimizing the amount of administrative effort required.[31] ArangoDB Oasis run on multiple cloud service providers, include AWS, Azure, and Google Cloud.[32]