The following is provided as an overview of and topical guide to databases:
Database – organized collection of data, today typically in digital form. The data are typically organized to model relevant aspects of reality (for example, the availability of rooms in hotels), in a way that supports processes requiring this information (for example, finding a hotel with vacancies).
What type of things are databases?
Databases can be described as all of the following:
Information – sequence of symbols that can be interpreted as a message. Information can be recorded as signs, or transmitted as signals.
Data – values of qualitative or quantitative variables, belonging to a set of items. Data in computing (or data processing) are often represented by a combination of items organized in rows and multiple variables organized in columns. Data are typically the results of measurements and can be visualised using graphs or images.
Computer data – information in a form suitable for use with a computer. Data is often distinguished from programs. A program is a sequence of instructions that detail a task for the computer to perform. In this sense, data is everything in software that is not program code.
Types of databases
Active database – includes an event driven architecture (often in the form of ECA rules) which can respond to conditions both inside and outside the database.
Animation database – stores fragments of animations or human movements that can be accessed, analyzed and queried to develop and assemble new animations.
Back-end database – accessed by users indirectly through an external application rather than by application programming stored within the database itself or by low level manipulation of the data (e.g. through SQL commands).
Bibliographic database – database of bibliographic records, an organized digital collection of references to published literature, including journal and newspaper articles, conference proceedings, reports, government and legal publications, patents, books, etc.
Centralized database – database located and maintained in one location, unlike a distributed database.
Cloud database – runs on a cloud computing platform, such as Amazon EC2, GoGrid and Rackspace.
Collection database – collection catalog of a museum or archive implemented using a computerized database, in which the institution's objects or material are catalogued.
Collective Optimization Database – open repository that enables the community to share benchmarks, data sets and optimization cases, and that provides web services and plug-ins to analyze optimization data and predict program transformations or better hardware designs for multi-objective optimization. The predictions are based on statistical and machine learning techniques, provided enough information has been collected in the repository from multiple users.
Correlation database – database management system (DBMS) that is data model independent and designed to efficiently handle unplanned, ad hoc queries in an analytical system environment.
Current database – conventional database that stores data that is valid now.
Directory – repository or database of information which is optimized for reading, under the assumption that data updates are very rare compared to data reads. Commonly, a directory supports search and browsing in addition to simple lookups.
Distributed database – database in which storage devices are not all attached to a common CPU.
Document-oriented database – computer program designed for storing, retrieving, and managing document-oriented (semi-structured) information.
EDA database – database specialized for the purpose of electronic design automation.
Endgame tablebase – computerized database that contains precalculated exhaustive analysis of a chess endgame position.
Food composition database (FCDB) – provides detailed information on the nutritional composition of foods.
Full-text database – database that contains the complete text of books, dissertations, journals, magazines, newspapers or other kinds of textual documents. Also called a "complete-text database".
Government database – collects personal information for various reasons (mass surveillance, Schengen Information System in the European Union, social security, statistics, etc.).
Graph database – uses graph structures with nodes, edges, and properties to represent and store data.
Knowledge base – special kind of database for knowledge management. A knowledge base provides a means for information to be collected, organised, shared, searched and utilised.
Mobile database – can be connected to by a mobile computing device over a mobile network.
Navigational database – database in which objects (or records) are found primarily by following references from other objects.
Online database – database accessible from a network, including from the Internet.
Operational database – accessed by an operational system to carry out the regular operations of an organization.
Parallel database – improves performance through parallelization of various operations, such as loading data, building indexes and evaluating queries.
Probabilistic database – uncertain database in which the possible worlds have associated probabilities.
Real-time database – processing system designed to handle workloads whose state is constantly changing (Buchmann).
Relational database – collection of data items organized as a set of formally described tables from which data can be accessed easily.
Spatial database – database that is optimized to store and query data that is related to objects in space, including points, lines and polygons.
Temporal database – database with built-in time aspects, for example a temporal data model and a temporal version of Structured Query Language (SQL).
Time series database – database optimized for time series, where a time series is an associative array of numbers indexed by a datetime or a datetime range. Time series are often called profiles or curves, depending upon the market: a time series of stock prices might be called a price curve, and a time series of energy consumption might be called a load profile. Despite the disparate naming, the operations performed on them are sufficiently common as to demand special database treatment.
Triplestore – purpose-built database for the storage and retrieval of triples, a triple being a data entity composed of subject-predicate-object, like "Bob is 35" or "Bob knows Fred".
Very large database (VLDB) – contains an extremely high number of tuples (database rows), or occupies an extremely large physical filesystem storage space.
Virtual private database (VPD) – masks data in a larger database so that security allows only the use of apparently private data.
Vulnerability database – platform aimed at collecting, maintaining, and disseminating information about discovered vulnerabilities targeting real computer systems.
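The triplestore entry above can be made concrete with a small sketch. The class below is a hypothetical in-memory illustration (the name TinyTripleStore and its methods are assumptions, not any real product's API): it stores subject-predicate-object triples and answers pattern queries in which None acts as a wildcard.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TinyTripleStore:
    """Hypothetical in-memory store of (subject, predicate, object) triples."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        # A set keeps each triple at most once.
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return matching triples in sorted order; None matches anything."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple
            for triple in self._triples
            if all(p is None or p == v for p, v in zip(pattern, triple))
        )


store = TinyTripleStore()
store.add("Bob", "age", "35")
store.add("Bob", "knows", "Fred")
store.add("Fred", "knows", "Alice")

print(store.match(subject="Bob"))      # everything known about Bob
print(store.match(predicate="knows"))  # every "knows" relationship
```

A production triplestore would typically add indexes over each position of the triple and a query language; the linear scan here only shows the data model.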
Database theory – encapsulates a broad range of topics related to the study and research of the theoretical realm of databases and database management systems.
Database machine – computer or special hardware that stores and retrieves data from a database. Also called a "back end processor".
Database server – computer program that provides database services to other computer programs or computers, as defined by the client-server model.
Database application – computer program whose primary purpose is entering and retrieving information from a computer-managed database.
Database management system (DBMS) – software package with computer programs that control the creation, maintenance, and use of a database.
Database connection – facility in computer science that allows client software to communicate with database server software, whether on the same machine or not.
Datasource – name given to the connection set up to a database from a server. The name is commonly used when creating a query to the database. The data source name (DSN) does not have to be the same as the filename of the database. For example, a database file named "friends.mdb" could be set up with a DSN of "school"; the DSN "school" would then be used to refer to the database when performing a query.
Data Source Name (DSN) – data structure used to describe a connection to a data source. Sometimes known as a database source name, though data sources are not limited to databases.
Database administrator (DBA) – person responsible for the installation, configuration, upgrade, administration, monitoring and maintenance of physical databases.
Comparison of database tools – provides tables comparing general and technical information for a number of available database administrator tools.
Database-centric architecture – software architectures in which databases play a crucial role. Also called "data-centric architecture".
Intelligent database – system, as originally put forward, that manages information (rather than data) in a way that appears natural to users and that goes beyond simple record keeping.
Two-phase locking (2PL) – concurrency control method that guarantees serializability.
Locks with ordered sharing – several variants of the two-phase locking (2PL) concurrency control protocol, generated by changing the blocking semantics of locks upon conflicts.
Load file – in the litigation community, the file used to import data (coded, captured or extracted data from ESI processing) into a database, or the file used to link images.
Database publishing – area of automated media production in which specialized techniques are used to generate paginated documents from source data residing in traditional databases.
Halloween Problem – a phenomenon in databases in which an update operation causes a change in the physical location of a row, potentially allowing the row to be visited more than once during the operation.
Log shipping – process of automating the backup of a database and transaction log files on a primary (production) database server, and then restoring them onto a standby server.
Information retrieval query language – query language used to make queries against a database, where the semantics of the query are defined not by a precise rendering of a formal syntax, but by an interpretation of the most suitable results of the query.
SQL (Structured Query Language) – special-purpose programming language designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS).
XQuery – a query and functional programming language that queries and transforms collections of structured and unstructured data.
Database activity monitoring (DAM) – database security technology for monitoring and analyzing database activity that operates independently of the database management system (DBMS) and does not rely on any form of native (DBMS-resident) auditing or native logs such as trace or transaction logs.
Database normalization – process of organizing the fields and tables of a relational database to minimize redundancy and dependency.
Database refactoring – simple change to a database schema that improves its design while retaining both its behavioral and informational semantics.
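The database normalization entry above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table and column names (orders_flat, customers, orders) are invented for the example. A flat table that repeats each customer's city on every order is split so that the city is stored once, and a single update then reaches every order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Denormalized: the customer's city is repeated on every order row.
cur.execute(
    "CREATE TABLE orders_flat ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, city TEXT, item TEXT)"
)
cur.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?, ?)",
    [(1, "Ann", "Oslo", "lamp"), (2, "Ann", "Oslo", "desk"), (3, "Raj", "Pune", "chair")],
)

# Normalized: customer facts live in one table; orders refer to them by key.
cur.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, name TEXT UNIQUE, city TEXT)"
)
cur.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(customer_id), item TEXT)"
)
cur.execute(
    "INSERT INTO customers (name, city) "
    "SELECT DISTINCT customer, city FROM orders_flat"
)
cur.execute(
    "INSERT INTO orders (order_id, customer_id, item) "
    "SELECT f.order_id, c.customer_id, f.item "
    "FROM orders_flat AS f JOIN customers AS c ON c.name = f.customer"
)

# The city is now stored once per customer, so one UPDATE changes it everywhere.
cur.execute("UPDATE customers SET city = 'Bergen' WHERE name = 'Ann'")
conn.commit()

rows = cur.execute(
    "SELECT o.order_id, c.name, c.city, o.item "
    "FROM orders AS o JOIN customers AS c USING (customer_id) "
    "ORDER BY o.order_id"
).fetchall()
print(rows)
```

In the flat table the same change would have to touch every one of Ann's rows, which is the kind of redundancy normalization is meant to remove.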
Database programming
Database abstraction layer – application programming interface which unifies the communication between a computer application and databases such as SQL Server, DB2, MySQL, PostgreSQL, Oracle or SQLite.
Object–relational mapping (ORM, O/RM, and O/R mapping) – programming technique for converting data between incompatible type systems, typically those of a relational database and an object-oriented programming language.
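As a rough illustration of object–relational mapping, the sketch below maps a Python dataclass to rows of a SQLite table by hand. The Person and PersonMapper names are hypothetical and stand in for what a real ORM library would generate or hide; only the standard sqlite3 and dataclasses modules are used.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    """Application-side object; id is None until the row has been stored."""
    id: Optional[int]
    name: str
    age: int


class PersonMapper:
    """Hypothetical hand-written mapper between Person objects and table rows."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS person ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER NOT NULL)"
        )

    def save(self, person: Person) -> Person:
        # Object -> row: the attributes become column values.
        cur = self.conn.execute(
            "INSERT INTO person (name, age) VALUES (?, ?)",
            (person.name, person.age),
        )
        person.id = cur.lastrowid
        return person

    def find_by_name(self, name: str) -> List[Person]:
        # Row -> object: each result tuple is turned back into a Person.
        rows = self.conn.execute(
            "SELECT id, name, age FROM person WHERE name = ? ORDER BY id",
            (name,),
        ).fetchall()
        return [Person(*row) for row in rows]


conn = sqlite3.connect(":memory:")
mapper = PersonMapper(conn)
bob = mapper.save(Person(id=None, name="Bob", age=35))
found = mapper.find_by_name("Bob")
print(found)
```

Application code works only with Person objects, while the mapper owns all SQL; full ORM libraries extend the same idea with relationships, change tracking and query builders.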
Database management
Database virtualization – decoupling of the database layer, which lies between the storage and application layers within the application stack.
Database tuning – describes a group of activities used to optimize and homogenize the performance of a database.
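Database caching – technique of keeping frequently accessed data or query results in a faster store so that repeated requests avoid a full database access; used to achieve high scalability and performance.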
Database preservation – usually involves converting the information stored in a database, without losing the characteristics (context, content, structure, appearance and behaviour) of the data, to a format that can be used in the long term, even if the technology and everyday knowledge change.
Database integrity – ensures that data entered into the database is accurate, valid, and consistent.
Federated database system – type of meta-database management system (DBMS), which transparently maps multiple autonomous database systems into a single federated database.
Relational algebra – offshoot of first-order logic (and of the algebra of sets) that deals with a set of finitary relations (see also relation (database)) that is closed under certain operators.
Relational calculus – consists of two calculi, the tuple relational calculus and the domain relational calculus, that are part of the relational model for databases and provide a declarative way to specify database queries.
Relational database – collection of data items organized as a set of formally described tables from which data can be accessed easily.
Relational model – database model based on first-order predicate logic, first formulated and proposed in 1969 by Edgar F. Codd.
Object–relational database (ORD) – database management system (DBMS) similar to a relational database, but with an object-oriented database model: objects, classes and inheritance are directly supported in database schemas and in the query language. Also called object–relational database management system (ORDBMS).
Superkey – set of attributes of a relation variable for which it holds that in all relations assigned to that variable, there are no two distinct tuples (rows) that have the same values for the attributes in this set.
Surrogate key – unique identifier in a database for either an entity in the modeled world or an object in the database.
Armstrong's axioms – set of axioms (or, more precisely, inference rules) used to infer all the functional dependencies on a relational database.
NoSQL – class of database management systems identified by non-adherence to the widely used relational database management system (RDBMS) model.
Transaction log – history of actions executed by a database management system to guarantee ACID properties over crashes or hardware failures. Also called "transaction journal", "database log" or "binary log".
Database trigger – procedural code that is automatically executed in response to certain events on a particular table or view in a database.
Concurrency control – ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.
Data dictionary – as defined in the IBM Dictionary of Computing, is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format." Also called a "metadata repository".
Query optimizer – component of a database management system that attempts to determine the most efficient way to execute a query.
Query plan – ordered set of steps used to access or modify information in a SQL relational database management system. Also called a "query execution plan".
Query optimization – function of many relational database management systems in which multiple query plans for satisfying a query are examined and a good query plan is identified.
Document-oriented database – computer program designed for storing, retrieving, and managing document-oriented (semi-structured) information.
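The query optimizer and query plan entries above can be observed directly in SQLite, which reports its chosen plan through the EXPLAIN QUERY PLAN statement. The sketch below is only an illustration: the users table and the idx_users_email index are invented names, and the exact wording of the plan text differs between SQLite versions, so the code prints the plans rather than assuming a particular string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

QUERY = "SELECT name FROM users WHERE email = 'bob@example.com'"


def plan(connection: sqlite3.Connection, sql: str) -> list:
    """Return the textual plan steps SQLite reports for a query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each result row holds the description of the step.
    return [row[-1] for row in rows]


# Without an index the optimizer has to scan the whole table.
before = plan(conn, QUERY)

# Once an index on the filtered column exists, the optimizer can search it.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(conn, QUERY)

print("before:", before)
print("after: ", after)
```

The same query text yields a different plan once the index exists, which is the optimizer selecting a cheaper access path among the candidates it considers.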
Database models
Database model – theoretical foundation of a database and fundamentally determines in which manner data can be stored, organized, and manipulated in a database system. It thereby defines the infrastructure offered by a particular database system. The most popular example of a database model is the relational model.
Models
Flat file database – various means to encode a database model (most commonly a table) as a single file.
Graph database – uses graph structures with nodes, edges, and properties to represent and store data.
Object database – database management system in which information is represented in the form of objects as used in object-oriented programming. Also called an "object-oriented database management system".
Triplestore – purpose-built database for the storage and retrieval of triples, a triple being a data entity composed of subject-predicate-object, like "Bob is 35" or "Bob knows Fred".
Column-oriented DBMS – database management system (DBMS) that stores data tables as sections of columns of data rather than as rows of data, like most relational DBMSs.
Dimension table – one of the set of companion tables to a fact table.
Degenerate dimension – dimension key in the fact table that does not have its own dimension table, because all the interesting attributes have been placed in analytic dimensions.
Data extraction – act or process of retrieving data out of (usually unstructured or poorly structured) data sources for further data processing or data storage (data migration).
Data transformation – converts data from a source data format into destination data.
Business intelligence (BI) – ability of an organization to take all its capabilities and convert them into knowledge, ultimately getting the right information to the right people, at the right time, via the right channel.
Data mining – process that results in the discovery of new patterns in large data sets. It is the analysis step of the "Knowledge Discovery in Databases" (KDD) process.
OLAP cube – set of data, organized in a way that facilitates non-predetermined queries for aggregated information, or in other words, online analytical processing.
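The fact table, dimension table and OLAP cube entries above can be sketched with a tiny star schema in SQLite. All names here (fact_sales, dim_product, rollup) and the sales figures are made up for illustration. The hypothetical rollup helper aggregates the same facts along whichever dimensions are requested, which is the kind of non-predetermined query an OLAP cube is organized to answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        year INTEGER,
        amount INTEGER
    );
    INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
    INSERT INTO fact_sales VALUES
        (1, 2023, 100), (1, 2024, 150), (2, 2023, 80), (2, 2024, 70);
    """
)

ALLOWED_DIMENSIONS = {"category", "year"}


def rollup(connection: sqlite3.Connection, *dimensions: str) -> list:
    """Total sales grouped by the chosen dimensions (hypothetical helper)."""
    if not dimensions or not set(dimensions) <= ALLOWED_DIMENSIONS:
        raise ValueError("choose one or more of: category, year")
    # Column names are checked against a whitelist before being interpolated.
    cols = ", ".join(dimensions)
    sql = (
        f"SELECT {cols}, SUM(amount) "
        "FROM fact_sales JOIN dim_product USING (product_id) "
        f"GROUP BY {cols} ORDER BY {cols}"
    )
    return connection.execute(sql).fetchall()


print(rollup(conn, "category"))
print(rollup(conn, "year"))
print(rollup(conn, "category", "year"))
```

A dedicated OLAP engine would usually precompute or cache such aggregates; the sketch recomputes them on each call to keep the idea visible.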