Ctags is a programming tool that generates an index file (or tag file) of names found in source and header files of various programming languages to aid code comprehension. Depending on the language, functions, variables, class members, macros and so on may be indexed. These tags allow definitions to be quickly and easily located by a text editor, a code search engine, or other utility. Alternatively, there is also an output mode that generates a cross reference file, listing information about various names found in a set of language files in human-readable form.
There are a few other implementations of the ctags program:
Etags
GNU Emacs comes with two ctags utilities, etags and ctags, which are compiled from the same source code. Etags generates a tag table file for Emacs, while the ctags command is used to create a similar table in a format understood by vi. They have different sets of command line options:
etags does not recognize, and therefore ignores, options that only make sense for the vi-style tag files produced by the ctags command.[6]
Exuberant Ctags
Exuberant Ctags, written and maintained by Darren Hiebert until 2009,[7] was initially distributed with Vim, but became a separate project upon the release of Vim 6. It includes support for Emacs and etags compatibility.[8][9]
Exuberant Ctags includes support for over 40 programming languages with the ability to add support for even more using regular expressions.
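The idea of regex-driven tagging can be illustrated with a minimal Python sketch. It does not reproduce Exuberant Ctags's own option syntax or implementation; the pattern, the imaginary "def NAME" language, and the function name regex_tags are all hypothetical, and the output uses plain line numbers as tag addresses in the three-field layout described under "Tags file formats".

```python
import re

# Hypothetical pattern for an imaginary language in which functions are
# declared as "def NAME"; group 1 captures the tag name.
FUNCTION_PATTERN = re.compile(r"^\s*def\s+([A-Za-z_]\w*)")


def regex_tags(filename, source, pattern=FUNCTION_PATTERN):
    """Return 'name<TAB>file<TAB>line' entries, sorted by tag name."""
    found = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        match = pattern.match(text)
        if match:
            found.append((match.group(1), filename, str(lineno)))
    found.sort(key=lambda entry: entry[0])  # tags files are sorted on the name
    return ["\t".join(entry) for entry in found]


if __name__ == "__main__":
    demo = "def zeta\nx = 1\n  def alpha\n"
    for entry in regex_tags("demo.src", demo):
        print(entry)
```

A real tool would also need to handle comments, strings, and multi-line declarations, which a single line-anchored pattern cannot do reliably.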
Universal Ctags
Universal Ctags is a fork of Exuberant Ctags, with the objective of continuing its development. Several parsers have been rewritten to better support their languages.[10]
Language-specific
Hasktags creates ctags-compatible tag files for Haskell source files.[11] It includes support for creating Emacs etags files.[12]
jsctags is a ctags-compatible code indexing solution for JavaScript.[13] It is specialized for JavaScript and uses the CommonJS packaging system. It outperforms Exuberant Ctags for JavaScript code, finding more tags than the latter.[14]
Tags file formats
There are multiple tag file formats. Some of them are described below. In the following, \x## represents the byte with hexadecimal representation ##. Every line ends with a line feed (LF, \n = \x0A).
Ctags and descendants
The original ctags and the Exuberant/Universal descendants have similar file formats:[15]
Ctags
This is the format used by vi and various clones. The tags file is normally named "tags".
The tags file is a list of lines, each line in the format:
{tagname}\t{tagfile}\t{tagaddress}
The fields are specified as follows:
{tagname} – Any identifier, not containing white space
\t – Exactly one tab (\x09) character, although many versions of vi can handle any amount of white space.
{tagfile} – The name of the file where {tagname} is defined, relative to the current directory
{tagaddress} – An ex mode command that will take the editor to the location of the tag. For POSIX implementations of vi this may only be a search or a line number, providing added security against arbitrary command execution.
The tags file is sorted on the {tagname} field which allows for fast searching of the tags file.
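As a rough illustration of why sorting matters, the following Python sketch parses lines in the three-field format and looks a name up by binary search. The names Tag, parse_tag_line and find_tag, as well as the sample entries, are hypothetical; the sketch also assumes the whole file fits in memory and that Python's string ordering matches the order in which the file was sorted, which holds for plain ASCII identifiers.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    name: str     # {tagname}
    file: str     # {tagfile}
    address: str  # {tagaddress}: a line number or a search command


def parse_tag_line(line: str) -> Tag:
    """Split one line of a classic tags file into its three fields."""
    # Split on the first two tabs only, so a search pattern that itself
    # contains a tab stays inside the address field.
    name, file, address = line.rstrip("\n").split("\t", 2)
    return Tag(name, file, address)


def find_tag(tags, name: str) -> Optional[Tag]:
    """Binary-search a list of Tag objects that is sorted by name."""
    lo, hi = 0, len(tags)
    while lo < hi:
        mid = (lo + hi) // 2
        if tags[mid].name < name:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(tags) and tags[lo].name == name:
        return tags[lo]
    return None


# Made-up sample content, already sorted on the tag name.
SAMPLE = (
    "main\tmain.c\t/^int main(void)$/\n"
    "parse_args\targs.c\t42\n"
    "usage\tmain.c\t/^static void usage(void)$/\n"
)
TAGS = [parse_tag_line(line) for line in SAMPLE.splitlines()]
```

With the sample above, find_tag(TAGS, "parse_args") returns the entry pointing at line 42 of args.c, and a name that is not present yields None.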
Extended Ctags
This is the format used by Vim's Exuberant Ctags and Universal Ctags. These programs can generate an original ctags file format or an extended format that attempts to retain backward compatibility.
The extended tags file is a list of lines, each line in the format:
{tagname}\t{tagfile}\t{tagaddress}[;"\t{tagfield}...]
The fields up to and including {tagaddress} are the same as for ctags above.
Optional additional fields are indicated by square brackets ("[...]") and include:
;" – semicolon + double quote: Ends the {tagaddress} in a way that looks like the start of a comment to vi or ex.
{tagfield} – extension fields: tab separated "key:value" pairs for more information.
This format is compatible with non-POSIX vi as the additional data is interpreted as a comment. POSIX implementations of vi must be changed to support it, however.[15]
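A reader for the extended format can be sketched in Python as follows. The function name parse_extended_tag_line and the sample lines are hypothetical. The sketch naively splits at the first ;" followed by a tab, so it would mis-split a search pattern that happens to contain that exact sequence. Treating a field without a colon as the tag's kind is an assumption taken from Vim's description of the format, not something stated above.

```python
def parse_extended_tag_line(line):
    """Split an extended tags line into the basic fields plus a dict of extension fields."""
    name, file, rest = line.rstrip("\n").split("\t", 2)
    # The ;" marker ends the tag address; extension fields follow after a tab.
    address, marker, extension = rest.partition(';"\t')
    fields = {}
    if marker:
        for item in extension.split("\t"):
            if not item:
                continue  # tolerate a stray trailing tab
            key, colon, value = item.partition(":")
            if colon:
                fields[key] = value
            else:
                # Assumption: a field without a colon names the tag's kind.
                fields["kind"] = item
    elif address.endswith(';"'):
        # Marker present but no extension fields after it.
        address = address[:-2]
    return {"name": name, "file": file, "address": address, "fields": fields}


if __name__ == "__main__":
    print(parse_extended_tag_line('main\tmain.c\t/^int main(void)$/;"\tkind:f\tline:10\n'))
```

A line in the original three-field format passes through the same function unchanged, with an empty fields dictionary, which mirrors the backward compatibility the format aims for.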
Etags
This is the format used by Emacs etags. The tags file is normally named "TAGS".
An etags file consists of multiple sections, one per input source file. Sections are plain text, with several non-printable ASCII characters used for special purposes. These characters are represented as underlined hexadecimal codes below.
A section starts with a two line header (the first two bytes make up a magic number):