A general-purpose macro processor, or general-purpose preprocessor, is a macro processor that is not tied to or integrated with a particular language or piece of software.
A macro processor is a program that copies a stream of text from one place to another, making a systematic set of replacements as it does so. Macro processors are often embedded in other programs, such as assemblers and compilers. Sometimes they are standalone programs that can be used to process any kind of text.
Macro processors have been used for language expansion (defining new language constructs that can be expressed in terms of existing language components), for systematic text replacements that require decision making, and for text reformatting (e.g. conditional extraction of material from an HTML file).
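The copy-and-replace behaviour described above can be illustrated with a minimal Python sketch. It is not modelled on any particular processor in this article: the function name `expand`, the rule that macro names are whole words, and the `max_depth` limit are all assumptions made for the example.

```python
import re

def expand(text, macros, max_depth=10):
    """Copy text to the output, replacing each macro name with its body.

    Assumptions for this sketch: macro names are plain words, macros take
    no arguments, and the output is rescanned until nothing changes.
    """
    if not macros:
        return text
    # Try longer names first so that a long name is preferred over a prefix.
    names = sorted(macros, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    for _ in range(max_depth):
        expanded = pattern.sub(lambda m: macros[m.group(1)], text)
        if expanded == text:
            return expanded  # no macro calls left
        text = expanded
    # A macro whose body calls itself would otherwise expand forever.
    raise RecursionError("macro expansion did not terminate")

macros = {"GREETING": "Hello, NAME", "NAME": "world"}
print(expand("GREETING!", macros))  # prints "Hello, world!"
```

Rescanning the output is what lets one macro be defined in terms of another; real macro processors differ widely in when and how they rescan.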
Examples of general-purpose macro processors
Name
Year
Description
GPM
1960s
One of the earliest macro processors was GPM (the General Purpose Macrogenerator).[1] It was developed at the University of Cambridge, UK, in the mid-1960s, under the direction of Christopher Strachey.
ML/I
1960s
One particularly important general-purpose macro processor was (and still is) ML/I (Macro Language One). It was developed as part of PhD research by a Cambridge postgraduate, Peter J. Brown. ML/I operates on a character stream, and requires no special format for its input, nor any special flag characters to introduce macros.
STAGE2
1960s
A contemporary of ML/I was STAGE2,[2] part of William Waite's Mobile Programming System.[3] It too is a general-purpose macro processor, but it processes input a line at a time, matching each line against specified patterns. It is notable for being independent of character set, requiring only that the digits 0-9 be contiguous and in that order (a condition not met by some of the 6-bit and BCD character codes of the era).
SNOBOL
SNOBOL is a string-processing language capable of most of the preprocessing that can be done by a macro processor.
XPOP
XPOP was another attempt at a general macro processing language by Mark Halpern at IBM in the 1960s.
TTM
1968
TTM is a recursive, interpretive language designed primarily for string manipulation, text editing, macro definition and expansion, and other applications generally classified as systems programming. It was developed in 1968 by Steven Caine and E. Kent Gordon at the California Institute of Technology. It is derived, primarily, from GAP[5] and GPM.[1]
GMP
1970s
Another attempt was GMP (General Macro Processor), developed in the mid-1970s by M. Boule in the DLB/GC department of the CII Company, following ideas from R. J. Chevance. The first version, tested in association with the Bordeaux I University, ran on the SIRIS8/IRIS80 System. It was ported to mini6 systems, where it was the main component used in system generation for that family of computers. GMP used C2-Chomsky grammars to define the syntax of macros and an imperative language to perform computations and carry out macro expansion.
Software: Practice and Experience, Vol. 14, pp. 519–531, Jun. 1984
gema
1995
gema is a contextual macro processor based on pattern matching, written by David N. Gray. It replaces or enhances the concept of regular expressions with contexts. Contexts roughly correspond to named sets of patterns. As a consequence, macros in gema closely resemble an EBNF description.[7]
GPP
1996
gpp is another general-purpose macro processor, written by Denis Auroux. It resembles a C preprocessor, but has more general semantics and allows for customized syntax (for instance, TeX, XHTML, and Prolog-like scripts are definable).[8]
M5
1999
m5 is a general-purpose macro processor written by William A. Ward, Jr. Unlike many macro processors, m5 does not directly interpret its input. Instead, it uses a two-pass approach in which the first pass translates the input to an awk program, and the second pass executes the awk program to produce the final output.
pyexpander
2011
pyexpander is a general-purpose macro processor based on the Python programming language. In addition to simple macro replacement, it allows evaluation of arbitrary Python expressions and execution of Python code.
Text Assembler
2014
Text Assembler is a general-purpose text/macro processor based on the JavaScript programming language. Beyond simple macro replacement, it allows evaluating arbitrary JavaScript expressions and executing JavaScript code. It can also load JSON data models for more complex data-driven text processing tasks.[9]
PP
2016
PP is a text preprocessor designed for Pandoc (and more generally Markdown and reStructuredText). It implements macros, literate programming, diagrams (GraphViz, PlantUML and ditaa), and Bash, Cmd, PowerShell, Python and Haskell scripts.[10]
minimac
minimac is a minimalist general-purpose macro processor. It operates as a character stream filter, recursively expanding macros as they are encountered. It is unusual for a macro processor in that it uses an explicit argument stack, and user functions are defined by concatenation (similar to the Forth language).[11]
aa_macro
2017
aa_macro is an open-source character-stream-based text processing language written in Python. Text is processed in a left-to-right, inside-to-outside manner. A selection of predefined built-in functions provides fundamental processing mechanisms that may be used directly or as elements of user-defined styles. The language is user-extensible, and wtfm, an open-source web-based document preparation wrapper for the language, is available.[12][13]
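The line-at-a-time template matching described for STAGE2 in the table above, in which generated lines are resubmitted for further matching, can be sketched roughly in Python. This is an illustration of the general idea only, not STAGE2's actual notation or algorithm: the `*` parameter marker, the `$1`/`$2` references in macro bodies, and the names `compile_template` and `process_line` are all hypothetical.

```python
import re

def compile_template(template):
    """Turn a template such as 'swap * and *' into a regular expression.

    Hypothetical notation for this sketch: '*' marks a parameter position.
    """
    literal_parts = [re.escape(part) for part in template.split("*")]
    return re.compile("^" + "(.*?)".join(literal_parts) + "$")

def process_line(line, macros, depth=0, max_depth=50):
    """Match one line against the templates, in order.

    On a match, each body line has $1, $2, ... replaced by the matched
    parameters and is then resubmitted for further matching. A line that
    matches no template is copied to the output unchanged.
    """
    if depth > max_depth:
        raise RecursionError("template expansion did not terminate")
    for regex, body in macros:
        match = regex.match(line)
        if match:
            output = []
            for body_line in body:
                constructed = re.sub(
                    r"\$(\d)",
                    lambda ref: match.group(int(ref.group(1))),
                    body_line,
                )
                output.extend(process_line(constructed, macros, depth + 1))
            return output
    return [line]

macros = [
    (compile_template("swap * and *"), ["tmp = $1", "$1 = $2", "$2 = tmp"]),
    (compile_template("* = *"), ["LOAD $2", "STORE $1"]),
]
for out in process_line("swap a and b", macros):
    print(out)
```

Here the `swap` macro expands into three assignment lines, each of which is matched again and expanded by the assignment macro into `LOAD`/`STORE` lines, showing how resubmission lets higher-level constructs be defined in terms of lower-level ones.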
Waite, William M. (July 1970). "The mobile programming system: STAGE2". Communications of the ACM. 13 (7). New York, NY, USA: ACM: 415–421. doi:10.1145/362686.362691.
Britten, Charles Randyl (2020-06-26). "Translation of 8080 Code to 8086 - Microsoft Translation of 8080 Code to 8086 and Other 16-Bit Processors". Archived from the original on 2021-07-23. Retrieved 2021-11-28. Stage2 was created by Prof William Waite at the University of Colorado in the late sixties as a major component of his mobile programming system, MPS. Stage2 uses a pattern matching algorithm to match input lines of text against a set of templates. Each template is the first line of a macro and when a match is recognized the code body of that macro is processed to produce output text, error messages, or create a constructed line that is submitted for further template matching. So the process is fully recursive and quite powerful in its capabilities for text transformation. In fact, it can be used to implement a programming language compiler.
Cole, A. J. (1981). Macro Processors (2nd, revised ed.). CUP Archive. p. 254.
Farber, D. J. (1964). 635 Assembly System - GAP. Bell Telephone Laboratories Computation Center.
Kernighan, Brian W.; Plauger, P. J. (1976). Software Tools. Reading, Massachusetts: Addison-Wesley. p. 283. ISBN 0-201-03669-X.