Usually, the PC is incremented after fetching an instruction, and holds the memory address of ("points to") the next instruction that would be executed.[6][nb 2]
Processors usually fetch instructions sequentially from memory, but control transfer instructions change the sequence by placing a new value in the PC. These include branches (sometimes called jumps), subroutine calls, and returns. A transfer that is conditional on the truth of some assertion lets the computer follow a different sequence under different conditions.
A branch causes the next instruction to be fetched from elsewhere in memory. A subroutine call not only branches but also saves the preceding contents of the PC somewhere. A return retrieves the saved contents and places them back in the PC, resuming sequential execution with the instruction following the subroutine call.
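The three kinds of control transfer described above can be sketched with a toy interpreter. This is a minimal illustration, not a model of any real processor: the opcode names, the tuple encoding, and the use of a stack to hold saved PC values are all assumptions made for the example.

```python
def run(program):
    """Run a toy program and return the list of values it printed.

    The instruction set is hypothetical; each instruction is a tuple.
    """
    pc = 0        # program counter: index of the next instruction to fetch
    saved = []    # where CALL saves the PC (a stack, in this sketch)
    acc = 0       # a single accumulator register
    out = []
    while True:
        op, *args = program[pc]   # fetch
        pc += 1                   # PC now points to the next instruction
        if op == "LOAD":
            acc = args[0]
        elif op == "ADD":
            acc += args[0]
        elif op == "PRINT":
            out.append(acc)
        elif op == "JUMP":        # branch: continue from elsewhere
            pc = args[0]
        elif op == "JUMP_IF_LT":  # conditional branch: taken only if acc < limit
            if acc < args[0]:
                pc = args[1]
        elif op == "CALL":        # save the return address, then branch
            saved.append(pc)
            pc = args[0]
        elif op == "RET":         # put the saved value back in the PC
            pc = saved.pop()
        elif op == "HALT":
            return out
        else:
            raise ValueError(f"unknown opcode {op!r}")


program = [
    ("LOAD", 0),            # 0
    ("CALL", 5),            # 1: call the subroutine at address 5
    ("JUMP_IF_LT", 3, 1),   # 2: loop back to 1 while acc < 3
    ("PRINT",),             # 3
    ("HALT",),              # 4
    ("ADD", 1),             # 5: subroutine body
    ("RET",),               # 6: resume at the instruction after the call
]

print(run(program))  # [3]
```

Because the PC is incremented before the instruction executes, the value that CALL saves is already the address of the instruction following the call, which is where RET resumes.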
Hardware implementation
In a simple central processing unit (CPU), the PC is a digital counter (which is the origin of the term "program counter") that may be one of several hardware registers. The instruction cycle[8] begins with a fetch, in which the CPU places the value of the PC on the address bus to send it to the memory. The memory responds by sending the contents of that memory location on the data bus. (This is the stored-program computer model, in which a single memory space contains both executable instructions and ordinary data.[9]) Following the fetch, the CPU proceeds to execution, taking some action based on the memory contents that it obtained. At some point in this cycle, the PC will be modified so that the next instruction executed is a different one (typically, incremented so that the next instruction is the one starting at the memory address immediately following the last memory location of the current instruction).
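The cycle just described can be sketched in a few lines. The encoding below is hypothetical (a one-byte opcode optionally followed by a one-byte operand), and a Python list stands in for the single memory that holds both instructions and data. The point is only to show the PC advancing to the address that follows the last byte of the current instruction.

```python
# Hypothetical opcodes and their lengths in bytes.
NOP, LOAD_IMM, ADD_MEM, HALT = 0x00, 0x01, 0x02, 0xFF
LENGTH = {NOP: 1, LOAD_IMM: 2, ADD_MEM: 2, HALT: 1}


def run(memory):
    """Execute from address 0; return (accumulator, addresses fetched)."""
    pc, acc = 0, 0
    fetched = []
    while True:
        opcode = memory[pc]       # fetch: the PC selects the memory location
        fetched.append(pc)
        operand_addr = pc + 1
        pc += LENGTH[opcode]      # step past the last byte of this instruction
        if opcode == LOAD_IMM:    # acc <- the operand byte itself
            acc = memory[operand_addr]
        elif opcode == ADD_MEM:   # acc <- acc + the data at the operand address
            acc = (acc + memory[memory[operand_addr]]) & 0xFF
        elif opcode == HALT:
            return acc, fetched
        # NOP: nothing to do


# Instructions occupy addresses 0-5; address 6 holds ordinary data.
memory = [LOAD_IMM, 40, NOP, ADD_MEM, 6, HALT, 2]

print(run(memory))  # (42, [0, 2, 3, 5])
```

The fetched addresses skip 1 and 4 because those locations hold operands of two-byte instructions, not opcodes.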
Like other processor registers, the PC may be a bank of binary latches, each one representing one bit of the value of the PC.[10] The number of bits (the width of the PC) relates to the processor architecture. For instance, a “32-bit” CPU may use 32 bits to be able to address 2³² units of memory. On some processors, the width of the program counter instead depends on the addressable memory; for example, some AVR microcontrollers have a PC which wraps around after 12 bits.[11]
If the PC is a binary counter, it may increment when a pulse is applied to its COUNT UP input, or the CPU may compute some other value and load it into the PC by a pulse to its LOAD input.[12]
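As a rough software analogy for a PC register of fixed width with COUNT UP and LOAD inputs, consider the sketch below. The class and method names are invented for illustration. Masking to the register width models the discarding of carry bits, so a 12-bit PC wraps from 0xFFF back to 0.

```python
class ProgramCounter:
    """Toy model of a PC register of a fixed width (hypothetical interface)."""

    def __init__(self, width_bits):
        self.width_bits = width_bits
        self.mask = (1 << width_bits) - 1   # e.g. 0xFFF for a 12-bit PC
        self.value = 0

    def count_up(self):
        """Model a pulse on COUNT UP: add one, discarding any carry out."""
        self.value = (self.value + 1) & self.mask

    def load(self, new_value):
        """Model a pulse on LOAD: replace the contents with a computed value."""
        self.value = new_value & self.mask

    def addressable_units(self):
        """Number of distinct addresses a PC of this width can hold."""
        return 1 << self.width_bits


pc = ProgramCounter(width_bits=12)
pc.load(0xFFF)
pc.count_up()            # wraps around to 0
wrapped = pc.value
pc.load(pc.value - 1)    # a backward relative branch from 0 wraps to 0xFFF

print(wrapped, hex(pc.value))                   # 0 0xfff
print(ProgramCounter(32).addressable_units())   # 4294967296
```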
To identify the current instruction, the PC may be combined with other registers that identify a segment or page. This approach permits a PC with fewer bits by assuming that most memory units of interest are within the current vicinity.
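One way such a combination can work is sketched below. The scheme is loosely modeled on real-mode x86 segmentation, where a segment register is shifted left by four bits and added to a 16-bit offset. That choice is an assumption made for illustration, since the text above does not name a specific architecture, and the function name and parameters are hypothetical.

```python
def instruction_address(segment, pc, shift=4, pc_bits=16):
    """Combine a segment value with a narrow PC to form a full address.

    Only the low pc_bits of the PC are used, so incrementing the PC past
    its maximum wraps within the current segment.
    """
    offset = pc & ((1 << pc_bits) - 1)
    return (segment << shift) + offset


print(hex(instruction_address(0x1234, 0x0010)))  # 0x12350
```

With these defaults a 16-bit PC reaches addresses wider than 16 bits, but a transfer to code outside the current segment has to change the segment value as well as the PC.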
Consequences in machine architecture
Use of a PC that normally increments assumes that what a computer does is execute a usually linear sequence of instructions. Such a PC is central to the von Neumann architecture. Thus programmers write a sequential control flow even for algorithms that do not have to be sequential. The resulting “von Neumann bottleneck” led to research into parallel computing,[13] including non-von Neumann or dataflow models that did not use a PC; for example, rather than specifying sequential steps, the high-level programmer might specify desired function and the low-level programmer might specify this using combinatory logic.
This research also led to ways of making conventional, PC-based CPUs run faster, including:
Pipelining, in which different hardware in the CPU executes different phases of multiple instructions simultaneously.
The very long instruction word (VLIW) architecture, where a single instruction can achieve multiple effects.
Techniques such as branch prediction and out-of-order execution, which prepare subsequent instructions for execution outside the regular sequence.
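The overlap that pipelining provides can be illustrated with a small schedule generator. This is an idealized sketch: the three stage names are assumptions, and stalls, branches, and hazards are ignored entirely.

```python
STAGES = ["fetch", "decode", "execute"]   # hypothetical three-stage pipeline


def pipeline_schedule(n_instructions, stages=STAGES):
    """Return one dict per clock cycle mapping stage name -> instruction index."""
    cycles = []
    total_cycles = n_instructions + len(stages) - 1
    for cycle in range(total_cycles):
        active = {}
        for depth, name in enumerate(stages):
            instr = cycle - depth   # instruction occupying this stage now
            if 0 <= instr < n_instructions:
                active[name] = instr
        cycles.append(active)
    return cycles


for cycle, active in enumerate(pipeline_schedule(3)):
    print(cycle, active)
```

For three instructions the schedule takes 5 cycles rather than the 9 needed if each instruction had to finish all three stages before the next one began. In cycle 2 all three stages are busy with different instructions.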
Consequences in high-level programming
Modern high-level programming languages still follow the sequential-execution model and, indeed, a common way of identifying programming errors is with a “procedure execution” in which the programmer's finger identifies the point of execution as a PC would. The high-level language is essentially the machine language of a virtual machine,[14] too complex to be built as hardware but instead emulated or interpreted by software.
However, new programming models transcend sequential-execution programming:
When writing a multi-threaded program, the programmer may write each thread as a sequence of instructions without specifying the timing of any instruction relative to instructions in other threads.
In event-driven programming, the programmer may write sequences of instructions to respond to events without specifying an overall sequence for the program.
In dataflow programming, the programmer may write each section of a computing pipeline without specifying the timing relative to other sections.
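As one illustration of the models listed above, here is a minimal event-driven sketch. The `Dispatcher` class and the event names are invented for the example. Each handler is still an ordinary sequence of instructions, but the order in which handlers run is decided by the arriving events, not by the program text.

```python
from collections import defaultdict


class Dispatcher:
    """Tiny event dispatcher (hypothetical; not a real library API)."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def on(self, event_name, handler):
        """Register a handler to run whenever event_name occurs."""
        self.handlers[event_name].append(handler)

    def emit(self, event_name, payload):
        """Run every handler registered for event_name."""
        for handler in self.handlers[event_name]:
            handler(payload)


log = []
dispatcher = Dispatcher()
dispatcher.on("key", lambda k: log.append(f"key {k}"))
dispatcher.on("tick", lambda t: log.append(f"tick {t}"))

# The events stand in for external input; their order is not fixed by the program.
for name, payload in [("tick", 1), ("key", "a"), ("tick", 2)]:
    dispatcher.emit(name, payload)

print(log)  # ['tick 1', 'key a', 'tick 2']
```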
[nb 2] In a processor where the incrementation precedes the fetch, the PC points to the current instruction being executed. In some processors, the PC points some distance beyond the current instruction; for instance, in the ARM7, the value of PC visible to the programmer points beyond the current instruction and beyond the delay slot.[7]
References
Hayes, John P. (1978). Computer Architecture and Organization. McGraw-Hill. p. 245. ISBN 0-07-027363-4.
Harry Katzan (1971), Computer Organization and the System/370, Van Nostrand Reinhold Company, New York, USA, LCCN 72-153191
Bates, Martin (2011). "Microcontroller Operation". PIC Microcontrollers. Elsevier. pp. 27–44. doi:10.1016/b978-0-08-096911-4.10002-3. ISBN 978-0-08-096911-4. Program Counter (PC) is a register that keeps track of the program sequence, by storing the address of the instruction currently being executed. It is automatically loaded with zero when the chip is powered up or reset. As each instruction is executed, PC is incremented (increased by one) to point to the next instruction.
Arnold, Alfred (2020) [1996, 1989]. "E. Predefined Symbols". Macro Assembler AS – User's Manual. V1.42. Translated by Arnold, Alfred; Hilse, Stefan; Kanthak, Stephan; Sellke, Oliver; De Tomasi, Vittorio. p. Table E.3: Predefined Symbols – Part 3. Archived from the original on 2020-02-28. Retrieved 2020-02-28. 3.2.12. WRAPMODE […] AS will assume that the processor's program counter does not have the full length of 16 bits given by the architecture, but instead a length that is exactly sufficient to address the internal ROM. For example, in case of the AT90S8515, this means 12 bits, corresponding to 4 Kwords or 8 Kbytes. This assumption allows relative branches from the ROM's beginning to the end and vice versa which would result in an out-of-branch error when using strict arithmetics. Here, they work because the carry bits resulting from the target address computation are discarded. […] In case of the abovementioned AT90S8515, this option is even necessary because it is the only way to perform a direct jump through the complete address space […]