Introduction
C is general-purpose programming language. It has been closely associated with the UNIX system where is was developed, since both the system and most of the programs that run on it are written in C. The language, however, is not tied to any one operating system or machine; and although it has been called a “system programming language” because it is useful for writing compilers and operating systems, it has been used equally well to write major programs in many different domains.
Many of the important ideas of C stem from the language BCPL, developed by Martin Richards. The influence of BCPL on C proceeded indirectly through the language B, which was written by Ken Thompson in 1970 for the first UNIX system on the DEC PDP-7.
BCPL and B are “typless” languages. By contrast, C provides a variety of data types. The fundamental types are characters, and integers and floating-point numbers of several sizes. In addition, there is a hierarchy of derived data types created with pointers, arrays, structures, and unions. Expressions are formed from operators and operands; any expression, including an assignment or a function call, can be a statement. Pointers provide for machine-independent address arithmetic.
C provides the fundamental control-flow constructions required for well-structured programs: statement grouping, decision making (if-else), selecting one of a set of possible cases (switch), looping with the termination test at the top (while, for) or at the bottom (do), and early loop exit (break).
Functions may return values of basic types, structures, unions, or pointers. Any function may be called recursively. Local variables are typically “automatic”, or created anew with each invocation. Function definitions may not be nested but variables may be declared in a block-structured fashion. The functions of a C program may exist in separate source file that are compiled separately. Variables may be internal to a function, external but known only within a single source file, or visible to the entire program.
A preprocessing step performs macro substitution on program text, inclusion of other source files, and conditional compilation.
C is a relatively “low level” language. This characterization is not pejorative; it simply means that C deals with the same sort of objects that most computers do, namely characters, numbers, and addresses. These may be combined and moved about with the arithmetic and logical operators implemented by real machines.
C provides no operations to deal directly with composite objects such as character strings, sets, lists, or arrays. There are no operations that manipulate an entire array of string, although structures may be copied as a unit. The language does not define any storage allocation facility other than static definition and the stack discipline provided by the local variable of functions; there is no heap or garbage collection. Finally, C itself provides no input/output facilities; there is no READ or WRITE statements, and no built-in file access methods. All of these higher-level mechanisms must be provided by explicitly-called functions. most C implementations have included a reasonable standard collection of such functions.
Similarly, C offers only straightforward, single-thread control flow: tests, loops, grouping, and subprograms, but not multiprogramming, parallel operations, synchronization, or coroutines.
Although the absence of some of these features may seem like a grave deficiency (“You mean I have to call a function to compare two character strings?”), keeping the language down to modest size has real benefits. Since C is relatively small, it can be described in a small space, and leaned quickly. A programmer can reasonably expect to know and understand and indeed regularly use the entire language.
For many years, the definition of C was the reference manual in the first edition of The C Programming Language. In 1983, the ANSI and ISO established a committee to provide a modern, comprehensive definition of C. The resulting definition, the ANSI standard, or “ANSI/ISO C” came out in 1988/1990. Most of the features of the standard are already supported by modern compilers.
The standard is based on the original reference manual. The language is relatively little changed; one the goals of the standard was to make sure that most existing programs would remain valid, or, failing that, that compilers could produce warnings of new behavior.
For most programmers, the most important change is a new syntax for declaring and defining functions. A function declaration can now include a description of the arguments of the function; the function syntax changes to match. This extra information makes it much easier for compilers to detect errors caused by mismatched arguments; in our experience, it is a very useful addition to the language.
There are other small-scale language changes. Structure assignment and enumerations, which had been widely available, are now officially part of the language. Floating-point computations may now be done in single precision. The properties of arithmetic, especially for unsigned types, are clarified. The preprocessor is more elaborate. Most of these changes will have only minor effects on most programs.
A second significant contribution of the standard is the definition of a library to accompany C. It specifies functions for accessing the operating system (for instance, to read and write files), formatted input and output, memory allocation, string manipulation, and the like. A collection of standard headers provides uniform access to declarations of functions and data types. Programs that use this library to interact with a host system are assured of compatible behavior. Most of the library is closely modeled on the “standard I/O library” of the UNIX system. This library was described in the first edition, and has been widely used on other systems as well. Again, most programmers will not see much change.
Because the data types ans control structures provided by C are supported directly by most computers, the run-time library required to implement self-contained programs is tiny. The standard library functions are only called explicitly, so they can be avoided is the are not needed. Most can be written in C, and except for the operating system details they conceal, are themselves portable.
