Different C Standards: The Story of C
This article covers the various standards of the C language, from K&R C through ANSI C, C99, C11 and Embedded C.
I have always wanted to write about the different standards of C but refrained from doing so for two reasons. The first is that though as an academician I respect and am devoted to C, I thought the industry hardly cared about it. The other more compelling reason is that not all the different standards of C are taken seriously. So, if nobody cares, readers may wonder why I am bothering to write this article.
Two incidents made me change my mind. To write an article about popular programming languages in OSFY, I did a detailed study of programming languages and that included C. I found out that C is still very popular in all the rankings, with supporters in both the academic world and the industry. The other reason is that in a technical interview I had a few months ago, I faced a question based on the C11 standard of C. Even though I absolutely fumbled with that question, I was quite happy and excited. Finally, the times have changed and people are now expected to know the features of C11 and not just ANSI C. So, I believe this is the right time to start preaching about the latest standards of C.
The ANSI C standard was adopted in 1989. In the last 28 years, C has progressed quite a bit and has had three more standards since ANSI C. But remember, ANSI C is not even the first C standard — K&R C holds that distinction. So, there were standards before and after ANSI C. Let’s continue with a discussion of all the five different standards of C — K&R C, ANSI C, C99, C11 and Embedded C. For the purposes of our discussion, the compiler used is the gcc C compiler from the GNU Compiler Collection (GCC).
If there are five different standards for C, which one is the default standard of gcc? The answer is: none of the above. The command info gcc will tell you the default standard of your version of gcc. In older releases it is the dialect that can also be selected with the option -std=gnu90, which has the whole of ANSI C with some additional GNU-specific features; from GCC 5 onwards, the default is -std=gnu11, which is C11 with GNU-specific features. But there’s no need to panic, because gcc gives you the option to specify a particular standard during compilation, except in the case of K&R C. I will mention these options while discussing each standard.
But if there are so many standards for C, who is responsible for creating a new standard? Can a small group of developers propose a new standard for C? The answer is an emphatic ‘No!’. The C standards committee responsible for making the decisions regarding C standardisation is called ISO/IEC JTC 1/SC 22/WG 14. It is a standardisation subcommittee of the Joint Technical Committee ISO/IEC JTC 1 of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC).
A lot has been written about the history of C and I am not going to repeat it. All I want to say is that C was developed by Dennis Ritchie between 1969 and 1973 at Bell Labs. It was widely accepted by almost all professional programmers immediately, and they started making their own additions to it. After a few years, there were a lot of C variants available for a programmer to choose from. My C was different from yours, and nobody could claim that theirs was the real one. So, there was a need for standardisation, and the masters themselves undertook the task.
The standard called K&R C was actually an informal specification based on the first edition of the extremely popular book, ‘The C Programming Language’, published in 1978 by Brian Kernighan and Dennis Ritchie; hence, the name K&R after Kernighan and Ritchie. But do remember that the second edition of the book that is available nowadays covers mostly ANSI C. So, if you want to learn the details of K&R C from the masters themselves, then you have to buy a used copy of the first edition of this book online.
K&R C not only acted as an informal standards specification for C but also added language features like the new data types long int and unsigned int and the compound assignment operator. A standardised I/O library was also proposed by K&R C. Since the most popular C standard followed by the academia in India is still ANSI C, I will mention a few differences between K&R C and ANSI C. “There are 32 keywords in C,” is one of the clichés I utter in many of my C programming classes. But I often forget to mention the fact that this was not always true. According to K&R C, there are only 28 keywords in C. One keyword was called entry, which was neither implemented at that point in time by any of the compilers nor accepted into the list of keywords in ANSI C.
Another major difference is regarding the function definition. A function definition in K&R C has the parameter types declared on separate lines. Consider the following lines of code showing a function definition in K&R C:
The function fun( ) accepts an integer and a floating-point variable, and returns the product. You can clearly see that the two parameters are declared on separate lines below the name of the function. This is not the style for function definition in ANSI C. The first line of the same function definition will be float fun( int a, float b ) in ANSI C. As mentioned earlier, gcc does not have an option to specify compilation in the K&R C standard. But programs written in K&R C will also be compiled without any errors, because gcc compiler is backward compatible with K&R C.
Even though K&R C was accepted by many programmers as the de facto standard of C, it was not the de jure standard, and nobody could have been coaxed into accepting it as the official standard of C. So, it was absolutely essential for some standards organisation to accept the challenge of coming up with an official standard for C. The American National Standards Institute (ANSI) addressed this issue in 1983 by forming a committee, and the final draft of the standards it formulated was released in 1989. This is the reason why ANSI C is also called C89.
ANSI C is closely tied to the POSIX standard, which builds on the C standard library. POSIX is the Portable Operating System Interface, a family of standards specified by the IEEE Computer Society for maintaining compatibility between operating systems. The same standard proposed by ANSI was adopted by ISO officially as ISO/IEC 9899:1990 in 1990. This is the reason why ANSI C is also called ISO C and C90. One standard and four different names: I believe this is the reason why this particular standard of C is still very popular. The five keywords const, enum, signed, void and volatile were added, and the keyword entry was dropped in ANSI C. The option -ansi of the gcc compiler will compile programs in the ANSI C standard. The options -std=c89 and -std=c90 will also compile programs in ANSI C. Since gcc is a backward compatible compiler, these options will result in successful compilation of programs with K&R C features. Almost all the popular C compilers support ANSI C features. These include compilers like gcc from GCC, Portable C Compiler (PCC), Clang from LLVM, etc. For examples of ANSI C code, open any textbook on C and you will find a lot of them.
C99 is the informal name given to the ISO/IEC 9899:1999 standards specification for C that was adopted in 1999. The C99 standard added five more keywords to ANSI C, and the total number of keywords became 37. The keywords added in C99 are _Bool, _Complex, _Imaginary, inline and restrict. The keyword _Bool is used to declare a new integer type capable of storing 0 and 1. The keywords _Complex and _Imaginary are used to declare complex and imaginary floating point type variables to handle complex numbers. The keyword inline is used to declare inline functions in C, similar to C++ inline functions. The keyword restrict is used to tell the compiler that for the lifetime of the pointer, only the pointer itself or a value directly derived from it will be used to access the object to which it points. New header files like <stdbool.h>, <complex.h>, <tgmath.h>, <inttypes.h>, etc, were also added in C99. A new integer data type called long long int with a minimum size of 8 bytes was added in C99.
In the gcc compiler, long long int usually takes 8 bytes. The C99 program named fact.c given below finds the factorial of 20. A program using a 4-byte integer type, the usual minimum size of long int, would have given the answer as -2102132736 due to overflow.
The program can be compiled by executing the command gcc -std=c99 fact.c or the command gcc fact.c on the terminal. The second command works because, by default, the gcc compiler supports long long int. The output of the C99 program is shown in Figure 1.
Features like variable length arrays, better support for IEEE floating point standard, support for C++ style one line comments (//), macros with variable number of arguments, etc, were also added in C99. The official documentation of gcc has this to say: ‘ISO C99. This standard is substantially completely supported’. The legal term ‘substantially complete’ is slightly confusing but I believe it means that the gcc compiler supports almost all the features proposed by the standards documentation of C99. As mentioned earlier, the option -std=c99 of gcc will compile programs in the C99 standard.
C11 is the current and latest standard of the C programming language and, as the name suggests, this standard was adopted in 2011. The formal document describing the C11 standard is called ISO/IEC 9899:2011. With C11, seven more keywords were added to the C programming language, taking the total number of keywords to 44. The seven keywords added to C99 are _Alignas, _Alignof, _Atomic, _Generic, _Noreturn, _Static_assert and _Thread_local. Consider the C11 program noreturn.c shown below, which uses the keyword _Noreturn.
Figure 2 shows the output of the program noreturn.c. There are a few warnings from the compiler as well (gcc warns, for instance, when a function declared _Noreturn is able to return). So why was the last printf( ) in main( ) not printed? This is due to the difference between a function returning void and a function declared with the _Noreturn function specifier. The keyword void only means that the function does not return any value to the calling function; when the called function finishes, execution still continues in the calling function. A function declared _Noreturn, on the other hand, promises the compiler that control will never come back to the calling function, for example because it ends the program by calling exit( ). If such a function does return, the behaviour is undefined. This is the reason why the statements after the function call of fun( ) didn’t get executed.
The function gets( ) caused a lot of mayhem and so was deprecated in C99, and completely removed in C11. Header files like <stdnoreturn.h> were also added in C11. Support for concurrent programming is another important feature added by the C11 standard. Anonymous structures and unions are also supported in C11. But unlike C99, where the implementation of variable length arrays was mandatory, in C11 this is an optional feature. So even C11-compatible compilers may not have this feature. The official documentation of gcc again says: ‘ISO C11, the 2011 revision of the ISO C standard. This standard is substantially completely supported’. So, let us safely assume that the gcc compiler supports most of the features proposed by the C11 standard documentation. The option -std=c11 of gcc will compile programs in the C11 standard. The C Standards Committee has no immediate plans to come up with a new standard for C. So, I believe we will be using C11 for some time.
The standard known as Embedded C is slightly different from all the others. C from K&R C to C11 depicts the changes of a programming language over time, based on user requirements. But the Embedded C standard was proposed to customise the C language in such a way that it can cater to the needs of embedded system programmers. While the other standards of C are improvements over previous standards, Embedded C is a standard that is being developed in parallel. A lot of non-standard features were used in C programs written for embedded systems. Finally, in 2008, the C Standards committee came up with a standardisation document that everyone has to adhere to. Embedded C mostly has the syntax and semantics of normal C with additional features like fixed-point arithmetic, named address spaces, and basic I/O hardware addressing. Two groups of fixed-point data types called the fract types and the accum types were added to the C language to support fixed-point arithmetic. The keywords _Fract and _Accum are used for this purpose. New header files like <iohw.h> have also been proposed by the Embedded C standards committee. The main advantage of the Embedded C standard is that it is simpler and easier to learn than traditional C.
If you want to learn more about any of these standards, just search the Internet, particularly the website http://www.open-std.org/. I will be very happy if this article motivates somebody to migrate from ANSI C to C11, which was released in 2011. Six years have already elapsed, so it is high time to adopt the C11 standard. As an aside, the latest standard of C++, informally known as C++17, will be finalised in 2017. So, we have the chance to be the early adopters of a new standard of a very popular programming language, rather than be one among the late majority, as in the past.
Before winding up, I want to mention the Turbo C compiler from Borland, which is outdated yet used by a lot of people. Which is the standard supported by the Turbo C compiler? Well, just like for the default standard supported by gcc, the answer is also ‘None’. The default standard is the whole of ANSI C and some Turbo C specific extensions. Remember, these extensions are not available with gcc and this is the reason why header files like <conio.h> are not available in Linux. Finally, it took me some time to come up with this pun, “Read this article carefully and you will see that you are not all at sea about C!”
Figure 1: Factorial of 20 in C99
Figure 2: _Noreturn keyword in C11