A C of woe

Submitted by peter on Tue, 05/29/2018 - 15:55

The C programming language is difficult to learn and has only two redeeming factors. Here are some of the difficulties to consider before starting a project using C. You might find this a better explanation than some of the other documentation out there.

Speed

C is recommended for execution speed. Hand-tuned Assembler can be faster. Assembler has a different version for each processor architecture, helping you perform everything with the minimum of overhead. C has less intelligence about any one processor, making C simple enough to work on multiple processor architectures. C can then be described as producing the "fastest processing for a language that compiles on most processor architectures".

The cost is very slow software development. Most people will take shortcuts by using a mass of prewritten code libraries which may add excessive overheads. Some other languages can produce examples where they run faster than a C program based on equivalent development time. Some other languages are so fast to write that you can focus heaps of time on optimising the application and the data storage.

There are excellent applications written in higher level languages by people with only a couple of years of experience. The equivalent quality in C would require three to five years of experience. As an example, compare C, Java, and a high level language.

In one project there were two developers using the right language for the job and six developers using Java. Java was used because the development manager, a person experienced with C, decided C development was too slow. The two developers made significant improvements two or three times a week. The six Java developers made a significant change only once per week. The whole development process would have been more than twice as fast if the Java code had been replaced with something modern and more appropriate for the task.

In that example project, the higher level language of choice could run C code for extremely resource hungry processing. You can easily mix C with some higher level languages, retaining C as an important and viable development option.

Popularity

The main reason for using C is popularity. Everything is available in C from code examples to libraries of code with extensive testing. If you attempted to get extra speed by using Assembler, you would end up frequently calling C code for compatibility.

Assembler is more likely to be used for small bits of code that are executed repeatedly. Some C compilers effectively compile to an Assembler level of code then optimise the code based on Assembler level knowledge of the processor and other hardware.

You see optimisation in code that can use graphics processors. There will be one version of the executable code for a graphics processor and another for a computer without a graphics processor. When Intel introduced their first 64 bit processor, AMD introduced a better 64 bit approach. The AMD version won and we now have code compiled for 32 bit in one package and 64 bit in a matching package. The right version is included during compilation or supplied by the operating system.

C code can be made independent of this type of special processor features or C can make use of the special features. At the application level, it is rare to choose something hardware specific. You call a library to do something then the code in the library chooses to use generic code or hardware specific code. Programming in C is too hard for anyone to worry about the fine details of one computer compared to another.

What should you worry about?

Pointers, data storage, presentation, and security are your priority.

Pointers

Pointers are the first roadblock to learning C. Pointers are indicated by an asterisk, *.

Look at the following C code. You can copy the code into a file named pointer.c then compile and test the code.

#include <stdio.h>
int main(int argc, char** argv)
    {
    /* Define and display integers with values. */
    int integer1 = 1;
    printf ("Value of integer1: %d\n", integer1);
    printf ("Address of integer1: %p\n", &integer1);
    return 0;
    }

#include <stdio.h> includes some standard input and output code including printf. This gives us a way to display information about the program when we run the program.

int main(int argc, char** argv) {} is a standard way to define the main part of the program, the only part we need at this stage. return 0; is a standard way to end the execution of the program and is not important for this example.

/* */ creates a comment. Add lots of comments to your code. There are applications that can extract your comments to create documentation.

int integer1 = 1; creates a variable of type integer with a value of 1. int is the type and could be char for a character or one of a wide range of different types of numbers. integer1 is the name of the variable and could be x or any other mixture of letters, digits, and underscores so long as the first character is not a digit.

printf ("Value of integer1: %d\n", integer1); displays the value of the variable in the command line box when you run the compiled code. %d displays integers. \n starts a new line to separate out your messages.

printf ("Address of integer1: %p\n", &integer1); displays the address of the variable in memory. The address may change every time you execute the program. %p displays addresses or pointers. The & at the start of integer1 is the address-of operator. &integer1 produces a pointer containing the address of integer1.

Compile the code using a command like the following. The exact command may be different in your operating system. Your editor or development environment may give you a button to run the compilation.

gcc -o pointer pointer.c

Execute the program with a command similar to the following command. Again this can vary from operating system to operating system.

./pointer

The output should be similar to the following result.

Value of integer1: 1
Address of integer1: 0x7fff824038a8

We now have one integer, the value of the integer, and the address of the integer. Add the following code to create and display a second integer.

    int integer2 = 2;
    printf ("Value of integer2: %d\n", integer2);
    printf ("Address of integer2: %p\n", &integer2);

Compile and run the code. The output should contain the following extra two lines.

Value of integer2: 2
Address of integer2: 0x7fff336cb55c

int integer1 = 1; defines an integer with the value of 1. This line of code is almost the same in every programming language. printf() has equivalents in almost every language. %d has equivalents in almost every language. The combination of %p and & has equivalents in other languages but varies widely across languages.

In C, you can define several sorts of variables containing strings, numbers, and other items. We use only integers in this example. There are lots of useful tutorials and examples for the other basic data types. Pointers are one of the more complicated data types and are rarely explained in a way that can be understood.

Add the following code to pointer.c after the definition of the integers and before the return statement.

    int *pointer;
    pointer = &integer1;
    printf ("Address in pointer: %p\n", pointer);
    printf ("Value pointed to by pointer: %d\n", *pointer);
    pointer = &integer2;
    printf ("Address in pointer: %p\n", pointer);
    printf ("Value pointed to by pointer: %d\n", *pointer);

int *pointer defines the dreaded C pointer. The type of the pointer is "pointer to integer". The pointer does not contain an integer. Instead the pointer stores the address of a variable containing an integer. We have to store an address in the pointer with the line pointer = &integer1;.

We use printf to display the value in the pointer, which is an address. For addresses in pointers, printf requires %p instead of %d.

We can use printf to display the value pointed at by the pointer. Note that the printf for the value uses *pointer instead of just pointer. *pointer gives printf the value from the variable the pointer points to, not the value in the pointer.

The following code is the complete pointer test.

/*
 * Copyright (C) 2018 peter
 *
 */

/* 
 * File:   pointer.c
 * Author: peter
 */

#include <stdio.h>

int main(int argc, char** argv)
    {
    /* Define and display integers with values. */
    int integer1 = 1;
    printf ("Value of integer1: %d\n", integer1);
    printf ("Address of integer1: %p\n", &integer1);
    int integer2 = 2;
    printf ("Value of integer2: %d\n", integer2);
    printf ("Address of integer2: %p\n", &integer2);
    int *pointer;
    pointer = &integer1;
    printf ("Address in pointer: %p\n", pointer);
    printf ("Value pointed to by pointer: %d\n", *pointer);
    pointer = &integer2;
    printf ("Address in pointer: %p\n", pointer);
    printf ("Value pointed to by pointer: %d\n", *pointer);
    return 0;
    }

The following output is the complete pointer compile and execution in a Linux command box. Note the addresses are different in this run of the program.

$ gcc -o pointer pointer.c
$ ./pointer
Value of integer1: 1
Address of integer1: 0x7ffcee32f128
Value of integer2: 2
Address of integer2: 0x7ffcee32f12c
Address in pointer: 0x7ffcee32f128
Value pointed to by pointer: 1
Address in pointer: 0x7ffcee32f12c
Value pointed to by pointer: 2

Data storage

Data is stored in memory or on disk. Disk is very slow compared to memory. When you use disk, the disk access will be the slowest part of your code and the first part to work on for speed. Aim to reduce the number of accesses then the amount of data accessed.

Memory access can also be slow. When you have large amounts of data in memory, you end up with lists of pointers and other slow structures. A redesign of the data might reduce the processing required to access the data.

Presentation

Presenting data and options like input forms can be difficult plus the code can be resource hungry. You usually include a presentation layer library. The library depends on your operating system. You want the minimum overhead to do what is needed.

Do you have a touch screen? The presentation layer interface and protocols can vary between screen/mouse/keyboard combinations and touch screens. A graphics pad is another option. You might have to switch to a completely different user interface layer or include several.

The presentation layer interfaces are usually difficult to the point where you might choose the one option for all your applications instead of trying to use the smallest library for each application. By focusing on one library, you have the chance to work out what is fast and what is slow. Experiment with every option you choose.

Whatever you choose, be prepared to change. Google might make your choice obsolete by releasing better software or new hardware. Have you looked at virtual reality, GPS, and everything else invented after the keyboard/mouse/screen combination?

Security

Data security is more important than performance. When you include a library of code, you want security that matches or exceeds the security required for your data.

Not all data need be secure. An analysis of the data will help you separate the data into secure and public. Things like the time and exchange rates are known to the public from other sources. You do not have to protect that type of data if all you do is display the data. You would have to protect the exchange rate when you use it as input to a calculation. Analyse first.

Encryption is a big resource user. You want reliable tested security. When two code options are both equally secure, you might then test the code for performance. Most other aspects of security require good design and good code but do not make a big impact on performance.

Other languages

There are several other languages that use code with similarities to C. Some do not require pointers or the other complexities of C. If you are not going to study C and extensively test the code, choose a simpler language.

Code performance is not all that important compared to database accesses. Unreliable code will cost you more than slow code. Several of the alternatives increase in speed with each new release. As an example, millions of Web developers chose PHP for speed of development, not speed of execution. Even with the slower old releases of PHP, database accesses were many times slower than the PHP code. Each release of PHP improved in execution speed. Today the execution speed difference between PHP 7 and C is smaller than the speed difference between two generations of processors. The speed increase from replacing magnetic disks with SSD is many times greater than any speed increase you might get from replacing PHP with C.

Conclusion

C is resource heavy during development and light during program execution. Everything you include can drag down performance. You need to understand the included code in detail or talk with people who have extensive experience with the included code.

Some tricky parts, like pointers, require far more study and testing, compared to other languages. If you do not understand pointers, use a higher level language. Invest in research to avoid the problems. You will then C the possibilities.
