C pointers explained

Pointers are a very powerful tool in C and similar programming languages. They are special variables that don’t directly contain a value; rather, they “point to” (contain the starting memory address of) the location of a value stored in memory. This “pointed-to” value can be any type — an integer, a floating-point value, a struct, or even another pointer. A pointer, in other words, doesn’t have the information you’re looking for — but it tells you where to go to get that information.

The use of pointers allows the construction of powerful data structures, including linked lists, queues and deques, and trees. The basic idea behind pointers is easy enough, once you understand the concept; it’s usually the syntax that programmers find confusing.

Here is yet another attempt (pointer syntax has been confusing people for decades) to end the confusion and explain, simply and clearly, how to get C pointers to do what they do, including why the special characters (asterisk, ampersand etc) are needed. (I’ll assume you are already familiar with the basics of C programming, including declaring and assigning standard variables etc.)

First, a quick summary for those already familiar with the concept of pointers, but wanting a quick, concise explanation of C pointer syntax. Here is the simplest way of thinking of it that I have come up with:

  • &  means “The memory address of the variable named…”
  • *  means “The contents of the memory location pointed to by…”

For example, “int *x” means “The contents of the memory location pointed to by x is an integer.” Likewise, “mypointer = &y” means “set mypointer equal to the memory address of the variable named y.” Note how the above intuitive definitions for & and * can be just dropped, verbatim, into place. Remove these symbols, and remove the definitions from the explanation, and the examples work in a non-pointer context.

If you’re not already very familiar with both C programming as well as the idea of pointer variables, though, the above explanation won’t be of much help. In that case, a more complete explanation of what is going on is needed. Read on.

Let’s start with a simple example: declaring myval to be an integer equal to three:

int myval = 3;

This is straightforward enough: myval now refers to the value stored in a specific (as-yet-unnamed) memory location. The value stored here is currently equal to three (and is implemented as a signed integer value, probably of 32 bits). When this variable was declared, storage was set aside to hold it (for a local variable like this one, typically on the program’s stack). We, as programmers, don’t (yet) know exactly where in memory this value is stored, however. For basic C programming, it doesn’t matter; when working with pointers, though, we might need to know.

Now, suppose we want to know where in memory myval is stored. (For now, trust me that this is a useful thing to know.) We create a “pointer” variable, which doesn’t itself hold data, but which holds the number of a memory location (ostensibly containing our data or something else of interest.)

int *myval_pointer;

This line creates a new “pointer variable” called myval_pointer. (It doesn’t have to have “pointer” in the name — that’s just to help us remember what it is, for now. I could have called it mypointer, testpointer, or Fred, for all the compiler cares.) This new variable is set up to hold a memory location. The “int” part tells the compiler that when we use this pointer to look up the contents of a memory location, we intend for the raw data there (bytes) to be interpreted as a signed integer.

Right now, though, this new pointer doesn’t yet point to anywhere useful. Depending on where it is declared, it will either be zero (a global or static pointer) or hold an unpredictable leftover value (a local pointer inside a function). (Remember, always initialize your variables yourself!) Let’s put this new pointer variable to use, and have it point to the location in memory where myval is stored. (We don’t know where this is, but the compiler does!)

myval_pointer = &myval;

This statement sets the value of myval_pointer to the address of myval (some large number, perhaps in the billions on a system with an address space size in the multi-gigabyte range.) The = is the usual assignment operator, and the & symbol stands for “the memory address of.” So now, myval_pointer does indeed point to the address of myval. (Remember, this is because we assigned it this way — not because of how it’s named.)

Now, let’s see what this new way of accessing memory can do.

*myval_pointer = *myval_pointer + 1;

This statement increments the value in the memory location pointed to by myval_pointer by one. (The * symbol can be thought of as meaning “the contents of the memory location pointed to by.”) Since this memory location is the one used by myval, what we’ve done is really just increment the value of myval directly in memory, without referring to it by name. If we were to print out the value of myval now, it would be 4. Compare the above line of code to the following:

myval_pointer = myval_pointer + 1;

You might think that this would increment the value of myval_pointer by one, making it point to a location one byte higher in memory. This isn’t the case, though: the compiler scales the increment by the size of the type the pointer was declared to point to, so the address goes up by sizeof(int), which is four bytes on most current systems. This statement doesn’t affect the value of myval in any way. What it does is make myval_pointer point to the next int-sized location above where myval is stored. (This can be very useful when going through an array of variables, for instance. With a lone variable like myval, the new address is fine to hold or compare, but must not be dereferenced.)

Here is a quick example program showing some of the ways that pointer-variable syntax works. Try making your own modifications to see what happens. I recommend compiling it with gcc for Linux, in a regular user (i.e. non-root) account.

//Basic C pointer operation examples
//M. Eric Carr / Paleotechnologist.net

#include <stdio.h>

int main(void){

//Declare a simple integer variable
int myval = 3;

//Declare a pointer-to-an-integer
int* myval_pointer;

//Assign the address of myval to myval_pointer
myval_pointer = &myval;

//Show the initial values of the variables.
//(%p is the format for pointers; it expects a void pointer, hence the cast.)
printf("myval is %d.\n",myval);
printf("myval_pointer is %p.\n\n",(void*)myval_pointer);

//This increments the value and does not move the pointer.
*myval_pointer = *myval_pointer + 1;
printf("myval is now %d.\n",myval);
printf("myval_pointer is %p.\n\n",(void*)myval_pointer);

//This moves the pointer up by one int (typically four bytes; 32 bits).
//The value in the original location does not change.
//(The compiler may warn that the dereferenced value is unused.)
*myval_pointer++;
printf("myval is now %d.\n",myval);
printf("myval_pointer is %p.\n\n",(void*)myval_pointer);
//Undo this change. (No * here: the pointer is now one past myval,
// and that location must not be dereferenced.)
myval_pointer--;

//This also moves the pointer up by one int.
*(myval_pointer)++;
printf("myval is now %d.\n",myval);
printf("myval_pointer is %p.\n\n",(void*)myval_pointer);
myval_pointer--;  //Undo this, too.

//This increments the pointed-to value.
//(It's unintuitive that postfix ++ would have higher priority
// than the pointer dereferencing operator *, but
// there you have it.)
(*myval_pointer)++;
printf("myval is now %d.\n",myval);
printf("myval_pointer is %p.\n\n",(void*)myval_pointer);

//What happens when we increment the pointer by one?
myval_pointer = myval_pointer + 1;
printf("myval is now %d.\n",myval);
printf("myval_pointer is %p.\n\n",(void*)myval_pointer);
myval_pointer--;  //Undo this, too.

return(0);
}

…So what are pointers good for? What can they do? That’s actually quite an in-depth topic, but one of the most useful features of pointers is that they can be used to create “linked lists” and related data structures (trees, queues, and many more).

Unlike an array, which has to be allocated as a single block of memory before it is used, a linked list lets elements be efficiently added, removed, and moved around. Instead of the “box of pigeonholes” metaphor of arrays, linked lists can be thought of as links in a chain. More links can be added, links can be removed from either end or anywhere in the middle, etc. Giving each element more than one pointer allows more complex structures, such as trees, to be built the same way.

The way a simple linked list works is by setting up a custom data type. Whereas a simple data type would contain a numerical value, a character, or perhaps a memory location (if it’s a pointer), this custom type would contain one or more pieces of data (the “payload”) and a pointer to the same custom data type.

This sounds unintuitive, until you realize that the addition of the pointer allows each data element to point to the next one in the chain. By maintaining a single pointer which points to the start of the list, a program can traverse the list, looking for a desired record, adding up totals, or whatever other operations are useful.

By convention, the pointer of the final element in the list is set to the value NULL, meaning that it doesn’t point to any memory location. If well written, code that examines the linked list by traversing it from start to finish is designed to check for this special NULL value, and stop processing when it reaches that point.


Posted in C, Coding, HOW-TO

Arduino loop timings

The advent of easy-to-use development ecosystems like Arduino has made a lot of embedded design tasks, such as obtaining GPS positions or controlling LCD displays or servo motors, significantly easier. Tasks which would take many hours to implement in assembly (a square root function for distance calculations, for example) are easily implemented in a single line of C or C++ code.

Often, though, there is a performance penalty associated with blindly using C or C++ code and making calls to library functions. If these functions are not used as intended, a significant amount of processor time can be wasted in unnecessary housekeeping.

While investigating the possibility of migrating our EET401 Microcontrollers course to the Arduino platform, one of the professors with whom I work ran some quick tests on an Arduino Uno, to test its clock-cycle efficiency. The results he got were startling, and warranted further investigation. Here is a recreation of these experiments, along with a brief explanation of what is going on.

The Arduino development environment comes with the basic example code to blink a LED:

void setup() {
pinMode(13, OUTPUT);
}

void loop() {
digitalWrite(13, HIGH);   // set the LED on
delay(1000);                     // wait for a second
digitalWrite(13, LOW);    // set the LED off
delay(1000);                     // wait for a second
}

This results in the LED (connected to Pin 13 on the Arduino Uno) blinking on and off at a rate of right about half a Hertz (one second on, one second off). The code is straightforward and easy to understand — and for an application like this, performance isn’t an issue.

Here’s the same sketch, with the delays removed. Intuitively, this should turn the pins on and off as fast as possible, since the loop appears to be doing nothing else.

void setup() {
pinMode(13, OUTPUT);
}

void loop() {
digitalWrite(13, HIGH);   // set the LED on
digitalWrite(13, LOW);    // set the LED off
}

Intuition can sometimes be deceiving, though. This code results in a square wave of only about 121kHz. Since the AVR microcontroller on the Arduino runs at a speed of 16MHz, this represents about 132 system clocks per cycle, just to turn one bit on and off. What’s going on?

[Figure: Only 121kHz, running on a 16MHz system clock?]

As it turns out, the calls to digitalWrite() are responsible for much of the delay. This routine is actually fairly efficient at what it is intended to do, but is far too general to be good at high-speed operations like this. It accepts a variable input, chooses which pin to change, then looks up the correct memory address and makes the change. All of this is accomplished in about twenty or so clock cycles, which isn’t bad when you think about it.

Such fancy options aren’t necessary when going for pure speed, though, so in this case there are better options. Re-writing the program to replace the calls to digitalWrite() with bitwise Boolean operations that write directly to the output port improves the frequency to 1.14 MHz. This is nearly a 10x improvement, but the short 14% duty cycle implies that there is still quite a bit of optimization that could be done in the loop itself.

[Figure: After replacing digitalWrite() with direct writes: 1.139MHz.]

Using a while(1) loop to surround the port-on and port-off statements eliminates most of the remaining delay, improving the frequency to 2.66MHz, with a final duty cycle of ~33.5%. 2.66MHz represents 1/6 of the input clock frequency, so apparently each operation (bit-on, bit-off, and loop) takes two clock cycles. This is probably optimal, and is better than would be possible in PIC assembly at 16MHz (four clock cycles would be needed for each bit operation, and eight for the jump.)

[Figure: Adding an explicit loop gets us 1/3 duty cycle @ 2.66MHz.]

Here is the final code used to get the 2.66MHz signal shown above:

void setup() {
pinMode(13, OUTPUT);
}

void loop() {
while(1){
PORTB |= 0x20;   // set the LED on
PORTB &= ~0x20;    // set the LED off
}
}

In conclusion, compiled C code can indeed be as efficient as handwritten assembly code — but it’s important to know the overhead associated with calls to library functions. The Arduino environment was built for ease of use, not lightning speed. Considering everything that functions like digitalWrite() do, though — addressing pins based on a variable, setting PWM states correctly etc — the efficiency of these functions is actually pretty good. It’s a question of using the right tool for the job.

 

Posted in Arduino, C, Coding, Digital

Symantec “LU1803” fix

Since the computers in the PLC lab at work support a rather old set of Allen-Bradley PLCs via an old 1747-PIC interface, they are limited to running Windows XP Service Pack 2. This solution works well enough, but is definitely showing its age. Supporting legacy installations like this sometimes requires a bit of research when mysterious error messages show up.

Recently, while updating the Symantec anti-virus software, I encountered a “LU1803: LiveUpdate failed while getting your updates” error. Since a 2006 virus database is of very little use in late 2011 (and being limited to running XP/SP2 rules out running MS Security Essentials as I usually recommend), something had to be done.

As it turns out, there’s a fix — follow this link (which requires Internet Explorer; sorry) and run the auto-fix tools provided by Symantec.  Kod feci. Arne Saknussemm.

Share and enjoy!

Posted in System Administration

Cabbage Chunkin’, Skyrim style!

One of the best parts of TES V: Skyrim is how “open” the world is. There are the usual quests and adventures that you would expect, but there are many other ways to simply wander around and explore the world that Bethesda has created. In-game physics continues to improve, as well, allowing all kinds of interesting experiments and in-world “hobbies” for your characters to pursue.

As it is autumn in the northern hemisphere, I decided to see if a Skyrim version of the annual “Punkin’ Chunkin’” contest would be feasible. I was disappointed to find that apparently pumpkins aren’t present in Skyrim, even though they were in TES IV: Oblivion. (Anyone else remember making “pumpermelon” health potions?)

However, cabbages do exist in Skyrim, and such common foods are quite plentiful (especially if you channel your inner Sheogorath and make it rain cabbages. Or cheese.) Since cabbages, like nearly everything else in the game, are physics-enabled, they react to environmental forces, including gravity, momentum, collisions — and the occasional otherworldly FUS RO DAH.

So without further ado, here is my Altmer mage diligently laying the foundation for Skyrim’s own Punkin’ Cabbage Chunkin’ contest…

Posted in Games