Jul 8 · 4 min read

A Deep-Dive into Pointers: Everything You Should Know as a Programmer

Introduction

Picking up where we left off in my introduction to pointers, this guide will offer a more in-depth explanation of the most infamous topic in C.

To summarize, a pointer:

stores a memory address as its value
has a type that matches the data located at the address it holds (e.g. an int* points to an int)
allows variables and data structures to be passed by reference, not by value
cuts down on unnecessary copying
- e.g. instead of passing a copy of an array into a function and then copying back the modified array it returns, the function can modify the original array directly, which is more efficient

NULL Pointer

A NULL pointer points to absolutely nothing. If a pointer isn't assigned another value when it's created, it's important to set it to NULL in order to prevent memory vulnerabilities.

Setting it to NULL clarifies to the program that it is indeed not pointing to anything. Simply declaring a pointer without a value like so...

int* ptr;

... doesn't default ptr to NULL (a local variable declared this way holds whatever value happened to be in its memory), and could therefore lead to memory vulnerabilities.

NULL can also be useful when confirming if a pointer isn't pointing to anything by using the == operator:

int* ptr = NULL;
if (ptr == NULL)
{
	return 1;
}
return 0;

Extracting Memory Addresses From Variables

Another way to "fill" a pointer is to use the address-of operator (&) on another variable. For example:

int num = 9;
int* ptr = &num; // ptr now stores the address of the variable num

Similarly, with arrays:

double arr[4] = { 1.0, 2.4, 6.3, 3.0 };
double* ptr0 = &arr[0]; // stores the address of first element in arr
double* ptr = arr; // also stores the address of the first element in arr
double* ptr1 = &arr[1]; // stores the address of the second element in arr

This sort of explains why arrays behave as if they were passed by reference: when an array is passed to a function, its name is converted into a pointer to its first element.

That's why when an array is passed to a function, the function works directly on the values at that memory address, and thus any change it makes affects the original array.

Dereference Operator

So here's where it gets slightly more confusing. Previously, we used * to declare a pointer variable like so:

float x = 3.22;
float* ptr = &x;

In a different context, * is also called the dereference operator. All it really allows us to do is follow a pointer and access the data at the memory location it holds.

Let's revisit the above example. Let's say I want to change the value of x to 6.54. Of course, I could simply reassign x to that value, but I can also "dereference" ptr, going to the memory address of x and changing the value stored there like so:

float x = 3.22;
// creating a pointer variable
float* ptr = &x;
// dereferencing
*ptr = 6.54;

So what happens if we try to dereference a pointer with value NULL? On most systems, the program crashes with a segmentation fault, meaning we touched memory we shouldn't have. (Strictly speaking, C leaves the behavior undefined.)

Counterintuitively, this is actually good for us!

If we hadn't ensured that our uninitialized pointer was NULL, it would hold whatever bytes were already sitting in the memory set aside for the pointer variable itself. Treated as an address, that junk value could point anywhere, including at memory that stores sensitive information.

So dereferencing an uninitialized pointer might lead to accidental modifications of memory that should have been left untouched. Further, without a crash to stop your program, the bug may prove even harder to track down.

Declaring Multiple Pointers

You may recall that in C we can declare multiple variables at once using commas.

char x, y, z;

Well, intuitively, to declare multiple pointers you would simply replace char with char*, right?

Annoyingly enough, that would only declare x as a pointer, while y and z remain normal chars. To fix this, you have to place a * in front of each variable name:

char* x, *y, *z;

Also, where the * is placed is up to the programmer's discretion. For instance:

// both of these are valid declarations
int* x = NULL;
int *x = NULL;

Some programmers prefer to place emphasis on the fact that int* is the type, while others prefer uniformity when, for instance, declaring multiple pointers in a single line.

Strings and Pointers

In C, of course, the string data type doesn't exist. Instead, we use char*, and now we have a pretty decent idea of why. A string, after all, is just an array of chars that ends with a null character ('\0')!

Therefore, our char* variable is just a pointer to the zeroth element in the string.

Also, revisiting the chart in my intro to pointers, we can now confidently say that a string or char* typically uses either 4 or 8 bytes (depending on whether the system is 32-bit or 64-bit). In fact, on most systems every pointer, whether int* or float*, uses the same 4 or 8 bytes because they all hold the same kind of data: a memory address.

Pointer Arithmetic

One final concept. Remember how we established that arrays are special in that they store their elements consecutively in memory? In a char array, each element takes up exactly one byte, so neighboring chars sit exactly one byte apart. With this, we can perform arithmetic on strings, which are really just char arrays.

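#include <stdio.h>

int main(void)
{
	// string literals shouldn't be modified, so we point to them with const char*
	const char* string = "Bye";
	printf("%c\n", *string); // prints B
	printf("%c\n", *(string + 1)); // prints y
	printf("%c\n", *(string + 2)); // prints e
	return 0;
}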

Because we established that the name of an array is a pointer to its first element, we can dereference that variable to get the value within. Similarly, we can add one to the address to get the next char, and then dereference that, as well. (Adding one to a pointer moves it forward by one element, which for a char is one byte.)

We can even find the length of a string as follows:

Final Thoughts

Don't worry if you don't quite get pointers from the get-go, as they get easier with time. Just know that it's definitely worth mastering pointers for the freedom and efficiency they can bring to your program by accessing values directly from memory.

Thanks for reading!
