The This Pointer in C++

Still on my fundamentals trip, I’m hitting up the ‘this’ pointer. Every non-static method of a class that you create has a ‘this’ pointer invisibly passed to it by the compiler. Let’s look at a simple class to see what’s going on:

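When you write the above code the compiler does some fun things with it, such as invisibly adding four methods if you haven’t declared them yourself (since C++11 it can also add a move constructor and a move assignment operator, which we’ll ignore here):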

  • A default constructor (that takes no parameters) which is automatically executed when you instantiate an object of this type,
  • A destructor (again no parameters) which is automatically executed when an object of this type is deleted or goes out of scope,
  • A copy constructor (that takes another object of this type) which performs a member-by-member (shallow) copy from the source object to the (new) destination object, and
  • An assignment operator (that takes another object of this type) which again performs a member-by-member (shallow) copy from that object to the object you’re assigning to.

If we explicitly write these four methods into our class, we end up with our (exactly, exactly equivalent) code now being:

You can substitute either of these classes into a project (if you have both in there, just comment each out in turn!), compile it in Release mode, and you should end up with byte-wise identical executables. Not only are they functionally equivalent, they’re absolutely equivalent: as the compiler sees them, it’s the exact same code. Don’t take my word for it, try it out with your own compiler if you’d like!

The ‘this’ pointer’s already being used, but what exactly is it doing? Well, let’s drill down into the nuts and bolts of it and take a look…

What is this?

As I’ve kind of been forced to jump ahead of myself here (because the copy constructor and assignment operator we wrote already used the this pointer), let’s cut to the chase and ask the question: what is this?

It turns out that ‘this’ is a pointer to the current object, i.e. it holds the memory address of the object the method was called on. Every object must exist in memory, and for us to do anything useful with the object, we need to know where in memory it is. And that is where ‘this’ comes into things.

We wrote a simple setter and getter earlier, using code as follows:

However, when the compiler is making this into an executable, it does some further injecting so that our code can function correctly. In effect, it treats our code as if we had written:

The code “this->a” takes the this pointer (which, as we said, is the memory address of the object) and uses the arrow operator -> to reach the member a of the object it points to. The compiler knows the offset of every data member from the start of the object, so it adds the offset of a to the address held in this to get the memory address of that member. (Ordinary methods aren’t stored inside each object; the compiler keeps track of where those live separately.) Once we have the memory address of a specific member of a specific object, we can “set” or “get” it by simply writing or reading however many bytes the member occupies (an int is generally 4 bytes) at that memory address.

Memory layout

When we create a class, we can look at the memory size of each instance of the class with the sizeof operator, and at the offsets of its data members with the offsetof macro (from the cstddef header).

Let’s give this a go by recreating our Example class to give it a few extra data members, and then print out the size of an Example instance and the offset of each data member:

The output of which is:

So 1 + 12 + 8 = … um, 21 bytes, so why is the size of each instance 24 bytes? Again, more complexity: this time it’s byte alignment. At this stage (and in this article) we’re not too fussy about the hows and whys, so we’ll just say that the compiler inserts padding bytes so that each member sits at a memory address that suits its type. You can think of it as the compiler’s way of justifying the data if you like:

[Figure: ByteAlignment2 (byte alignment diagram)]

This is easy

As a final demonstration of the difference between this (the memory address of an object instance) and *this (the actual object at that memory address), we’ll do a run-through of how we can use the ‘this’ pointer in our own methods, if we so choose:

Wrapup

The very final ‘loophole’ that was bothering me was this: as we have a copy constructor, what happens when we hand a brand new object itself as the thing to copy from?

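That is, we instantiate an instance of a class using the copy constructor with a copy of itself, which at that point is still uninitialised! The answer, disappointingly, is not a lot. It compiles, it runs, and in practice nothing dramatic tends to happen. In our Example class, the members simply remain uninitialised. Strictly speaking, though, reading an uninitialised value like this is undefined behaviour, so “nothing bad happens” is an observation about typical compilers and not a guarantee. You can do the same thing with an int by writing int foo = foo; on a line of its own.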

With the int you get the same behaviour: try printing ‘foo’ and you’ll typically see whatever garbage was already in the memory at that location. If you had a serious class, and especially if you had one that allocated memory on the heap, then you’d put a self-assignment guard in there to gracefully deal with such shenanigans. We’ll look at that in the next post, which’ll be on creating templated resizable arrays, if you’re up for it ;-)

Cheers!
