Thursday, July 26, 2012

C++/CLI and Heap Compaction

I'm currently looking into C++/CLI, a language developed by Microsoft that allows managed code to be written with C++ syntax. Technically it is not a pure extension, since several new keywords have been added to the language and these do not follow the C++ specification of how new keywords should be formatted. There was an earlier attempt at creating a C++ extension, called Managed C++, that followed the guidelines (its keywords, such as __gc, used the double-underscore prefix); however, the syntax became complicated and ugly, and the extension was recreated as the programming language C++/CLI, which was designed to produce cleaner and prettier code.

The C++/CLI language can give good insight into what happens within the CLR (Common Language Runtime). It also allows smooth transitions from managed to unmanaged code (that is, code that does not have automatic memory handling), as well as the other way around. The reason is that it can handle both native C++ objects and types and managed .NET objects and types.

There is a lot to write about this subject, and hopefully I will produce more blog posts about it; in this post, however, I will just talk a bit about the difference in how managed and unmanaged objects are handled.

When an object is created in C++ using the new keyword, memory is allocated on the heap and the address of the object's location on the heap is returned. This address is the pointer that the coder must keep track of in order to use the object, and finally call delete on when the object is no longer needed. When delete is called on an object, its memory is freed and can be reused by other objects.
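As a minimal sketch of the native lifetime described above (the `Widget` type and the `useWidget` function are hypothetical names made up for illustration, and `nullptr` assumes a C++11 compiler):

```cpp
#include <cassert>

// Hypothetical type that counts live instances so the lifetime is observable.
struct Widget {
    static int liveCount;
    int value;
    explicit Widget(int v) : value(v) { ++liveCount; }
    ~Widget() { --liveCount; }
};
int Widget::liveCount = 0;

// Allocates a Widget on the native heap, uses it, and frees it again.
int useWidget(int v) {
    Widget* w = new Widget(v);   // new returns the address of the object
    assert(w != nullptr);        // plain new throws on failure, so this holds
    int result = w->value * 2;   // the pointer is our only way to reach it
    delete w;                    // runs the destructor and releases the memory
    w = nullptr;                 // the old address must not be used again
    return result;
}
```

After `useWidget` returns, `Widget::liveCount` is back at zero. Forgetting the `delete` would leave it at one, which is the kind of leak the coder is responsible for avoiding.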

Managed objects are always placed on the managed heap when they are created. When they are no longer used, they are collected by the automatic GC (Garbage Collector). A big difference between the managed and unmanaged heaps is that the managed heap is also compacted. This means that when the GC runs, the surviving objects on the heap are moved to new memory addresses so that they form a contiguous block of memory.

[Figure: Managed Heap Compaction]
Compaction reduces the risk of heap fragmentation, which can lead to out-of-memory errors even though there is unallocated memory left on the heap. The problem is that, since objects are moved around in memory, you can no longer use plain pointers to keep track of them. In C++/CLI (and all languages that use Microsoft's CLR) this is solved by tracking handles, or handles for short. In most of the .NET languages this is handled behind the scenes, but in C++/CLI you have to declare your handle explicitly using a caret, ^, which is similar to the asterisk, *, used for unmanaged C++ pointers. The downside is that when the heap is compacted, all tracking handles must be updated with the new memory addresses of the moved objects. This is one of the reasons why an unmanaged language such as C or C++ often has a slight performance advantage over a managed language. In many cases simplicity in the development process outweighs the need for high performance, but selecting the right tool for a job is not a judgment call that should be taken lightly.
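The idea of objects moving while handles keep working can be illustrated with a toy model in standard C++. This is only a sketch under loud assumptions: `ToyHeap` and all of its members are made-up names, the "heap" is a vector of int slots, and handles are indices into a lookup table. The real CLR does not use a table like this; its GC rewrites the references themselves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: NOT how the CLR is implemented.
class ToyHeap {
public:
    typedef std::size_t Handle;

    // Places a new object at the end of the heap and returns a handle to it.
    Handle allocate(int value) {
        Handle h = slotOf_.size();
        slotOf_.push_back(slots_.size());
        slots_.push_back(Slot{value, h, true});
        return h;
    }

    // Marks the object as garbage. Its slot stays occupied until compact(),
    // and the handle must not be used again afterwards.
    void release(Handle h) {
        assert(h < slotOf_.size());
        slots_[slotOf_[h]].live = false;
    }

    // Reads the object through its handle, wherever it currently lives.
    int get(Handle h) const {
        assert(h < slotOf_.size());
        return slots_[slotOf_[h]].value;
    }

    // Current slot index of the object: the toy equivalent of its address.
    std::size_t addressOf(Handle h) const { return slotOf_[h]; }

    // Number of slots currently occupied on the heap.
    std::size_t size() const { return slots_.size(); }

    // Slides live objects to the front and updates the handle of each one.
    void compact() {
        std::size_t next = 0;
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (!slots_[i].live) continue;       // garbage: leave it behind
            slots_[next] = slots_[i];            // move the object
            slotOf_[slots_[next].owner] = next;  // fix up its handle
            ++next;
        }
        slots_.resize(next);                     // drop the freed tail
    }

private:
    struct Slot { int value; Handle owner; bool live; };
    std::vector<Slot> slots_;          // the "heap"
    std::vector<std::size_t> slotOf_;  // handle -> current slot index
};
```

If three objects are allocated and the middle one is released, `compact()` moves the third object from slot 2 to slot 1. Code holding the handle still reads the same value, whereas a raw slot index saved before the compaction would now be wrong. The fix-up loop is also where the extra cost mentioned above shows up.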

Wednesday, July 11, 2012

Member Functions



In my daily work I usually use C#; however, there are some interactions with C++ in the legacy code. Recently a change woke a dormant bug, which took a while to track down due to some misdirection.

A simplified but similar error is displayed in this code.
 #include <iostream>  
 #include <list>  
 using namespace std;  
 class A {  
 private:  
      list<int> IntegerList;  
 public:  
      int getListCount() {  
           return IntegerList.size();  
      }  
 };  
 class B {  
 public:  
      A* AClass;  
 };  
 int main() {  
      B* BClass = new B();  
      if (BClass) {  
           cout << BClass->AClass->getListCount() << endl;  
      }  
      return 0;  
 }  

Member functions of classes are much the same as ordinary free functions; they just hide that they take a this pointer to the object they are called on as their first argument. This pointer can be null. Formally, calling a member function through a null or uninitialized pointer is undefined behavior, but in practice a non-virtual call usually does not crash at the call itself. The crash first occurs when you try to work with the non-existent memory that you think your class contains.
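The hidden this pointer can be made visible by writing the same logic as free functions. The names below (`Counter`, `Counter_get`, `Counter_typeName`) are made up for illustration, and the sketch deliberately uses free functions for the null case, since passing a null pointer to a free function that never dereferences it is well defined.

```cpp
#include <cassert>
#include <string>

struct Counter {
    int value;
    // Member function: the compiler passes the object's address as `this`.
    int get() const { return this->value; }
};

// Roughly the same function with the `this` pointer spelled out as `self`.
int Counter_get(const Counter* self) {
    assert(self != nullptr);  // crash early instead of at the member access
    return self->value;       // first access to the object's memory
}

// Never reads through `self`, so it works even when handed a null pointer.
std::string Counter_typeName(const Counter* self) {
    (void)self;               // intentionally unused
    return "Counter";
}
```

`Counter_typeName(nullptr)` returns normally because nothing touches the object, which mirrors why the member call in the listing above typically gets as far as `IntegerList.size()` before anything goes wrong.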

You always want to crash early when a problem occurs, since this makes the problem easier to locate. In the code above the crash does not occur until we try to get the size of the integer list. This problem could easily have been avoided with some defensive programming and RAII.

In the code below I have made sure that AClass gets assigned when BClass gets created, and even though I'm pretty certain AClass now exists, an assert will make sure we crash at a more logical position in the code.
 #include <iostream>  
 #include <list>  
 #include <cassert>  
 using namespace std;  
 class A {  
 private:  
      list<int> IntegerList;  
 public:  
      int getListCount() {  
           return IntegerList.size();  
      }  
 };  
 class B {  
 public:  
      B() : AClass(new A()) {}  
      virtual ~B() { delete AClass; }  
      A* AClass;  
 };  
 int main() {  
      B* BClass = new B();  
      if (BClass) {  
           assert(BClass->AClass != NULL);  
           cout << BClass->AClass->getListCount() << endl;  
      }  
      delete BClass;  // release B; its destructor deletes AClass  
      return 0;  
 }  
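
One way to take the RAII idea a step further, assuming a C++11 compiler is available, is to let `std::unique_ptr` own the `A` instance so that no hand-written destructor is needed. This is a sketch rather than the fix used in the legacy code; `Owner`, `getA` and `add` are made-up names, with `Owner` standing in for `B`.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>

class A {
    std::list<int> integerList;
public:
    void add(int v) { integerList.push_back(v); }
    std::size_t getListCount() const { return integerList.size(); }
};

// Hypothetical stand-in for B: the unique_ptr deletes the A automatically
// when the Owner is destroyed, and Owner cannot be copied by accident.
class Owner {
    std::unique_ptr<A> a;
public:
    Owner() : a(new A()) {}

    // Access goes through one checked function instead of a public pointer.
    A& getA() {
        assert(a != nullptr);  // crash early if the invariant is ever broken
        return *a;
    }
};
```

With this shape, `Owner owner; owner.getA().getListCount();` yields zero for a fresh object, and there is no `delete` left to forget in either the class or the calling code.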


Tuesday, July 10, 2012

Greetings


I started programming in my teens; my first experience was copying an AMOS program from a computer magazine article. Now, a couple of decades later, I work as a programmer for a Swedish telecommunications company.

I'm not a guru at any programming language or technique, and this blog is not meant to be a shining beacon on the sea of bad code that is out there. This is a place for me to put down thoughts about the things that I come across, forcing me to think about them more closely. However, if my way of explaining techniques, useful components, or similar is of benefit to someone else, I wouldn't consider that a bad thing.

I will also use this blog to vent some of my personal opinions, which is perhaps not ideal for a technical blog, but I would never have the discipline to continuously update two blogs. This means that future posts might contain rants on topics that have nothing to do with programming, so consider this a fair warning.