No Bullshit & Programming: August 2012

Monday, August 13, 2012

C++/CLI compile options

When compiling C++/CLI code there are several compiler modes that set the level of interaction between managed and unmanaged code. Depending on what the code should achieve, choosing the correct compiler option can be the difference between a finished component and the compiler throwing a tantrum.

This post deals with the options /clr, /clr:pure, and /clr:safe. There is also the option /clr:oldSyntax, which is used for compiling code that uses the Managed Extensions for C++ (often called Managed C++). Managed Extensions for C++ is a predecessor to C++/CLI and uses a completely different syntax.

clr:safe

This option requires that the component contains only verifiably type-safe code. This means that the code cannot use unsafe arrays (arrays without bounds checking) or unsafe pointers. Any calls to native code must therefore be marshaled, and the code cannot contain any instances of native types.

#pragma once
using namespace System;
namespace ClrSafe {
     public ref class ClrSafeClass
     {
     private:
          double offset;
     public:
          ClrSafeClass(double offsetParam) : offset(offsetParam) {}
          ~ClrSafeClass() {}
          double Translate(double position) { return position + offset; }
     };
}

The code above does not contain any references to native code and can hence be compiled with the /clr:safe option. Compared to C#, this is similar to compiling with the Any CPU flag, because the code does not reference any specific architecture.

clr:pure

Like /clr:safe, this option generates only IL (Intermediate Language) output. The difference is that native types and classes are allowed to exist within the component. The compiler transforms the calls to native functions into IL. There are some limitations, such as that native components cannot export their methods from the DLL. Neither __declspec(dllexport) nor .def files will export native functions, because all such functions are internally declared with __clrcall.

1:  #pragma once  
2:  #include <cmath>  
3:  #include <iostream>  
4:  #include <msclr\marshal_cppstd.h>  
5:  using namespace System;  
6:  using namespace msclr::interop;  
7:  namespace ClrPure {  
8:       public ref class PureClass  
9:       {  
10:       private:  
11:            double exponent;  
12:       public:  
13:            PureClass(double exponentParam) : exponent(exponentParam) {   
14:                 std::string mess = "Using exponential";   
15:                 Console::WriteLine(marshal_as<String^>(mess));   
16:            }  
17:            ~PureClass() {}  
18:            double Power(double base) { return pow(base, exponent); }  
19:       };  
20:  }

The above code can be compiled with /clr:pure. The Power method on row 18, which uses a native function from the cmath library, generates the following IL code:

1:  .method public hidebysig instance float64   
2:      Power(float64 base) cil managed  
3:  {  
4:   // Code size    15 (0xf)  
5:   .maxstack 2  
6:   .locals ([0] float64 V_0)  
7:   IL_0000: ldarg.1  
8:   IL_0001: ldarg.0  
9:   IL_0002: ldfld   float64 ClrPure.PureClass::exponent  
10:   IL_0007: call    float64 modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) pow(float64,  
11:                                                    float64)  
12:   IL_000c: stloc.0  
13:   IL_000d: ldloc.0  
14:   IL_000e: ret  
15:  } // end of method PureClass::Power

On row 10 there is a call to the native function, expressed in IL. Code compiled with /clr:pure or /clr:safe has managed entry points and is slightly more efficient, since both options produce a non-mixed assembly.

clr

The /clr option allows for complete mixing of unmanaged and managed types. This type of assembly has a mixture of IL and native code. The entry point of the DLL is native code that then loads the CLR when it is needed.

#pragma once
using namespace System;
namespace Clr {
     // Native class, exported so that unmanaged callers can use it directly.
     class __declspec(dllexport) NativeClass
     {
     private:
          double scaleFactor;
     public:
          NativeClass(double scaleFactor) : scaleFactor(scaleFactor) {}
          virtual ~NativeClass() {}
          double scale(double factor) { return factor*scaleFactor; }
     };
     // Managed wrapper that owns a NativeClass instance and forwards calls to it.
     public ref class ClrClass {
     private:
          NativeClass* nativeClass;
     public:
          ClrClass(double scaleFactor) { nativeClass = new NativeClass(scaleFactor); }
          ~ClrClass() { delete nativeClass; }
          double Scale(double factor) { return nativeClass->scale(factor); }
     };
}

The above code can be called from both managed and unmanaged components; the managed class is merely a wrapper for the native class.

This concludes the C++/CLI compiler options that can be used to create mixed-mode and managed DLLs. MSDN has an article covering these compiler options.

Friday, August 10, 2012

Should private data and methods be unit tested?

A central part of the OOP (Object Oriented Programming) model is encapsulation, which prevents external use of some of an object’s components, such as its methods and data. Some OOP languages do not have real encapsulation. Python is an example of this, but there is a loose agreement that internal names beginning with an underscore should not be accessed directly, and names beginning with two underscores are additionally name-mangled by the interpreter.

When unit testing such objects it can be quite tempting to check that the accessible methods modify the private data in the way the class is supposed to work. There are usually ways to achieve this in the programming language without completely breaking the encapsulation principle.

In C# it is possible to mark components as public, protected, private, and internal. Protected components are only available to the class itself and to classes that inherit from it, while private components are only available within the class in which they are declared. Components that are declared as internal are only available within the same assembly (software module); however, it is possible to give other assemblies access to them by adding an attribute to the AssemblyInfo.cs file.

For example, if we have a class named AClass that has internal members and we want to access them in the unit test project called AClassUnitTest, the following line should be added to the AssemblyInfo.cs file of the project which houses AClass:

[assembly: InternalsVisibleTo("AClassUnitTest")]

Declaring an internal constructor that allows mocks to be injected into the class under test is a good use of this functionality. The private and protected components can also be accessed through what are called Accessors; MSDN has an article on using them.

In C++ there is the friend keyword that can be used to access private components of another object.

1:  #include <iostream>  
2:  using namespace std;  
3:  // Forward declaration of AClass so that BClass is aware of its existence.  
4:  class AClass;   
5:  class BClass {  
6:  private:  
7:       int data;  
8:       void printData() { cout << data << endl; }  
9:  public:  
10:       BClass (int insertData) : data(insertData) {}  
11:       friend class AClass;  
12:  };  
13:  class AClass {  
14:  private:  
15:       BClass* dataClass;  
16:  public:  
17:       AClass (BClass *insertDataClass) : dataClass(insertDataClass) {}  
18:       virtual ~AClass() { delete dataClass; }  
19:       void fireDataClass() {   
20:            if (dataClass)  
21:                 dataClass->printData(); // Calling the private method on BClass  
22:       }  
23:  };  
24:  int main(int argc, char* argv[]) {  
25:       AClass friendClass(new BClass(10));  
26:       friendClass.fireDataClass();  
27:       return 0;  
28:  }

In the example above AClass can access both the private function and the private data of BClass, because BClass declares AClass as a friend on row 11. Simply by forward-declaring a unit test class and declaring it as a friend, it is possible to prepare a class so that its internal members are available for unit testing. A more in-depth description of the friend keyword and its usage can be found here.

However, testing private data with unit tests makes the black box, which objects usually are, transparent. Changing the internal functionality of the object then carries a high probability of breaking the unit tests even though the functional contract of the object has not been violated. There are good grounds for treating objects as black boxes even while testing them.

Unit tests should only verify the externally visible components of an object, so that the code of the object can be safely refactored without destroying its usability. A test should only need rewriting when the object has had a breaking change to its interface, and this should indeed be a time for reflection on what has been done.

Thursday, August 9, 2012

Unit Testing and Native Code

Old legacy code often comes with technical debt. Even if the code currently works correctly, feature add-ons and bug fixes can cause breaking changes. Rewriting large chunks of legacy code in C# or some other .NET language can be quite expensive. A good solution is to add unit tests, which reduce the risk of altering the code.

There are many good unit testing frameworks, and in this article we will be looking at using Microsoft's framework to test native code. Microsoft's unit testing framework is based on .NET, but by using C++/CLI it can be used to test native (unmanaged) code.

I have used Visual Studio and created an empty Visual C++ project. To the project I have added one class called AClass, which has two private members and a couple of functions declared in the header file.

AClass.h

#pragma once
#include <string>   // std::string is used by the members below
using namespace std;
class __declspec(dllexport) AClass
{
private:
     string name;
     double test;
public:
     AClass(void) : name(""), test(0.0) {}
     virtual ~AClass(void) {}
     bool setName(const string&);
     bool getName(string&) const;
     void setDouble(double);
     double getDouble(void);
};

The implementation of the class is pretty straightforward. Setting a new name is always successful; however, getting the name when no name has been set returns false.

AClass.cpp

#include "AClass.h"
bool AClass::setName(const string &name) {
     this->name = name;
     return true;
}
bool AClass::getName(string &name) const {
     if (this->name.empty())
          return false;
     name = this->name;
     return true;
}
void AClass::setDouble(double value) {
     this->test = value;
}
double AClass::getDouble(void) {
     return this->test;
}

To create a unit test for this DLL I created a new Visual C++ project in Visual Studio. In the Visual C++ menu the CLI sub-menu contains a template for creating unit tests. In the code below I have removed some of the auto-generated code, which was unnecessary for this example, to make the example more readable.

For the unit test class to find the AClass.h header file, so that we can create instances of the class, the project must be told the location of the header file. This is done by selecting Properties for the AClassUnitTest project. The path to the header file should be added to the Additional Include Directories property under the C/C++ node.

To use the routines in the DLL it must be referenced. There are two options for this:

If the unit test project is in the same solution as the DLL, open Properties from the project menu and expand the Common Properties node. By selecting References and clicking Add New Reference... it is possible to reference the DLL project.
Otherwise, open Properties from the project menu and expand the Linker node. Under the Input node, add the name of the .lib file, which was generated when the DLL was compiled, to the Additional Dependencies property. In this example the name is AClass.lib. The path to the LIB file should be added to the Additional Library Directories property under the General node, which is also in the Linker category. The path to the LIB file is the compile directory of the DLL, which most likely is not the same as the output directory where the final DLL is located.

A good description of creating and using DLLs can be found on MSDN.

For those who are used to creating unit tests in C#, all the same functions are available in C++/CLI. However, static methods of a class are referenced with C++ syntax, using "::" instead of ".". The asserts verify the functionality of the class.

AClassUnitTest.cpp

#include "stdafx.h"
#include "AClass.h"
using namespace System;
using namespace System::Text;
using namespace System::Collections::Generic;
using namespace Microsoft::VisualStudio::TestTools::UnitTesting;
namespace AClassUnitTest
{
     [TestClass]
     public ref class UnitTest1
     {
     public:
          // Setting a name should always succeed.
          [TestMethod]
          void TestMethod1()
          {
               AClass* target = new AClass();
               string theName = "The Class";
               Assert::IsTrue(target->setName(theName));
               delete target;
          }
          // Getting the name before one has been set should fail.
          [TestMethod]
          void TestMethod2()
          {
               AClass* target = new AClass();
               string response = "";
               Assert::IsFalse(target->getName(response));
               delete target;
          }
          // A stored double should be returned unchanged.
          [TestMethod]
          void TestMethod3()
          {
               AClass* target = new AClass();
               target->setDouble(10.0);
               // AreEqual takes the expected value first, then the actual value.
               Assert::AreEqual(10.0, target->getDouble());
               delete target;
          }
     };
}

A very similar example to the one I have described above can be found here. Something worth noting is the first unit test, TestMethod1. The setName method takes a reference to the native type string; in my first attempt this was not a reference, so the string was copied instead. However, when the copy of the string went out of scope in setName, an exception was thrown as it was being destroyed. I was not able to find the exact reason why this occurred, but it seemed to be due to calling delete on an already deleted object. This was only a problem when the DLL was used from a unit test; calling the DLL from a C++/CLI console application or an ordinary CLR console application did not cause the error.

This may be because the native string class is in fact a template class, which can cause problems when instances cross the border between native and managed code. This is usually not a big problem, since it is better to pass a string as a constant reference anyway because it can become quite large. However, if you are not in possession of the source code, this throws a serious wrench into the machinery.