20 February 2025

Porting C# Code to C++: Smart Pointers

When developing a code translator from C# to Java, there are no issues with deleting unused objects: Java provides a garbage collection mechanism that is sufficiently similar to the one in C#, and the translated code using classes simply compiles and works. C++ is a different story. Clearly, mapping references to raw pointers will not yield the desired results, as such translated code will not delete anything. Meanwhile, C# developers, accustomed to working in a GC environment, will continue writing code that creates many temporary objects.

To ensure timely deletion of objects in the converted code, we had to choose among three options:

  1. Use reference counting for objects—for example, via smart pointers;
  2. Use a garbage collector implementation for C++—for example, Boehm GC;
  3. Use static analysis to determine the points where objects should be deleted.

The third option was immediately dismissed: the complexity of algorithms in the ported libraries proved prohibitive. Additionally, static analysis would need to extend to client code using these converted libraries.

The second option also appeared impractical: since we were porting libraries rather than applications, imposing a garbage collector would introduce constraints on client code using these libraries. Experiments in this direction were deemed unsuccessful.

Thus, we arrived at the last remaining option—using smart pointers with reference counting, which is fairly typical for C++. This, in turn, meant that to solve the problem of circular references, we would need to use weak references in addition to strong ones.
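The circular-reference problem can be reproduced with the standard library alone. The sketch below is only an illustration and is not the translator's output: it uses std::shared_ptr and std::weak_ptr as stand-ins, and the type names (StrongDocument, WeakNode, and so on) and the live_objects counter are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Counts live objects so the effect of a reference cycle can be observed.
static int live_objects = 0;

struct StrongNode;
struct StrongDocument {
    std::shared_ptr<StrongNode> node;
    StrongDocument() { ++live_objects; }
    ~StrongDocument() { --live_objects; }
};
struct StrongNode {
    std::shared_ptr<StrongDocument> document;  // strong back-reference: closes a cycle
    StrongNode() { ++live_objects; }
    ~StrongNode() { --live_objects; }
};

struct WeakNode;
struct WeakDocument {
    std::shared_ptr<WeakNode> node;
    WeakDocument() { ++live_objects; }
    ~WeakDocument() { --live_objects; }
};
struct WeakNode {
    std::weak_ptr<WeakDocument> document;      // weak back-reference: no cycle
    WeakNode() { ++live_objects; }
    ~WeakNode() { --live_objects; }
};

// Returns how many objects are still alive once every external pointer is gone.
int alive_after_strong_cycle() {
    live_objects = 0;
    StrongNode* raw_node = nullptr;
    {
        auto doc = std::make_shared<StrongDocument>();
        doc->node = std::make_shared<StrongNode>();
        doc->node->document = doc;
        raw_node = doc->node.get();
    }
    int alive = live_objects;  // the two objects keep each other alive
    // Break the cycle by hand so that the demo itself does not leak.
    std::shared_ptr<StrongDocument> rescued = std::move(raw_node->document);
    return alive;
}

int alive_after_weak_back_reference() {
    live_objects = 0;
    {
        auto doc = std::make_shared<WeakDocument>();
        doc->node = std::make_shared<WeakNode>();
        doc->node->document = doc;
    }
    return live_objects;       // everything was destroyed with the last strong owner
}
```

With two strong references, both objects outlive their last external owner. Making the back-reference weak lets the pair be destroyed as soon as the document is released.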

Types of Smart Pointers

There are several well-known types of smart pointers:

  • shared_ptr might seem like the most obvious choice, but it has a significant drawback: it keeps the reference counter in a separate control block on the heap, even when enable_shared_from_this is used. Unless the object is created through make_shared, which can combine the two allocations, allocating and deallocating memory for the reference counter is a relatively expensive operation.
  • intrusive_ptr is better in this regard, as having an unused 4/8-byte field within the structure is a lesser evil compared to the overhead of an additional allocation for each temporary object.
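To make the difference concrete, here is a minimal sketch of an intrusive pointer, assuming a single-threaded program. RefCounted, IntrusivePtr and Widget are hypothetical names chosen for this illustration. They are neither the Boost classes nor the ones used by the translator.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical base class: the strong counter is a field of the object itself,
// so taking a reference never allocates. Not thread-safe.
class RefCounted {
public:
    RefCounted() = default;
    RefCounted(const RefCounted&) = delete;
    RefCounted& operator=(const RefCounted&) = delete;
    virtual ~RefCounted() = default;

    void AddRef() { ++count_; }
    void Release() { if (--count_ == 0) delete this; }
    std::size_t UseCount() const { return count_; }

private:
    std::size_t count_ = 0;
};

template <typename T>
class IntrusivePtr {
public:
    IntrusivePtr() = default;
    // Implicit on purpose: any raw pointer, including `this`, can rejoin the
    // ownership group because the counter travels with the object.
    IntrusivePtr(T* p) : p_(p) { if (p_) p_->AddRef(); }
    IntrusivePtr(const IntrusivePtr& o) : p_(o.p_) { if (p_) p_->AddRef(); }
    IntrusivePtr(IntrusivePtr&& o) noexcept : p_(o.p_) { o.p_ = nullptr; }
    IntrusivePtr& operator=(IntrusivePtr o) noexcept { std::swap(p_, o.p_); return *this; }
    ~IntrusivePtr() { if (p_) p_->Release(); }

    T* get() const { return p_; }
    T* operator->() const { return p_; }

private:
    T* p_ = nullptr;  // the only data member: no control block pointer
};

// Demo type that tracks how many instances are alive.
struct Widget : RefCounted {
    static int alive;
    Widget() { ++alive; }
    ~Widget() override { --alive; }
};
int Widget::alive = 0;
```

Because the counter is part of the object, a pointer rebuilt from a raw address shares it with all existing owners, and the smart pointer itself stays the size of a raw pointer.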

Now, consider the following C# code:

class Document
{
    private Node node;
    public Document()
    {
        node = new Node(this);
    }
    public void Prepare(Node n) { ... }
}

class Node
{
    private Document document;
    public Node(Document d)
    {
        document = d;
        d.Prepare(this);
    }
}

When using intrusive_ptr, this code would be translated into something like the following:

class Document : public virtual System::Object
{
    intrusive_ptr<Node> node;
public:
    Document()
    {
        node = make_shared_intrusive<Node>(this);
    }
    void Prepare(intrusive_ptr<Node> n) { ... }
};

class Node : public virtual System::Object
{
    intrusive_ptr<Document> document;
public:
    Node(intrusive_ptr<Document> d)
    {
        document = d;
        d->Prepare(this);
    }
};

Here, three issues become immediately apparent:

  1. A mechanism is needed to break the circular reference, in this case by making Node::document a weak reference.
  2. There must be a way to convert the this pointer into an intrusive_ptr (analogous to shared_from_this). If we instead start changing method signatures (e.g., making Document::Prepare accept Node* instead of intrusive_ptr<Node>), problems will arise when the same methods are called with already constructed objects or when object lifetimes have to be managed.
  3. Converting the this pointer into an intrusive_ptr during object construction and then decrementing the reference count to zero (as happens, for example, in the Node constructor on return from Document::Prepare) must not immediately delete the partially constructed object, which has no external references yet.

The first issue was addressed manually, as even a human often struggles to determine which of several references should be weak. In some cases, there is no clear answer, requiring changes to the C# code.
For example, in one project, there was a pair of classes: “print action” and “print action parameters.” The constructor of each created the paired object and established bidirectional references. Clearly, turning one of these references into a weak one would break the usage scenario. Ultimately, we decided to use the [CppWeakPtr] attribute, instructing the translator that the corresponding field should contain a weak reference instead of a strong one.

The second problem is easily solved if intrusive_ptr allows conversion from a raw pointer, since the this pointer is exactly that. The Boost implementation (boost::intrusive_ptr) provides this capability.

Finally, the third problem was resolved by introducing a local RAII guard variable in the constructor. This guard increments the reference count of the current object upon creation and decrements it upon destruction. Importantly, decrementing the reference count to zero within the guard does not delete the protected object.
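The guard idea can be sketched as follows. This is a simplified, single-threaded illustration under stated assumptions, not the actual System::Details::ThisProtector: Object, Ptr, ThisGuard and ReleaseWithoutDelete are hypothetical names, and a raw Document* stands in for the weak back-reference.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for System::Object with an in-object strong counter.
class Object {
public:
    static int alive;                          // live-object count, for the demo only
    Object() { ++alive; }
    Object(const Object&) = delete;
    Object& operator=(const Object&) = delete;
    virtual ~Object() { --alive; }

    void AddRef() { ++refs_; }
    void Release() { if (--refs_ == 0) delete this; }
    void ReleaseWithoutDelete() { --refs_; }   // used only by the guard
    int RefCount() const { return refs_; }

private:
    int refs_ = 0;
};
int Object::alive = 0;

// Minimal intrusive pointer, implicitly constructible from a raw pointer.
template <typename T>
class Ptr {
public:
    Ptr() = default;
    Ptr(T* p) : p_(p) { if (p_) p_->AddRef(); }
    Ptr(const Ptr& o) : p_(o.p_) { if (p_) p_->AddRef(); }
    Ptr(Ptr&& o) noexcept : p_(o.p_) { o.p_ = nullptr; }
    Ptr& operator=(Ptr o) noexcept { std::swap(p_, o.p_); return *this; }
    ~Ptr() { if (p_) p_->Release(); }
    T* get() const { return p_; }
    T* operator->() const { return p_; }
private:
    T* p_ = nullptr;
};

// RAII guard: pins the object for the duration of its constructor.
class ThisGuard {
public:
    explicit ThisGuard(Object* self) : self_(self) { self_->AddRef(); }
    ~ThisGuard() { self_->ReleaseWithoutDelete(); }  // may reach zero, never deletes
    ThisGuard(const ThisGuard&) = delete;
    ThisGuard& operator=(const ThisGuard&) = delete;
private:
    Object* self_;
};

class Document;

class Node : public Object {
public:
    explicit Node(Ptr<Document> d);   // defined below, once Document is complete
    Document* document;               // raw pointer standing in for a weak reference
};

class Document : public Object {
public:
    Document();
    void Prepare(Ptr<Node> n) { prepared = (n.get() != nullptr); }
    Ptr<Node> node;
    bool prepared = false;
};

Node::Node(Ptr<Document> d) : document(d.get()) {
    ThisGuard guard(this);
    // The temporary Ptr<Node> built from `this` dies after the call. Without
    // the guard its release would drop the count to zero and delete *this.
    d->Prepare(this);
}

Document::Document() {
    ThisGuard guard(this);
    node = Ptr<Node>(new Node(this)); // `this` becomes a temporary Ptr<Document>
}
```

After construction the count is back at zero, and the first external pointer (for example, `Ptr<Document> doc(new Document);`) takes ownership as usual.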

With these changes, the code before and after translation looks roughly like this:

class Document
{
    private Node node;
    public Document()
    {
        node = new Node(this);
    }
    public void Prepare(Node n) { ... }
}

class Node
{
    [CppWeakPtr] private Document document;
    public Node(Document d)
    {
        document = d;
        d.Prepare(this);
    }
}

class Document : public virtual System::Object
{
    intrusive_ptr<Node> node;
public:
    Document()
    {
        System::Details::ThisProtector guard(this);
        node = make_shared_intrusive<Node>(this);
    }
    void Prepare(intrusive_ptr<Node> n) { ... }
};

class Node : public virtual System::Object
{
    weak_intrusive_ptr<Document> document;
public:
    Node(intrusive_ptr<Document> d)
    {
        System::Details::ThisProtector guard(this);
        document = d;
        d->Prepare(this);
    }
};

Thus, any implementation of intrusive_ptr that meets these requirements will suffice, provided it is complemented by a paired weak_intrusive_ptr class. The latter must rely on a reference counter located on the heap outside the object. Since creating weak references is a relatively rare operation compared to creating temporary objects, splitting the reference counter into a strong one (inside the object) and a weak one (outside the object) provided a performance boost in real-world code.
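One way such a split could look is sketched below. This is a guess at the general shape, not the library's implementation: WeakBlock, GetWeakBlock and HasWeakBlock are hypothetical names, the code is single-threaded, and it assumes non-virtual inheritance so that a static_cast from the base pointer is valid.

```cpp
#include <cassert>
#include <utility>

class Object;

// External block, allocated on the heap only when a weak reference is first taken.
struct WeakBlock {
    Object* target;   // cleared when the object dies
    int weak_refs;    // number of weak pointers sharing this block
};

class Object {
public:
    Object() = default;
    Object(const Object&) = delete;
    Object& operator=(const Object&) = delete;
    virtual ~Object() {
        if (block_ == nullptr) return;
        block_->target = nullptr;                   // weak pointers now report "expired"
        if (block_->weak_refs == 0) delete block_;  // nobody else is left to free it
    }

    void AddRef() { ++strong_; }                    // strong counter lives in the object
    void Release() { if (--strong_ == 0) delete this; }

    WeakBlock* GetWeakBlock() {                     // lazy: most objects never pay for this
        if (block_ == nullptr) block_ = new WeakBlock{this, 0};
        return block_;
    }
    bool HasWeakBlock() const { return block_ != nullptr; }

private:
    int strong_ = 0;
    WeakBlock* block_ = nullptr;
};

template <typename T>
class WeakPtr {
public:
    WeakPtr() = default;
    WeakPtr(T* p) : b_(p ? p->GetWeakBlock() : nullptr) { if (b_) ++b_->weak_refs; }
    WeakPtr(const WeakPtr& o) : b_(o.b_) { if (b_) ++b_->weak_refs; }
    WeakPtr& operator=(WeakPtr o) noexcept { std::swap(b_, o.b_); return *this; }
    ~WeakPtr() {
        // The last weak pointer frees the block, but only if the object is already gone.
        if (b_ && --b_->weak_refs == 0 && b_->target == nullptr) delete b_;
    }

    bool expired() const { return b_ == nullptr || b_->target == nullptr; }
    T* get() const { return expired() ? nullptr : static_cast<T*>(b_->target); }

private:
    WeakBlock* b_ = nullptr;
};

struct Document : Object {};
```

Objects that are never weakly referenced carry only the in-object counter and a null pointer. The extra allocation happens once, on the first weak reference.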

Templates

The situation becomes significantly more complicated because we need to translate code for generic classes and methods, where type parameters can be either value types or reference types. For example, consider the following C# code:

class MyContainer<T>
{
    public T field;
    public void Set(T val)
    {
        field = val;
    }
}
class MyClass {}
struct MyStruct {}

var a = new MyContainer<MyClass>();
var b = new MyContainer<MyStruct>();

A straightforward porting approach yields the following result:

template <typename T> class MyContainer : public virtual System::Object
{
public:
    T field;
    void Set(T val)
    {
        field = val;
    }
};
class MyClass : public virtual System::Object {};
class MyStruct : public System::Object {};

auto a = make_shared_intrusive<MyContainer<MyClass>>();
auto b = make_shared_intrusive<MyContainer<MyStruct>>();

Clearly, this code will not behave the same way as the original: when MyContainer<MyClass> is instantiated, the MyClass object is embedded by value in MyContainer::field instead of living on the heap, so assignments copy the object rather than the reference. Placing the MyStruct structure in the field, on the other hand, is entirely correct, as it matches C# behavior.
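The difference in behavior is easy to observe. The following sketch drops the System::Object base and uses std::shared_ptr as a stand-in for the intrusive pointer. The two helper functions are hypothetical and exist only for this demonstration.

```cpp
#include <cassert>
#include <memory>

template <typename T>
struct MyContainer {
    T field;
    void Set(T val) { field = val; }
};

struct MyClass { int value = 0; };   // a reference type in the C# original

// Naive port: the container stores its own copy of the object.
bool naive_port_shares_state() {
    MyClass obj;
    MyContainer<MyClass> c;
    c.Set(obj);
    obj.value = 42;                  // changes the original only
    return c.field.value == 42;
}

// Pointer-based port: the container stores a reference to the same object.
bool pointer_port_shares_state() {
    auto obj = std::make_shared<MyClass>();
    MyContainer<std::shared_ptr<MyClass>> c;
    c.Set(obj);
    obj->value = 42;                 // visible through the container as well
    return c.field->value == 42;
}
```

In C#, a change made through one reference is visible through every other reference to the same object. Only the pointer-based instantiation reproduces that.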

This situation can be resolved in two ways:

  1. Transition from the semantics of MyContainer<MyClass> to the semantics of MyContainer<intrusive_ptr<MyClass>>:
auto a = make_shared_intrusive<MyContainer<intrusive_ptr<MyClass>>>();
  2. Create two specializations for each template class: one for cases where the type argument is a value type, and another for cases where it is a reference type:
template <typename T, bool is_T_reference_type = is_reference_type_v<T>> class MyContainer : public virtual System::Object
{
public:
    T field;
    void Set(T val)
    {
        field = val;
    }
};

template <typename T> class MyContainer<T, true> : public virtual System::Object
{
public:
    intrusive_ptr<T> field;
    void Set(intrusive_ptr<T> val)
    {
        field = val;
    }
};

Besides the verbosity, which grows exponentially with each additional type parameter, the second approach has the drawback that every context using MyContainer<T> must know whether T is a value type or a reference type. This is often undesirable, for example when we want to minimize the number of included headers or completely hide information about certain internal types.
Additionally, the choice of reference type (strong or weak) can only be made once per container. This means it becomes impossible to have both a List of strong references and a List of weak references, even though the code of the converted products requires both variants.

Considering these factors, it was decided to port MyContainer<MyClass> using the semantics of MyContainer<System::SharedPtr<MyClass>> or MyContainer<System::WeakPtr<MyClass>> for weak references. Since the most popular libraries do not provide pointers with the required characteristics, we developed our own implementations, named System::SharedPtr—a strong reference using an in-object reference counter—and System::WeakPtr—a weak reference using an external reference counter. The System::MakeObject function is responsible for creating objects in the style of std::make_shared.
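The practical benefit of passing the pointer type as the template argument can be sketched with standard types. Here std::shared_ptr and std::weak_ptr stand in for System::SharedPtr and System::WeakPtr, and List is a hypothetical, heavily reduced container, not the ported System collection.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// One container template: the element type already carries the ownership
// semantics, so no specialization per kind of reference is needed.
template <typename T>
class List {
public:
    void Add(const T& item) { items_.push_back(item); }
    const T& At(std::size_t i) const { return items_.at(i); }
    std::size_t Count() const { return items_.size(); }
private:
    std::vector<T> items_;
};

struct MyClass { int value = 7; };
```

A `List<std::shared_ptr<MyClass>>` keeps its elements alive, while a `List<std::weak_ptr<MyClass>>` only observes them. Both are instantiated from the same template and can coexist in one program.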
