20 February 2025
When developing a code translator from C# to Java, there are no issues with deleting unused objects: Java provides a garbage collection mechanism that is sufficiently similar to the one in C#, and the translated code using classes simply compiles and works. C++ is a different story. Clearly, mapping references to raw pointers will not yield the desired results, as such translated code will not delete anything. Meanwhile, C# developers, accustomed to working in a GC environment, will continue writing code that creates many temporary objects.
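To ensure timely deletion of objects in the converted code, we had to choose among three options: (1) reference counting with smart pointers, (2) a garbage collector for C++, or (3) static analysis of the code to determine where objects should be deleted.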
The third option was immediately dismissed: the complexity of algorithms in the ported libraries proved prohibitive. Additionally, static analysis would need to extend to client code using these converted libraries.
The second option also appeared impractical: since we were porting libraries rather than applications, imposing a garbage collector would introduce constraints on client code using these libraries. Experiments in this direction were deemed unsuccessful.
Thus, we arrived at the last remaining option—using smart pointers with reference counting, which is fairly typical for C++. This, in turn, meant that to solve the problem of circular references, we would need to use weak references in addition to strong ones.
There are several well-known types of smart pointers.
shared_ptr might seem like the most obvious choice, but it has a significant drawback: it stores the reference counter on the heap, separate from the object, even when using enable_shared_from_this. Allocating and deallocating memory for the reference counter is a relatively expensive operation.
intrusive_ptr is better in this regard, as having an unused 4/8-byte field within the structure is a lesser evil compared to the overhead of an additional allocation for each temporary object.
Now, consider the following C# code:
class Document
{
    private Node node;
    public Document()
    {
        node = new Node(this);
    }
    public void Prepare(Node n) { ... }
}

class Node
{
    private Document document;
    public Node(Document d)
    {
        document = d;
        d.Prepare(this);
    }
}
When using intrusive_ptr, this code would be translated into something like the following:
class Document : public virtual System::Object
{
    intrusive_ptr<Node> node;
public:
    Document()
    {
        node = make_shared_intrusive<Node>(this);
    }
    void Prepare(intrusive_ptr<Node> n) { ... }
};

class Node : public virtual System::Object
{
    intrusive_ptr<Document> document;
public:
    Node(intrusive_ptr<Document> d)
    {
        document = d;
        d->Prepare(this);
    }
};
Here, three issues become immediately apparent:
1. A way is needed to break the circular reference, for example by making Node::document a weak reference.
2. There must be a way to convert this into an intrusive_ptr (analogous to shared_from_this). If we instead start changing method signatures (e.g., making Document::Prepare accept Node* instead of intrusive_ptr<Node>), problems will arise when calling the same methods with already constructed objects or managing object lifetimes.
3. Converting this into an intrusive_ptr during object construction, followed by decrementing the reference count to zero (as happens, for example, in the Node constructor when exiting Document::Prepare), must not immediately delete the partially constructed object, which has no external references yet.
The first issue was addressed manually, as even a human often struggles to determine which of several references should be weak. In some cases, there is no clear answer, requiring changes to the C# code.
For example, in one project, there was a pair of classes: “print action” and “print action parameters.” The constructor of each created the paired object and established bidirectional references. Clearly, turning one of these references into a weak one would break the usage scenario. Ultimately, we decided to use the [CppWeakPtr] attribute, instructing the translator that the corresponding field should contain a weak reference instead of a strong one.
The second problem is easily solved if intrusive_ptr allows conversion from a raw pointer, which this is. The Boost implementation provides this capability.
Finally, the third problem was resolved by introducing a local RAII guard variable in the constructor. This guard increments the reference count of the current object upon creation and decrements it upon destruction. Importantly, decrementing the reference count to zero within the guard does not delete the protected object.
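The guard idea can be illustrated with a minimal, single-threaded sketch. All names here (Object, IntrusivePtr, ThisGuard) are hypothetical stand-ins written for this illustration and are not the actual translator runtime. The sketch assumes an in-object counter and a pointer class that converts implicitly from a raw pointer such as this.

```cpp
#include <cassert>
#include <utility>

// Hypothetical base class with an in-object (intrusive) reference counter.
class Object {
public:
    virtual ~Object() = default;
    void AddRef() { ++refs_; }
    // Returns true when the count has dropped to zero; the caller decides what to do.
    bool RemoveRef() { return --refs_ == 0; }
    int RefCount() const { return refs_; }
private:
    int refs_ = 0;
};

// Hypothetical intrusive pointer, implicitly constructible from a raw pointer.
template <typename T>
class IntrusivePtr {
public:
    IntrusivePtr(T* p = nullptr) : p_(p) { if (p_) p_->AddRef(); }
    IntrusivePtr(const IntrusivePtr& o) : p_(o.p_) { if (p_) p_->AddRef(); }
    IntrusivePtr& operator=(IntrusivePtr o) { std::swap(p_, o.p_); return *this; }
    ~IntrusivePtr() { if (p_ && p_->RemoveRef()) delete p_; }
    T* operator->() const { return p_; }
private:
    T* p_;
};

// Guard for constructors: holds one reference while alive and, unlike
// IntrusivePtr, never deletes the object when the count returns to zero.
class ThisGuard {
public:
    explicit ThisGuard(Object* self) : self_(self) { self_->AddRef(); }
    ~ThisGuard() { self_->RemoveRef(); }  // result ignored on purpose: no delete
    ThisGuard(const ThisGuard&) = delete;
    ThisGuard& operator=(const ThisGuard&) = delete;
private:
    Object* self_;
};

static int g_destroyed = 0;  // counts Node destructor calls for the demo

class Node : public Object {
public:
    Node() {
        ThisGuard guard(this);  // without this, Register() would delete *this
        Register(this);         // this converts to a temporary IntrusivePtr<Node>
    }
    ~Node() override { ++g_destroyed; }
    // Takes a strong pointer by value, as the translated methods do.
    static void Register(IntrusivePtr<Node> n) { assert(n->RefCount() == 2); }
};
```

Once construction finishes, the object's count is back at zero and the first external IntrusivePtr takes ownership, for example `IntrusivePtr<Node> n(new Node);`.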
With these changes, the code before and after translation looks roughly like this:
class Document
{
    private Node node;
    public Document()
    {
        node = new Node(this);
    }
    public void Prepare(Node n) { ... }
}

class Node
{
    [CppWeakPtr] private Document document;
    public Node(Document d)
    {
        document = d;
        d.Prepare(this);
    }
}
class Document : public virtual System::Object
{
    intrusive_ptr<Node> node;
public:
    Document()
    {
        System::Details::ThisProtector guard(this);
        node = make_shared_intrusive<Node>(this);
    }
    void Prepare(intrusive_ptr<Node> n) { ... }
};

class Node : public virtual System::Object
{
    weak_intrusive_ptr<Document> document;
public:
    Node(intrusive_ptr<Document> d)
    {
        System::Details::ThisProtector guard(this);
        document = d;
        d->Prepare(this);
    }
};
Thus, any implementation of intrusive_ptr that meets our requirements and is complemented by a paired weak_intrusive_ptr class will suffice. The latter must rely on a reference counter located on the heap outside the object. Since creating weak references is a relatively rare operation compared to creating temporary objects, separating the reference counter into a strong one (inside the object) and a weak one (outside the object) provided a performance boost in real-world code.
The situation becomes significantly more complicated because we need to translate code for generic classes and methods, where type parameters can be either value types or reference types. For example, consider the following C# code:
class MyContainer<T>
{
    public T field;
    public void Set(T val)
    {
        field = val;
    }
}

class MyClass {}
struct MyStruct {}

var a = new MyContainer<MyClass>();
var b = new MyContainer<MyStruct>();
A straightforward porting approach yields the following result:
template <typename T> class MyContainer : public virtual System::Object
{
public:
    T field;
    void Set(T val)
    {
        field = val;
    }
};

class MyClass : public virtual System::Object {};
class MyStruct : public System::Object {};

auto a = make_shared_intrusive<MyContainer<MyClass>>();
auto b = make_shared_intrusive<MyContainer<MyStruct>>();
Clearly, this code will not behave the same way as the original, because when instantiating MyContainer<MyClass>, the field object moves from the heap into the MyContainer field, breaking the reference copying semantics. At the same time, placing the MyStruct structure in the field is entirely correct, as it aligns with C# behavior.
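This situation can be resolved in two ways:
1. By switching from the semantics of MyContainer<MyClass> to the semantics of MyContainer<intrusive_ptr<MyClass>>:
auto a = make_shared_intrusive<MyContainer<intrusive_ptr<MyClass>>>();
2. By providing a separate specialization of each generic class for reference types: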
template <typename T, bool is_T_reference_type = is_reference_type_v<T>> class MyContainer : public virtual System::Object
{
public:
    T field;
    void Set(T val)
    {
        field = val;
    }
};

template <typename T> class MyContainer<T, true> : public virtual System::Object
{
public:
    intrusive_ptr<T> field;
    void Set(intrusive_ptr<T> val)
    {
        field = val;
    }
};
Besides the verbosity, which grows exponentially with each additional type parameter, the second approach has the drawback that every context using MyContainer<T> must know whether T is a value type or a reference type. This is often undesirable, for example when we want to minimize the number of included headers or completely hide information about certain internal types.
Additionally, the choice of reference type (strong or weak) can only be made once per container. This means it becomes impossible to have both a List of strong references and a List of weak references, even though the code of the converted products requires both variants.
Considering these factors, it was decided to port MyContainer<MyClass> using the semantics of MyContainer<System::SharedPtr<MyClass>>, or MyContainer<System::WeakPtr<MyClass>> for weak references. Since the most popular libraries do not provide pointers with the required characteristics, we developed our own implementations: System::SharedPtr, a strong reference using an in-object reference counter, and System::WeakPtr, a weak reference using an external reference counter. The System::MakeObject function is responsible for creating objects in the style of std::make_shared.
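The effect of passing the pointer type as the template argument can be shown with a small sketch. Since System::SharedPtr and System::WeakPtr are not reproduced here, std::shared_ptr and std::weak_ptr are used as stand-ins; this is an assumption made purely for illustration, and the standard pointers differ in where they keep their counters. The function DemoSemantics is a hypothetical name.

```cpp
#include <cassert>
#include <memory>

// One template, written once; the caller picks value, strong, or weak semantics.
template <typename T>
class MyContainer {
public:
    T field{};
    void Set(T val) { field = val; }
};

struct MyClass { int value = 0; };   // plays the role of a C# class
struct MyStruct { int value = 0; };  // plays the role of a C# struct

// Returns true if all three instantiations behave like their C# counterparts.
inline bool DemoSemantics() {
    auto obj = std::make_shared<MyClass>();

    MyContainer<std::shared_ptr<MyClass>> strong;  // stand-in for System::SharedPtr
    MyContainer<std::weak_ptr<MyClass>> weak;      // stand-in for System::WeakPtr
    MyContainer<MyStruct> by_value;

    strong.Set(obj);
    weak.Set(obj);   // shared_ptr converts implicitly to weak_ptr
    obj->value = 42; // visible through both containers: reference semantics

    MyStruct s;
    by_value.Set(s);
    s.value = 7;     // not visible in the container: value semantics

    bool shared_ok = strong.field->value == 42 && weak.field.lock()->value == 42;
    bool weak_does_not_own = obj.use_count() == 2;  // obj + strong.field only
    bool value_ok = by_value.field.value == 0;
    return shared_ok && weak_does_not_own && value_ok;
}
```

Because the strength of the reference is part of the type argument, a container of strong references and a container of weak references can coexist without any specialization of MyContainer itself.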