27 March 2025

Porting C# Code to C++: The SmartPtr Implementation

From the very beginning, the task involved porting several projects containing up to several million lines of code. Essentially, the technical specification for the translator boiled down to the phrase "ensure all of this ports and runs correctly in C++". The work of those responsible for releasing C++ products involves translating the code, running tests, preparing release packages, and so on. The problems encountered typically fall into one of several categories:

  1. The code doesn't translate to C++ - the translator terminates with an error.
  2. The code translates to C++, but it doesn't compile.
  3. The code compiles, but it doesn't link.
  4. The code links and runs, but tests fail, or runtime crashes occur.
  5. Tests pass, but issues arise during their execution that are not directly related to the product's functionality. Examples include: memory leaks, poor performance, etc.

Progression through this list is top-down — for example, without resolving compilation issues in the translated code, it's impossible to verify its functionality and performance. Consequently, many long-standing problems were only discovered during the later stages of working on the CodePorting.Translator Cs2Cpp project.

Initially, when fixing simple memory leaks caused by circular dependencies between objects, we applied the CppWeakPtr attribute to fields, resulting in fields of type WeakPtr. As long as WeakPtr could be converted to SharedPtr by calling the lock() method or implicitly (which is syntactically more convenient), this didn't cause problems. However, we later also had to make references contained within containers weak, using special syntax for the CppWeakPtr attribute, and this led to a couple of unpleasant surprises.
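The kind of leak described above can be reproduced with the standard library alone. The following is a minimal sketch in which std::shared_ptr and std::weak_ptr stand in for the translator's SharedPtr and WeakPtr; the names StrongNode, WeakNode, live_objects and objects_left_after_scope are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical types for illustration; std::shared_ptr / std::weak_ptr
// stand in for the translator's SharedPtr / WeakPtr.
int live_objects = 0;

struct StrongNode {
    std::shared_ptr<StrongNode> parent;  // strong back-reference: forms a cycle
    std::shared_ptr<StrongNode> child;
    StrongNode() { ++live_objects; }
    ~StrongNode() { --live_objects; }
};

struct WeakNode {
    std::weak_ptr<WeakNode> parent;      // weak back-reference: no cycle
    std::shared_ptr<WeakNode> child;
    WeakNode() { ++live_objects; }
    ~WeakNode() { --live_objects; }
};

// Builds a parent/child pair, drops the local references, and returns
// how many of the two objects were still alive at that point.
template <typename Node>
int objects_left_after_scope() {
    const int before = live_objects;
    Node* raw_parent = nullptr;
    {
        auto parent = std::make_shared<Node>();
        auto child = std::make_shared<Node>();
        parent->child = child;
        child->parent = parent;
        raw_parent = parent.get();
    }
    const int left = live_objects - before;
    if (left > 0) {
        // Break the cycle by hand so the demonstration itself does not leak.
        auto detached = std::move(raw_parent->child);
    }
    assert(live_objects == before);  // everything has been released by now
    return left;
}
```

With the strong back-reference both objects outlive the scope, because each keeps the other's counter above zero; with the weak back-reference both are destroyed as soon as the local pointers go away.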

The first sign of trouble with our adopted approach was that, from a C++ perspective, MyContainer<SharedPtr<MyClass>> and MyContainer<WeakPtr<MyClass>> are two different types. Consequently, they cannot be stored in the same variable, passed to the same method, returned from it, and so on. The attribute, originally intended solely for managing how references are stored in object fields, started appearing in increasingly odd contexts, affecting return values, arguments, local variables, etc. The translator code responsible for handling it became more complex day by day.
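The type mismatch can be shown with standard-library types in place of MyContainer, SharedPtr and WeakPtr. This is only an analogy: StrongList, WeakList and lock_all are hypothetical names, and std::vector plays the role of the container.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <vector>

struct MyClass {};

// Stand-ins for MyContainer<SharedPtr<MyClass>> and MyContainer<WeakPtr<MyClass>>.
using StrongList = std::vector<std::shared_ptr<MyClass>>;
using WeakList = std::vector<std::weak_ptr<MyClass>>;

// The two instantiations are unrelated types with no implicit conversion,
// so one cannot be passed where the other is expected.
static_assert(!std::is_same<StrongList, WeakList>::value, "distinct types");
static_assert(!std::is_convertible<WeakList, StrongList>::value, "no conversion");
static_assert(!std::is_convertible<StrongList, WeakList>::value, "no conversion");

// The only way across is an explicit element-by-element copy.
StrongList lock_all(const WeakList& weak) {
    StrongList strong;
    for (const auto& w : weak) {
        if (auto s = w.lock()) strong.push_back(s);  // skip expired entries
    }
    assert(strong.size() <= weak.size());
    return strong;
}
```

Every method that should accept both kinds of container would need such a copy or a second overload, which is why the attribute kept spreading into signatures and local variables.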

The second problem was also something we hadn't anticipated. C# programmers, it turned out, found it natural to keep a single associative collection per object that holds both owning references to objects that belong to the current object and are otherwise unreachable, and references back to parent objects. This was done to optimize reading certain file formats, but for us it meant that the same collection could contain both strong and weak references. The pointer type was no longer enough to determine how a reference behaves.

Reference Type as Part of Pointer State

Clearly, these two problems couldn't be solved within the existing paradigm, and the pointer types were reconsidered. The result of this revised approach was the SmartPtr class, featuring a set_Mode() method that accepts one of two values: SmartPtrMode::Shared and SmartPtrMode::Weak. All SmartPtr constructors accept these same values. Consequently, each pointer instance can exist in one of two states:

  1. Strong reference: the reference counter is encapsulated within the object;
  2. Weak reference: the reference counter is external to the object.

Switching between modes can occur at runtime and at any moment. The weak reference counter is not created until at least one weak reference to the object exists.
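The two states and the lazily created weak counter can be sketched roughly as follows. This is not the actual SmartPtr or System::Object code: WeakBlock, strong_count(), has_weak_block(), get(), get_Mode() and Node are assumptions made for the illustration, the sketch is not thread-safe, and assignment, casts and aliasing are left out.

```cpp
#include <cassert>
#include <utility>

enum class SmartPtrMode { Shared, Weak };

// External counter used only by weak references; created on first use.
struct WeakBlock {
    int weak_count = 0;
    bool alive = true;
};

// Stand-in for System::Object: the strong counter lives inside the object.
class Object {
public:
    Object() = default;
    Object(const Object&) = delete;
    Object& operator=(const Object&) = delete;
    virtual ~Object() {
        if (weak_block_ == nullptr) return;
        weak_block_->alive = false;  // remaining weak pointers now read as null
        if (weak_block_->weak_count == 0) delete weak_block_;
    }
    int strong_count() const { return strong_count_; }
    bool has_weak_block() const { return weak_block_ != nullptr; }

private:
    template <typename T> friend class SmartPtr;
    WeakBlock* weak_block() {
        if (weak_block_ == nullptr) weak_block_ = new WeakBlock;
        return weak_block_;
    }
    int strong_count_ = 0;
    WeakBlock* weak_block_ = nullptr;
};

template <typename T>
class SmartPtr {
public:
    explicit SmartPtr(T* ptr = nullptr, SmartPtrMode mode = SmartPtrMode::Shared)
        : ptr_(ptr), mode_(mode) { acquire(); }
    SmartPtr(const SmartPtr& other) : ptr_(other.get()), mode_(other.mode_) { acquire(); }
    SmartPtr& operator=(const SmartPtr&) = delete;  // left out to keep the sketch short
    ~SmartPtr() { release(); }

    // Returns nullptr for an empty pointer or an expired weak reference.
    T* get() const {
        if (mode_ == SmartPtrMode::Weak && (block_ == nullptr || !block_->alive)) return nullptr;
        return ptr_;
    }
    T* operator->() const { assert(get() != nullptr); return ptr_; }
    SmartPtrMode get_Mode() const { return mode_; }

    // Switches the mode at runtime: take the new kind of reference first,
    // then let the temporary drop the old one.
    void set_Mode(SmartPtrMode mode) {
        if (mode == mode_) return;
        SmartPtr replacement(get(), mode);
        std::swap(ptr_, replacement.ptr_);
        std::swap(block_, replacement.block_);
        std::swap(mode_, replacement.mode_);
    }

private:
    void acquire() {
        if (ptr_ == nullptr) return;
        if (mode_ == SmartPtrMode::Shared) {
            ++ptr_->strong_count_;
        } else {
            block_ = ptr_->weak_block();
            ++block_->weak_count;
        }
    }
    void release() {
        if (ptr_ == nullptr) return;
        if (mode_ == SmartPtrMode::Shared) {
            if (--ptr_->strong_count_ == 0) delete ptr_;
        } else if (--block_->weak_count == 0 && !block_->alive) {
            delete block_;
        }
    }
    T* ptr_;
    WeakBlock* block_ = nullptr;
    SmartPtrMode mode_;
};

struct Node : Object { int value = 42; };
```

In this sketch, switching the last strong pointer to weak mode destroys the object, and every remaining weak pointer then reports nullptr from get(). The extra branch on the mode in acquire() and release() is the per-operation check that makes such a pointer slower than a plain reference-counted one.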

The full list of features supported by our pointer is as follows:

  1. Strong reference storage: object lifetime management via reference counting.
  2. Weak reference storage for an object.
  3. intrusive_ptr semantics: any number of pointers created for the same object will share a single reference counter.
  4. Dereferencing and the arrow operator (->): for accessing the object pointed to.
  5. A full set of constructors and assignment operators.
  6. Separation of the pointed-to object and the reference-counted object (aliasing constructor): since our clients' libraries work with documents, it's often necessary for a pointer to a document element to keep the entire document alive.
  7. A full set of casts.
  8. A full set of comparison operations.
  9. Assignment and deletion of pointers: these work with incomplete types.
  10. A set of methods for checking and changing the pointer's state: aliasing mode, reference storage mode, object reference count, etc.
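The aliasing behavior from item 6 has a direct counterpart in the standard library, which is used here as a stand-in because the SmartPtr aliasing constructor's exact signature is not shown in this article. Document, Paragraph and paragraph_of are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct Paragraph { std::string text; };
struct Document { std::vector<Paragraph> paragraphs; };

// Returns a pointer to one paragraph that shares ownership of the whole
// document, using the aliasing constructor of std::shared_ptr.
std::shared_ptr<Paragraph> paragraph_of(const std::shared_ptr<Document>& doc,
                                        std::size_t index) {
    assert(doc != nullptr);
    return std::shared_ptr<Paragraph>(doc, &doc->paragraphs.at(index));
}
```

The returned pointer dereferences to the paragraph but counts references on the document, so the document stays alive for as long as any element pointer does, even after the caller releases its own document pointer.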

The SmartPtr class is templated and contains no virtual methods. It is tightly coupled with the System::Object class, which handles reference counter storage, and works exclusively with its derived classes.

There are deviations from typical pointer behavior:

  1. Moving (move constructor, move assignment operator) does not transfer the entire state: the reference type (weak/strong) is preserved.
  2. Accessing an object via a weak reference does not require locking (creating a temporary strong reference), because an approach where the arrow operator returns a temporary object severely degrades performance for strong references.

To maintain compatibility with old code, the SharedPtr type became an alias for SmartPtr. The WeakPtr class now inherits from SmartPtr, adding no fields, and merely overrides the constructors to always create weak references.
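The relationship between the three names might look like the sketch below. The SmartPtr here is a reduced stand-in that records only the target and the mode and does no reference counting; get(), get_Mode(), is_weak and MyClass are assumptions made for the illustration.

```cpp
#include <cassert>
#include <type_traits>

enum class SmartPtrMode { Shared, Weak };

struct MyClass {};

// Reduced stand-in for SmartPtr: it records only the target and the mode,
// with no reference counting, to keep the focus on the type relationships.
template <typename T>
class SmartPtr {
public:
    SmartPtr(T* ptr = nullptr, SmartPtrMode mode = SmartPtrMode::Shared)
        : ptr_(ptr), mode_(mode) {}
    T* get() const { return ptr_; }
    SmartPtrMode get_Mode() const { return mode_; }

private:
    T* ptr_;
    SmartPtrMode mode_;
};

// Old code that names SharedPtr keeps compiling: it is the same type.
template <typename T>
using SharedPtr = SmartPtr<T>;

// WeakPtr adds no fields; its constructors only force the weak mode.
template <typename T>
class WeakPtr : public SmartPtr<T> {
public:
    WeakPtr(T* ptr = nullptr) : SmartPtr<T>(ptr, SmartPtrMode::Weak) {}
    WeakPtr(const SmartPtr<T>& other) : SmartPtr<T>(other.get(), SmartPtrMode::Weak) {
        assert(this->get_Mode() == SmartPtrMode::Weak);
    }
};

static_assert(std::is_same<SharedPtr<MyClass>, SmartPtr<MyClass>>::value, "alias");
static_assert(sizeof(WeakPtr<MyClass>) == sizeof(SmartPtr<MyClass>), "no extra fields");

// A single signature now serves both kinds of pointer.
template <typename T>
bool is_weak(const SmartPtr<T>& p) { return p.get_Mode() == SmartPtrMode::Weak; }
```

Because WeakPtr derives from SmartPtr and has the same size, a function or container declared in terms of SmartPtr can accept either kind, which removes the type split that the attribute-based approach ran into.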

Containers are now always ported with MyContainer<SmartPtr<MyClass>> semantics, and the type of stored references is chosen at runtime. For containers written manually based on STL data structures (primarily containers from the System namespace), the default reference type is set using a custom allocator, while still allowing the mode to be changed for individual container elements. For translated containers, the necessary code for switching the reference storage mode is generated by the translator.

The drawbacks of this solution primarily include reduced performance during pointer creation, copying, and deletion operations, as a mandatory check of the reference type is added to the usual reference counting. Specific numbers heavily depend on the test structure. Discussions are currently underway about generating more optimal code in places where the pointer type is guaranteed not to change.

Preparing Code for Translation

Our porting method requires manually placing attributes in the source C# code to mark where references should be weak. Code where these attributes are not correctly placed will cause memory leaks and, in some cases, other errors after translation. Code with attributes looks something like this:

struct S {
    MyClass s; // Strong reference to object

    [CppWeakPtr]
    MyClass w; // Weak reference to object

    MyContainer<MyClass> s_s; // Strong reference to a container of strong references

    [CppWeakPtr]
    MyContainer<MyClass> w_s; // Weak reference to a container of strong references

    [CppWeakPtr(0)]
    MyContainer<MyClass> s_w; // Strong reference to a container of weak references

    [CppWeakPtr(1)]
    Dictionary<MyClass, MyClass> s_s_w; // Strong reference to a container where keys are stored by strong references, and values by weak references

    [CppWeakPtr, CppWeakPtr(0)]
    Dictionary<MyClass, MyClass> w_w_s; // Weak reference to a container where keys are stored by weak references, and values by strong references
}

In some cases, it's necessary to manually call the SmartPtr class's aliasing constructor or its method that sets the stored reference type. We try to avoid editing the ported code, as such changes have to be reapplied after each translator run. Instead, we aim to keep such code within the C# source. We have two ways to achieve this:

  1. We can declare a service method in C# code that does nothing, and during translation, replace it with a manually written equivalent that performs the necessary operation:
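// C# source: an empty placeholder method
class Service {
    public static void SetWeak<T>(T arg) {}
}

// C++ replacement, written by hand and substituted during translation
class Service {
public:
    template <typename T> static void SetWeak(SmartPtr<T> &arg)
    {
        arg.set_Mode(SmartPtrMode::Weak);
    }
};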
  2. We can place specially formatted comments in the C# code, which the translator converts into C++ code:
class MyClass {
    private Dictionary<string, object> data;
    public void Add(string key, object value)
    {
        data.Add(key, value);
        //CPPCODE: if (key == u"Parent") data->data()[key].set_Mode(SmartPtrMode::Weak);
    }
}

Here, the data() method in System::Collections::Generic::Dictionary returns a reference to the underlying std::unordered_map of this container.
