C# vs. CLR
Summary
While it is easy to think of C# and the Common Language Runtime (CLR) as one cohesive unit, there are differences between the semantics of C# and the semantics of the Common Type System (CTS) in the CLR. The way in which the C# compiler implements its semantics on top of the CTS is observable and may be surprising. Specifically, the act of creating a subclass can change the compiled definition of the base class.
The long story
The SDK for my work robot implements its own RPC system to talk to the
embedded controller. The interface we expose is a proxy built on top of the
RealProxy and TransparentProxy classes from .NET remoting. I decided to
switch these proxies to be built on top of the DynamicProxy from CoreFX.
One of the reasons I made this change was to allow some methods of the class to
be implemented on the client side. The idea is that a base class of the proxy can implement
just the methods from an interface it wants to execute locally. The generated proxy
subclass will fill in all the methods the base class did not define and complete the
implementation of the interface. In C#, it's perfectly valid to have a subclass implement an
interface in this way, even if the methods on the base class are not virtual.
However, the CreateType method on TypeBuilder was throwing a
System.TypeLoadException, with the error complaining that the method
"does not have an implementation". I was able to fix this by marking the relevant
members as virtual. However, I was not able to reproduce the exception in my
simple test program:
using System;
using System.Reflection;
using System.Reflection.Emit;

class Program
{
    static void Main(string[] args)
    {
        // Build a dynamic type that derives from MyBaseClass and claims to
        // implement IHasName without defining any members of its own.
        var asm = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName("testasm"), AssemblyBuilderAccess.Run);
        var mod = asm.DefineDynamicModule("testmod");
        var tb = mod.DefineType("MyGeneratedType",
            TypeAttributes.Public, typeof(MyBaseClass));
        tb.AddInterfaceImplementation(typeof(IHasName));

        // CreateType is the call that can throw TypeLoadException.
        var instance = (IHasName)Activator.CreateInstance(
            tb.CreateType());
        Console.WriteLine(instance.Name);
    }
}

public interface IHasName
{
    string Name { get; }
}

public class MyBaseClass
{
    // Not virtual, and MyBaseClass does not declare IHasName.
    public string Name { get { return "Inigo Montoya"; } }
}

public class MySubClass : MyBaseClass, IHasName { }
In this little program, I have a class MyBaseClass that has the members
required by the interface IHasName, but does not actually declare that it
implements it. Also note that the Name property is not marked as virtual.
This program creates two subclasses of MyBaseClass that implement IHasName:
MySubClass is created using C#, and MyGeneratedType is created using
System.Reflection.Emit.
The System.Reflection.Emit approach appeared to work in the same way as the C#
version, until I commented out the definition of MySubClass (the last line of
the listing). Oddly, the existence of this subclass determined whether or not
the generated subclass was able to implement the interface!
What's going on here?
Obviously, the C# compiler was doing more than I expected. To find out what it was
doing, I compiled the program twice, once with the subclass and once without. I
then used ILDasm to dump the Microsoft Intermediate Language (MSIL) representation
of the programs and diffed them. Below is the relevant portion of the diff
between the two:
--- a/no_subclass.il
+++ b/subclass_exists.il
@@ -128,7 +128,7 @@
 .class public auto ansi beforefieldinit MyBaseClass
        extends [mscorlib]System.Object
 {
-  .method public hidebysig specialname
+  .method public hidebysig newslot specialname virtual final
           instance string get_Name() cil managed
   {
     // Code size 11 (0xb)
By adding the subclass, the member on the base class is now marked as newslot,
virtual, and final. The newslot and virtual keywords make this method
appear in the V-Table so that it can take part in dynamic dispatch, but the final
keyword makes the member respect the C# code's wish to make this member non-overridable.
These contradictory attributes are reminiscent of how a static class in C# is
implemented by marking the class as both sealed and abstract.
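The same change can be observed at run time through reflection, without reaching for ILDasm. The following is a minimal sketch that reuses the types from the test program; MetadataProbe and StaticHelper are names invented here for illustration. Going by the diff above, the getter should report as both virtual and final while MySubClass lives in the same assembly, and as neither once MySubClass is commented out and the program is rebuilt.

```csharp
using System;
using System.Reflection;

public interface IHasName
{
    string Name { get; }
}

public class MyBaseClass
{
    // Declared non-virtual in C#.
    public string Name { get { return "Inigo Montoya"; } }
}

// Comment this class out and rebuild to compare the reported flags.
public class MySubClass : MyBaseClass, IHasName { }

// Hypothetical empty static class, used only to inspect its type flags.
public static class StaticHelper { }

public static class MetadataProbe
{
    // Returns the getter method the compiler emitted for MyBaseClass.Name.
    public static MethodInfo BaseGetter()
    {
        return typeof(MyBaseClass).GetProperty("Name").GetGetMethod();
    }

    public static void Main()
    {
        MethodInfo getter = BaseGetter();
        Console.WriteLine("get_Name IsVirtual: " + getter.IsVirtual);
        Console.WriteLine("get_Name IsFinal:   " + getter.IsFinal);

        // A C# static class is emitted as both abstract and sealed.
        Type helper = typeof(StaticHelper);
        Console.WriteLine("static class IsAbstract: " + helper.IsAbstract);
        Console.WriteLine("static class IsSealed:   " + helper.IsSealed);
    }
}
```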
Cross Assembly Inheritance
After seeing how the C# compiler implements interfaces on non-virtual methods when both the base class and subclass live in the same assembly, the obvious question to ask is how this works when the base class is in a different assembly. Surely the other assembly is not modified, yet this scenario works. The code that the C# compiler generates in this case is roughly equivalent to explicitly implementing the interface and forwarding the call to the base class:
public class MySubClass : MyBaseClass, IHasName
{
    string IHasName.Name
    {
        get { return base.Name; }
    }
}
I say "roughly equivalent" because there is a small difference between code generated
by the compiler and what you are able to express using C#. The above code generates
both a property called Name and a method called get_Name. If you leave it up
to the C# compiler, however, you get only the method named get_Name.
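A code generator can emulate this forwarding approach itself. The sketch below is one way it might look with System.Reflection.Emit, not the code my SDK uses: it emits a private stub that calls the non-virtual base getter and then binds that stub to the interface method with DefineMethodOverride. The names StubGenerator, MyForwardingType, and the stub name "IHasName.get_Name" are assumptions chosen for illustration, and it uses AssemblyBuilder.DefineDynamicAssembly rather than the AppDomain call from the test program so that it is not tied to the .NET Framework.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public interface IHasName
{
    string Name { get; }
}

public class MyBaseClass
{
    // Non-virtual, and no C# subclass implements IHasName in this sketch.
    public string Name { get { return "Inigo Montoya"; } }
}

public static class StubGenerator
{
    public static Type BuildForwardingType()
    {
        var asm = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("stubasm"), AssemblyBuilderAccess.Run);
        var mod = asm.DefineDynamicModule("stubmod");
        var tb = mod.DefineType("MyForwardingType",
            TypeAttributes.Public, typeof(MyBaseClass));
        tb.AddInterfaceImplementation(typeof(IHasName));

        MethodInfo interfaceGetter = typeof(IHasName).GetProperty("Name").GetGetMethod();
        MethodInfo baseGetter = typeof(MyBaseClass).GetProperty("Name").GetGetMethod();

        // The stub occupies a new virtual slot but cannot be overridden further.
        MethodBuilder stub = tb.DefineMethod(
            "IHasName.get_Name",
            MethodAttributes.Private | MethodAttributes.HideBySig |
            MethodAttributes.SpecialName | MethodAttributes.NewSlot |
            MethodAttributes.Virtual | MethodAttributes.Final,
            typeof(string), Type.EmptyTypes);

        // Body: return base.Name; (a non-virtual call to the base getter).
        ILGenerator il = stub.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Call, baseGetter);
        il.Emit(OpCodes.Ret);

        // Tell the CLR that the stub is the implementation of IHasName.get_Name.
        tb.DefineMethodOverride(stub, interfaceGetter);
        return tb.CreateType();
    }

    public static void Main()
    {
        var instance = (IHasName)Activator.CreateInstance(BuildForwardingType());
        Console.WriteLine(instance.Name);
    }
}
```

Like the compiler-generated version, this emits only a method and no property on the generated type.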
Why do I care?
These sorts of details affect you if you are creating code-generating tools that directly generate .NET classes and MSIL without going through a C# compiler. You have to be aware of the division of responsibility between the C# compiler and the CLR when trying to emulate the semantics of C# with your code generator. In my case, I simplified things by making a rule that all members on the base class have to be marked as virtual. This is easy to verify with automated testing and frees me from having to generate stub functions to emulate the C# behavior.
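A check for the "everything is virtual" rule could look something like the sketch below. It is an illustration and not the test from my SDK: ProxyBaseRules, OverridableBase, and NonVirtualBase are hypothetical names, and the check only looks at public instance methods declared directly on the type (property getters and setters are included, since they are methods in metadata).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical base class that follows the rule.
public class OverridableBase
{
    public virtual string Name { get { return "Inigo Montoya"; } }
}

// Hypothetical base class that breaks the rule.
public class NonVirtualBase
{
    public string Name { get { return "Inigo Montoya"; } }
}

public static class ProxyBaseRules
{
    // Lists public instance methods that a generated subclass could not
    // override: either not virtual at all, or virtual but marked final.
    public static List<string> FindNonOverridableMethods(Type baseType)
    {
        return baseType
            .GetMethods(BindingFlags.Instance | BindingFlags.Public |
                        BindingFlags.DeclaredOnly)
            .Where(m => !m.IsVirtual || m.IsFinal)
            .Select(m => m.Name)
            .ToList();
    }

    public static void Main()
    {
        foreach (Type t in new[] { typeof(OverridableBase), typeof(NonVirtualBase) })
        {
            List<string> offenders = FindNonOverridableMethods(t);
            Console.WriteLine(t.Name + ": " +
                (offenders.Count == 0 ? "ok" : string.Join(", ", offenders)));
        }
    }
}
```

Checking IsFinal as well as IsVirtual matters here, because a member the compiler has quietly marked virtual and final still cannot be overridden by a generated subclass.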
Acknowledgements
Thanks to Caspar for reviewing drafts of this post.