Value types and parameterless constructors

There have been a couple of questions on StackOverflow about value types and parameterless constructors.

I learned quite a bit when answering both of these. When a further question about the default value of a type (particularly with respect to generics) came up, I thought it would be worth delving into a bit more depth. Very little of this is actually relevant most of the time, but it’s interesting nonetheless.

I won’t go over most of the details I discovered in my answer to the first question, but if you’re interested in the IL generated by the statement “x = new Guid();” then have a look there for more details.

Let’s start off with the first and most important thing I’ve learned about value types recently:

Yes, you can write a parameterless constructor for a value type in .NET

I very carefully wrote “in .NET” there – “in C#” would have been incorrect. I had always believed that the CLI spec prohibited value types from having parameterless constructors. (The C# spec uses the terminology in a slightly different way – it treats all value types as having a parameterless constructor. This makes the language more consistent for the most part, but it does give rise to some interesting behaviour which we’ll see later on.)
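To make the C# view concrete, here is a minimal sketch showing that, for ordinary value types, “new T()” and “default(T)” give the same zeroed value. The class and method names (DefaultValueDemo, NewEqualsDefault) are made up for illustration and are not part of any library.

```csharp
using System;

static class DefaultValueDemo
{
    // Hypothetical helper: true when "new T()" and "default(T)"
    // produce equal values for the value type T.
    public static bool NewEqualsDefault<T>() where T : struct
    {
        return new T().Equals(default(T));
    }

    static void Main()
    {
        // C# lets us write "new Guid()" even though Guid declares
        // no parameterless constructor of its own.
        Console.WriteLine(new Guid() == Guid.Empty);        // True
        Console.WriteLine(NewEqualsDefault<int>());         // True
        Console.WriteLine(NewEqualsDefault<DateTime>());    // True
    }
}
```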

It turns out that if you write your value type in IL, you can provide your own parameterless constructor with custom code without ilasm complaining at all. It’s possible that other languages targeting the CLI allow you to do this as well, but as I don’t know any, I’ll stick to IL. Unfortunately I don’t know IL terribly well, so I thought I’d just start off with some C# and go from there:

public struct Oddity
{
    public Oddity(int removeMe)
    {
        System.Console.WriteLine("Oddity constructor called");
    }
}

I compiled that into its own class library, and then disassembled it with ildasm /out:Oddity.il Oddity.dll. After changing the constructor to be parameterless, removing a few comments, and removing some compiler-generated assembly attributes, I ended up with this IL:

.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )  
  .ver 2:0:0:0
}
.assembly Oddity
{
  .hash algorithm 0x00008004
  .ver 0:0:0:0
}
.module Oddity.dll
.imagebase 0x00400000
.file alignment 0x00000200
.stackreserve 0x00100000
.subsystem 0x0003
.corflags 0x00000001

.class public sequential ansi sealed beforefieldinit Oddity
       extends [mscorlib]System.ValueType
{
  .pack 0
  .size 1
  .method public hidebysig specialname rtspecialname 
          instance void  .ctor() cil managed
  {
    .maxstack  8
    IL_0000:  nop
    IL_0001:  ldstr      "Oddity constructor called"
    IL_0006:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_000b:  nop
    IL_000c:  ret
  }
}

I reassembled this with ilasm /dll /out:Oddity.dll Oddity.il. So far, so good. We have a value type with a custom constructor in a class library. It doesn’t do anything particularly clever – it just logs that it’s been called. That’s enough for our test program.
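One way to check whether a type really declares a parameterless constructor in its metadata is reflection. The sketch below uses Type.GetConstructor with Type.EmptyTypes; the names ConstructorProbe, DeclaresParameterlessConstructor and PlainStruct are hypothetical. It only probes types available without the reassembled library, so the Oddity case is not shown; I would expect it to report a constructor there, but that is an assumption rather than something this snippet demonstrates.

```csharp
using System;

// Hypothetical struct written in C#, with no constructors of its own.
struct PlainStruct
{
    public int Value;
}

static class ConstructorProbe
{
    // True if the type declares a public parameterless instance constructor.
    public static bool DeclaresParameterlessConstructor(Type type)
    {
        return type.GetConstructor(Type.EmptyTypes) != null;
    }

    static void Main()
    {
        // Value types with no declared parameterless constructor have
        // nothing for reflection to find, even though C# allows "new PlainStruct()".
        Console.WriteLine(DeclaresParameterlessConstructor(typeof(PlainStruct))); // False
        Console.WriteLine(DeclaresParameterlessConstructor(typeof(Guid)));        // False
        // A class with a real public parameterless constructor, for contrast.
        Console.WriteLine(DeclaresParameterlessConstructor(typeof(object)));      // True
    }
}
```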

When does the parameterless constructor get called?

There are various things one could investigate about parameterless constructors, but I’m mostly interested in when they get called. The test application is reasonably simple, but contains lots of cases – each writes to the console what it’s about to do, then does something which might call the constructor. Without further ado:

using System;
using System.Runtime.CompilerServices;

class Test
{
    static Oddity staticField;
    Oddity instanceField;
   
    static void Main()
    {
        Report("Declaring local variable");
        Oddity localDeclarationOnly;
        // No variables within the value, so we can use it
        // without initializing anything
        Report("Boxing");
        object o = localDeclarationOnly;
        // Just make sure it's really done it
        Report(o.ToString());
        Report("new Oddity() – set local variable");
        Oddity local = new Oddity();
        Report("Create instance of Test – contains instance variable");
        Test t = new Test();
        Report("new Oddity() – set instance field");
        t.instanceField = new Oddity();
        Report("new Oddity() – set static field");
        staticField = new Oddity();
        Report("new Oddity[10]");
        o = new Oddity[10];
        Report("Passing argument to method");
        MethodWithParameter(local);
        GenericMethod<Oddity>();
        GenericMethod2<Oddity>();
        Report("Activator.CreateInstance(typeof(Oddity))");
        Activator.CreateInstance(typeof(Oddity));
        Report("Activator.CreateInstance<Oddity>()");
        Activator.CreateInstance<Oddity>();
    }
   
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void MethodWithParameter(Oddity oddity)
    {
        // No need to do anything
    }
   
    static void GenericMethod<T>() where T : new()
    {
        Report("default(T) in generic method with new() constraint");
        T t = default(T);
        Report("new T() in generic method with new() constraint");
        t = new T();
    }
   
    static void GenericMethod2<T>() where T : struct
    {
        Report("default(T) in generic method with struct constraint");
        T t = default(T);
        Report("new T() in generic method with struct constraint");
        t = new T();
    }

    static void Report(string text)
    {
        Console.WriteLine(text);
    }
}

And here are the results:

Declaring local variable
Boxing
Oddity
new Oddity() – set local variable
Oddity constructor called
Create instance of Test – contains instance variable
new Oddity() – set instance field
Oddity constructor called
new Oddity() – set static field
Oddity constructor called
new Oddity[10]
Passing argument to method
default(T) in generic method with new() constraint
new T() in generic method with new() constraint
default(T) in generic method with struct constraint
new T() in generic method with struct constraint
Activator.CreateInstance(typeof(Oddity))
Oddity constructor called
Activator.CreateInstance<Oddity>()
Oddity constructor called

So, to split these out:

Operations which do call the constructor

  • new Oddity() – whatever we’re storing the result in. This isn’t much of a surprise. What may surprise you is that it gets called even if you compile Test.cs against the original Oddity.dll (without the custom parameterless constructor) and then just rebuild Oddity.dll.
  • Activator.CreateInstance<T>() and Activator.CreateInstance(Type). I wouldn’t be particularly surprised by this either way.

Operations which don’t call the constructor

  • Just declaring a variable, whether local, static or instance
  • Boxing
  • Creating an array – good job, as this could be a real performance killer
  • Using default(T) in a generic method – this one didn’t surprise me
  • Using new T() in a generic method – this one really did surprise me. Not only is it counterintuitive, but in IL it just calls Activator.CreateInstance<T>(). What’s the difference between this and calling Activator.CreateInstance<Oddity>()? I really don’t understand.

Conclusions

Well, I’m still glad that C# doesn’t let us define our own parameterless constructors for value types, given the behaviour. The main reason for using it – as far as I’ve seen – is to make sure that the “default value” for a type is sensible. Given that it’s possible to get a usable value of the type without the constructor being called, this wouldn’t work anyway. Writing such a constructor would be like making a value type mutable – almost always a bad idea.
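The “default value” problem can also be sketched without IL on a compiler that accepts parameterless struct constructors, which C# 10 and later do. The code below assumes such a compiler; Counter and DefaultBypassDemo are hypothetical names. It shows that default(T) and array creation still hand out a zeroed value without running the constructor.

```csharp
using System;

// Hypothetical struct; requires C# 10 or later to compile.
struct Counter
{
    public int Value;

    // Tries to establish a "sensible" non-zero default.
    public Counter()
    {
        Value = 42;
    }
}

static class DefaultBypassDemo
{
    static void Main()
    {
        Counter viaNew = new Counter();         // runs the constructor
        Counter viaDefault = default(Counter);  // zeroed, constructor skipped
        Counter[] array = new Counter[1];       // elements are zeroed too

        Console.WriteLine(viaNew.Value);      // 42
        Console.WriteLine(viaDefault.Value);  // 0
        Console.WriteLine(array[0].Value);    // 0
    }
}
```

So even with language support, the constructor cannot guarantee that every value of the type has passed through it.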

However, it’s nice to know it’s possible, just on the grounds that learning new things is always a good thing. And at least next time someone asks a similar question, I’ll have somewhere to point them…

11 thoughts on “Value types and parameterless constructors”

  1. I’m with you on the consistency argument. If a struct is going to be a bunch of bytes that start out zeroed, the parameterless constructor would have to be called as soon as you declare “Oddity x;” – that is, without calling “new Oddity()” or “default(Oddity)”, which is bad for consistency and arguably a bigger gotcha.

  2. Wow, that is surprising. I also thought it was a CLI requirement not to define a parameterless constructor. I ran peverify on the resulting DLL and it’s perfectly happy as well.

  3. @Motti: No, I really don’t think so. Where’s the bug? Which part is not working as designed? The fact that from most managed languages you can’t even write a parameterless constructor is enough to show that this blog post is about relatively unexplored territories… there are a couple of surprises, but they shouldn’t be biting people anyway.

  4. Motti: I guess that depends on the perspective.

    From a C++ perspective, declaring a variable also means fairly clearly creating an object or allocating space for a primitive; both on the stack, whereas using ‘new’ confers, uh, “heapness”. The way C# is built on explicitly using ‘new’ the entire time, not using it is inconsistent, and the two rules ‘always use new or default(T) to call a constructor’ and ‘declaring a struct or an array always allocates space’ collide.

    If you’d want C# to desperately be C++ or at least share the same syntax, I would agree that it looks like a bug report. But the distinction that’s useful and relevant in C++ would be far less so in C#, and the payoff would be dramatically smaller. I think it’s too high of a price to pay to get parameterless constructors.

  5. As I pointed out in your non-nullable generic type wrapper this “hack” could be used to make it semi-perfect. In my example all I did was make the empty constructor throw an exception, mark it with obsolete (error=true) for design-time support, and finally add a constructor with a single parameter. The obsolete thing is just to make this behavior documented. However, I guess it might be possible to delete the default constructor altogether.

  6. Regarding the different behaviour of new T(), the IL for a non-generic new Oddity() puts in a special “valuetype” prefix which must somehow make CreateInstance behave differently. Whereas the generic code doesn’t know if T is a valuetype or not. And there’s no way to tell C# that T is a struct AND has a constructor that accepts no arguments, because it won’t let you specify both constraints.

  7. And structs cannot have instance field initializers either…

    I guess it could be problematic in some very rare cases where you’d want to keep the ‘stackability’ of the struct and the ability to initialize some fields during construction,
    and could end up implementing a constructor with a dummy argument… or consider making it a reference type.
