jordan.terrell
Just trying to make sense of things...

Maybe – The Uh, Stuff That Dreams Are Made Of

Sunday, 22 May 2011 17:31 by jordan.terrell

UPDATE 12/10/2011: I’ve continued development of the Maybe<T> structure and accompanying operators.  As a result, some of the information in this post is no longer accurate.  That said, the information is still useful as it conveys many of the concepts and practical applications of the Maybe monad.  A future post will highlight the changes in my implementation and some of the lessons I’ve learned since writing this post.



Explorers (1985) film - one of my favorite childhood movies.

As I said, I’ve been having a blast learning more about programming in the functional style.  In fact, over the past 18 months I’ve rewired my brain to think functionally-oriented first and object-oriented second – not something I expected to happen.

Something that I came across was the concept of Monads.  Monads are everywhere in .NET now, and they are a foundational concept in many of the really useful libraries and APIs that are coming out of Microsoft (LINQ, Task Parallel Library (TPL), Reactive Extensions (Rx), etc.).  I strongly recommend you start learning about Monads. That said, I’m not going to even try to explain Monads today.  I’m going to talk very practically about the Maybe monad, why you would use it, and my implementation of this pattern.

IEnumerable<T> - It’s Just A Function

How do you feel about the IEnumerable<T> interface?  Well, you may appreciate the fact that LINQ to Objects heavily relies on this interface and enables you to do some really cool things in just a few lines of code.  Many a programmer has taken a portion of a program that has numerous nested foreach loops with nested if statements and turned them into a really concise and readable LINQ statement. Thank you functional programming.

If you think about it, IEnumerable<T> is really just a function.  Yes, I know, it’s an interface.  But please suspend reality for a few moments and think of it like a function.  What does a function do?  It takes some input, and returns some output.  It represents a computation – the ability to compute some value in the future.  Many programming languages allow functions to return at most a single value.  What if we want to return more than one value?  We create data structures that allow us to return multiple values as a single atomic value, for example, an array.  However, arrays require that you compute all the values in the array before you can return it from a function.  What if you wanted a function that calculates all prime numbers?  If we used an array, the function would never return because we would need an array of infinite size.  Fortunately, we have IEnumerable<T> and iterators (which enable deferred execution) to help us with this.  So we can write a method, IEnumerable<int> GetPrimes(), and think of the object it returns (which implements IEnumerable<int>) as a function that can compute zero or more prime numbers at some point in the future, when we enumerate over it (i.e. “execute” the function).
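To make the idea concrete, here is a minimal sketch of such a GetPrimes() method written as a C# iterator.  The trial-division test and the PrimesDemo class name are my own illustrative choices, and the sketch ignores integer overflow for very large candidates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrimesDemo
{
    // Lazily yields primes; nothing in this body runs until the
    // returned sequence is enumerated.
    public static IEnumerable<int> GetPrimes()
    {
        for (int candidate = 2; ; candidate++)
        {
            bool isPrime = true;
            for (int divisor = 2; divisor * divisor <= candidate; divisor++)
            {
                if (candidate % divisor == 0) { isPrime = false; break; }
            }
            if (isPrime) yield return candidate;
        }
    }

    public static void Main()
    {
        IEnumerable<int> primes = GetPrimes();   // no work has been done yet

        // Take(5) stops the otherwise endless computation after five values.
        foreach (int prime in primes.Take(5))
        {
            Console.WriteLine(prime);            // 2, 3, 5, 7, 11
        }
    }
}
```

An array-returning version of this method could never finish, while the iterator only computes as many primes as the caller asks for.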

Part of that last sentence is what I want you to focus on - “compute zero or more”.  Looking at IEnumerable<T> generically, we could say that it represents the ability to compute zero or more values regardless of what type T is.  We can represent computations that potentially need to return more than one value.  That in and of itself is useful.  However, it becomes extremely powerful when you combine that interface with LINQ’s Standard Query Operators (e.g. Select, Where, GroupBy, Aggregate, etc).  Now we are able to express the intent of a fairly complex piece of logic in just a few lines of code.  We are expressing what we want the program to do and not the monotonous detail of how we want to do it. We are programming declaratively and with composition.  It is more maintainable because it is more readable and thus easier to reason about.  Plus, because we aren’t spelling out all the details, framework developers and compiler designers are able to make some intelligent decisions about how to implement the programmer’s intent, for example, running parts of the program in parallel (e.g. PLINQ).  No doubt you will agree there are many benefits to programming this way.

Zero or One

But what if we don’t want to compute zero or more values?  What if we want to compute zero or one value?  A classic example is reading from a dictionary.  Dictionary<TKey, TValue> has a TryGetValue() method that attempts to retrieve a value from the dictionary by key, and, if the value is not in the dictionary, it returns false.  Unfortunately, because you need to return two bits of information (1. was the value in the dictionary, and 2. if so, what is the value), TryGetValue() uses an awkward output argument to return the value if it was in the dictionary.  Put in different words, we need a function (TryGetValue) to compute (look up) zero or one value from the dictionary based on some key.
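For reference, this is roughly what the output-argument pattern looks like with the framework's own Dictionary<TKey, TValue>.TryGetValue.  The OutParameterDemo class and the sample data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class OutParameterDemo
{
    public static string Describe(Dictionary<string, int> ages, string name)
    {
        int age; // declared up front only so the call has somewhere to put the result

        // The bool return value says "found or not"; the out argument carries the value.
        if (ages.TryGetValue(name, out age))
        {
            return name + " is " + age;
        }
        return name + " was not found";
    }

    public static void Main()
    {
        var ages = new Dictionary<string, int> { { "Ada", 36 } };
        Console.WriteLine(Describe(ages, "Ada"));   // Ada is 36
        Console.WriteLine(Describe(ages, "Bob"));   // Bob was not found
    }
}
```

The two pieces of information travel through two different channels, which is what makes the call hard to compose with other expressions.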

Another example might be retrieving data from a database.  Perhaps you have some function that returns a single CustomerDto from the database when you give the function a customer Id.  Well, we know we have to handle the scenario when there is no record in the database for the Id provided.  Often we just return null and that may be a perfectly logical way to represent zero customers with that customer Id.  However, we now have a “special” reference to a CustomerDto object – the null reference.  How many times has the dreaded NullReferenceException visited you?  To protect ourselves against that, we sprinkle in “null checks” all over the place.  It would be nice if we didn’t have to do that.  Nullable<T> is nice because it communicates a little better that “this value could be null” by generally requiring you to use the Value property (and hopefully you check the HasValue property before you use it).  Unfortunately, by design Nullable<T> does not work with reference types. That is because it is trying to model a null value (which reference types already have) and not the absence of a value.  Subtle difference, I know, but go with it for now.

In both of the examples above, we are conceptually trying to represent some computation (a function) that can compute (return) zero or one value.  IEnumerable<T> gives us the ability to represent zero or more values, but that is not what we want.  We need some type that represents zero or one value.

Introducing Maybe<T>

That type is Maybe<T>.  It works with all .NET types; value and reference types.  Here are some simple examples:

    Maybe<int> number = Maybe.Value(42);
    Maybe<string> text = Maybe.Value("Hello, World!");

    if (number.HasValue) { Console.WriteLine(number.Value); }
    if (text.HasValue) { Console.WriteLine(text.Value); }

    number = Maybe<int>.NoValue;
    text = Maybe<string>.NoValue;

"The uh, stuff that dreams are made of." - Quote from Maltese Falcon (never seen it) that was referenced in Explorers.

First, we see a simple integer value (42) placed into a Maybe<T> and assigned to the variable, “number”.  Next, we assigned another constant to a variable, “text”, but this time with a string, demonstrating that this works with both value and reference types.  Then, we write the contents of both variables out to the console, but only if they have a value (which we know they do in this case).  Finally, we re-assign the two variables to not have a value (NoValue), just to demonstrate how to compute the absence of a value.

At this point you might be thinking, “Big deal. I can just use null reference types and Nullable<T> to do the same thing.”  You would probably be right. However, many developers who have spent time with the Maybe monad pattern have found it to be a much more elegant solution.  For one, in my implementation, you will never get a null reference exception using Maybe<T> - it is implemented as a value type (struct) and as such cannot be null.  Second, the way to represent the absence of a value (zero) for any type, regardless of whether it is a value or reference type, is the same when you are using Maybe<T>.  Consistency is important for code readability and maintainability.  Finally, having a variable of type Maybe<T> communicates that it is possible for it to not have a value (‘no T for you…maybe’).
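To illustrate why a struct-based Maybe<T> cannot itself be null, here is a deliberately tiny, eager-only sketch.  It is a hypothetical stand-in written for this explanation and is not the iSynaptic.Commons implementation, which has many more features.

```csharp
using System;

// Hypothetical, eager-only sketch; not the iSynaptic.Commons implementation.
public struct Maybe<T>
{
    private readonly T _value;
    private readonly bool _hasValue;

    public Maybe(T value)
    {
        _value = value;
        _hasValue = true;
    }

    // default(Maybe<T>) has _hasValue == false, so it doubles as "no value".
    public static Maybe<T> NoValue { get { return default(Maybe<T>); } }

    public bool HasValue { get { return _hasValue; } }

    public T Value
    {
        get
        {
            if (!_hasValue) throw new InvalidOperationException("No value.");
            return _value;
        }
    }
}

public static class Maybe
{
    // Non-generic helper so the compiler can infer T from the argument.
    public static Maybe<T> Value<T>(T value) { return new Maybe<T>(value); }
}

public static class MaybeStructDemo
{
    public static void Main()
    {
        Maybe<string> text = Maybe.Value("Hello, World!");
        Maybe<string> nothing = Maybe<string>.NoValue;
        Maybe<int> unassigned = default(Maybe<int>); // a struct variable is never null

        Console.WriteLine(text.HasValue);        // True
        Console.WriteLine(nothing.HasValue);     // False
        Console.WriteLine(unassigned.HasValue);  // False
    }
}
```

Because the zero-initialized struct already means "no value", the absence of a value looks the same for int and for string.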

Having a variable that is a reference type, for example a string, doesn’t tell you whether or not you should expect it to contain the null reference.  If you are being a defensive programmer, you might check to see if it is null before using it - just in case.  Also, when you have a method that returns a reference type, the return type doesn’t tell you whether or not the method can or will return a null.  By using Maybe<T>, we can use the type system to clearly indicate that the method might not be able to compute (return) a value.  A great example of this is one I mentioned earlier – using dictionaries:

    var dictionary = new Dictionary<string, int>();
    Maybe<int> number = dictionary.TryGetValue("Foo");


I created an extension method, TryGetValue, that extends IDictionary<TKey, TValue> and takes only one argument - the key to use to look up the value.  We know this lookup could fail.  The item we are looking for may not be in the dictionary, and it would be better if we didn’t throw an exception when we can’t look up the value.  That is why the TryGetValue extension method I created returns a Maybe<TValue> - if the value cannot be found in the dictionary it just returns Maybe<TValue>.NoValue.  The nice thing about using Maybe<T> in an extension method is that it works with any type. It doesn’t matter if TValue is a value or reference type.  You don’t have to create an extension method for value types, and another one for reference types.
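An extension method along these lines might look like the sketch below.  The Maybe<T> struct here is a hypothetical minimal stand-in so the example compiles on its own, and the body of TryGetValue is my guess at the obvious approach rather than the library's actual code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal Maybe<T>, included so this sketch compiles on its own.
public struct Maybe<T>
{
    private readonly T _value;
    private readonly bool _hasValue;
    public Maybe(T value) { _value = value; _hasValue = true; }
    public static Maybe<T> NoValue { get { return default(Maybe<T>); } }
    public bool HasValue { get { return _hasValue; } }
    public T Value
    {
        get
        {
            if (!_hasValue) throw new InvalidOperationException("No value.");
            return _value;
        }
    }
}

public static class DictionaryExtensions
{
    // One-argument overload that wraps the framework's out-parameter version.
    public static Maybe<TValue> TryGetValue<TKey, TValue>(
        this IDictionary<TKey, TValue> dictionary, TKey key)
    {
        TValue value;
        return dictionary.TryGetValue(key, out value)
            ? new Maybe<TValue>(value)
            : Maybe<TValue>.NoValue;
    }
}

public static class DictionaryDemo
{
    public static void Main()
    {
        var numbers = new Dictionary<string, int> { { "Foo", 42 } };

        Maybe<int> found = numbers.TryGetValue("Foo");
        Maybe<int> missing = numbers.TryGetValue("Bar");

        Console.WriteLine(found.HasValue);    // True
        Console.WriteLine(found.Value);       // 42
        Console.WriteLine(missing.HasValue);  // False
    }
}
```

The single generic method covers both value-type and reference-type dictionaries because nothing in it depends on null.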

Did you notice we don’t have the awkward output arguments in the last example?  Maybe<T> is almost worth it just for that.  However, if you are still on the fence, I just might be able to convince you that Maybe<T> is “the uh, stuff that dreams are made of”.  Well, maybe just that it is really cool and useful.

Maybe Is Lazy and Doesn’t Forget

One nuance when using IEnumerable<T> is that the values it returns often are not computed until you enumerate over it, which is known as deferred execution.  Maybe<T> works the same way.  It often does not compute its value or the absence of a value until you try to use the HasValue, Value, or Exception (more on this in a moment) properties. A symptom of deferred execution is when you write some code with IEnumerable<T> or Maybe<T> expecting some side effect to occur and nothing happens.  Most of the Maybe<T> operators are lazy, and if you want to force evaluation you can call Run() or RunAsync() (operators are discussed below).

Now, so far you’ve only seen examples that do not use deferred execution.  Here is an (albeit contrived) example of deferred execution using Maybe<T>:

    Maybe<int> number = Maybe.Value(() => 42);

 

The Value method lets you pass in a Func<T> to calculate its value. In this example it only returns 42.  However, it could return something that is expensive to compute.  The function you provide will not get executed until something uses the HasValue, Value, or Exception properties – just like a deferred IEnumerable<T> does not execute until you call MoveNext() on the enumerator.

However, unlike IEnumerable<T>, once you use one of those properties your function’s results are cached and the function is never called again.  The Maybe<T> remembers its value.  It behaves more like Lazy<T> in this case.  In the vast majority of scenarios where you would use Maybe<T> this is the behavior that you want.  If you want to re-compute the value, build up the Maybe<T> again.
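The difference between "deferred" and "deferred and cached" can be seen with types that ship in the framework.  The sketch below uses Lazy<T> as a stand-in for the caching behavior described above and an iterator for the non-caching behavior.  The CachingDemo class and its call counter are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CachingDemo
{
    public static int Calls;

    private static int Expensive()
    {
        Calls++;          // count how often the computation really runs
        return 42;
    }

    private static IEnumerable<int> Sequence()
    {
        yield return Expensive();
    }

    public static void Main()
    {
        // Deferred and cached: the factory runs on first use of Value, then never again.
        var lazy = new Lazy<int>(() => Expensive());
        Console.WriteLine(Calls);   // 0, nothing has run yet
        int a = lazy.Value;
        int b = lazy.Value;
        Console.WriteLine(Calls);   // 1

        // Deferred but not cached: every enumeration re-runs the iterator body.
        IEnumerable<int> sequence = Sequence();
        int c = sequence.First();
        int d = sequence.First();
        Console.WriteLine(Calls);   // 3
    }
}
```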

Exceptions ≈ NoValue

Sometimes when you are trying to compute some value bad stuff happens.  Exceptions are thrown.  One way to look at an exception in this scenario is that it is roughly equal to NoValue.  Example: You tried to get a row count from a database table, but the database is down and an exception is thrown.  You were not able to compute the row count because the database is not available.  Maybe<T> has an Exception property that allows you to capture the exception that prevented you from computing the row count. For example:

    Maybe<int> constant = new Maybe<int>(new InvalidOperationException());
    Maybe<int> lazy = Maybe.Value<int>(() => { throw new InvalidOperationException(); });

 

In both cases, the HasValue property is equal to false and the Exception property contains the exception.  In the interest of full disclosure, this feature is not part of the traditional implementation of the Maybe monad.  It is actually part of the Error monad.  That said, these monadic patterns are so complementary that it is useful to combine them into a single implementation.
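One way to picture the combination of "no value" and "captured exception" is the sketch below.  It is a hypothetical, eager version: the Try factory method is a name I made up for this illustration, whereas the real implementation defers the computation as described earlier.

```csharp
using System;

// Hypothetical sketch of a Maybe<T> that also records why there is no value.
public struct Maybe<T>
{
    private readonly T _value;
    private readonly bool _hasValue;
    private readonly Exception _exception;

    private Maybe(T value, bool hasValue, Exception exception)
    {
        _value = value;
        _hasValue = hasValue;
        _exception = exception;
    }

    public bool HasValue { get { return _hasValue; } }
    public Exception Exception { get { return _exception; } }

    // Assumed helper (not from the library): runs the computation immediately
    // and stores any exception instead of letting it escape.
    public static Maybe<T> Try(Func<T> compute)
    {
        try
        {
            return new Maybe<T>(compute(), true, null);
        }
        catch (Exception ex)
        {
            return new Maybe<T>(default(T), false, ex);
        }
    }
}

public static class ExceptionDemo
{
    public static void Main()
    {
        Maybe<int> ok = Maybe<int>.Try(() => int.Parse("42"));
        Maybe<int> failed = Maybe<int>.Try(() => int.Parse("forty-two"));

        Console.WriteLine(ok.HasValue);                          // True
        Console.WriteLine(failed.HasValue);                      // False
        Console.WriteLine(failed.Exception is FormatException);  // True
    }
}
```

The caller sees "no value" either way and can inspect the Exception property only if it cares about the reason.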

Maybe<T> Entry Points

There are a few ways to create instances of Maybe<T>.  First, you can use constructors that take in a T, a Func<T>, or an Exception.  You can also use the Maybe.Value method overloads that take in a T or a Func<T> – this often allows the compiler to infer what type T is.  There are a few other ways to create a Maybe<T>.

Maybe.NotNull – This creates a Maybe<T> that treats null values as NoValue.  This is very helpful later when you are using other operators.  It has support for reference types and Nullable<T>.  NotNull can be used both as an entry point and as an operator (explained below).

Maybe.Using – This creates a Maybe<T> that can dispose of a resource after a value has been computed using the resource.  A classic example would be executing some scalar query against a database.  You want to ensure that the connection to the database is disposed after the scalar query is executed. Using can be used both as an entry point and as an operator.  Here is an example of using both NotNull and Using together:

    Maybe<int> rowCount = Maybe
        .NotNull(connectionString)
        .Using(cns => ConnectToDb(cns), conn => conn.GetRowCount());
 

Maybe<T> Operators

Remember how LINQ Standard Query Operators (e.g. Select, Where, GroupBy, etc) make the IEnumerable<T> interface so much more powerful?  Well, the same is true of Maybe<T>.  I’ve developed a number of really powerful Standard Maybe Operators that make Maybe<T> just as powerful as IEnumerable<T>.  Keep in mind, all of these operators understand that sometimes there is no value to operate on.  Just like calling Select and Where against an empty IEnumerable<T> results in another empty IEnumerable<T>, calling Select and Where against a Maybe<T> that doesn’t have a value results in another Maybe<T> that doesn’t have a value.
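The way "no value" flows through a chain of operators can be sketched with two small extension methods.  Everything below is a hypothetical, eager simplification written for this explanation (the real operators are lazy and also deal with exceptions).

```csharp
using System;

// Hypothetical minimal Maybe<T>, included so this sketch compiles on its own.
public struct Maybe<T>
{
    private readonly T _value;
    private readonly bool _hasValue;
    public Maybe(T value) { _value = value; _hasValue = true; }
    public static Maybe<T> NoValue { get { return default(Maybe<T>); } }
    public bool HasValue { get { return _hasValue; } }
    public T Value
    {
        get
        {
            if (!_hasValue) throw new InvalidOperationException("No value.");
            return _value;
        }
    }
}

public static class MaybeOperators
{
    // Transforms the value if there is one; otherwise passes "no value" along.
    public static Maybe<TResult> Select<T, TResult>(
        this Maybe<T> source, Func<T, TResult> selector)
    {
        return source.HasValue
            ? new Maybe<TResult>(selector(source.Value))
            : Maybe<TResult>.NoValue;
    }

    // Keeps the value only if it satisfies the predicate.
    public static Maybe<T> Where<T>(this Maybe<T> source, Func<T, bool> predicate)
    {
        return source.HasValue && predicate(source.Value) ? source : Maybe<T>.NoValue;
    }
}

public static class OperatorDemo
{
    public static void Main()
    {
        Maybe<int> length = new Maybe<string>("Hello, World!")
            .Where(s => s.StartsWith("H"))
            .Select(s => s.Length);
        Console.WriteLine(length.Value);      // 13

        Maybe<int> empty = Maybe<string>.NoValue
            .Where(s => s.StartsWith("H"))    // lambda is never invoked
            .Select(s => s.Length);           // lambda is never invoked
        Console.WriteLine(empty.HasValue);    // False
    }
}
```

Neither lambda has to check for a missing value, because the operators do that check once on behalf of every caller.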

I won’t be able to fully explain each operator in this post. In fact, I’m going to intentionally gloss over them quite a bit, but I have unit tests covering them so you can look there for examples.  Additionally, the examples below are written to make them easier to understand and not always for conciseness. I won’t use the var keyword, and sometimes I will create temporary variables to help comprehension.  I may end up dedicating a few blog posts to expanding on how each of these operators behaves in detail, but for now this should give you an idea of what is possible.  Here is the current list of the Standard Maybe Operators:

NotNull, Using – These were explained above.  They both can be used as an operator and an entry point method.

Select, Coalesce – Select should be fairly obvious if you’ve done any LINQ development.  It takes the value, if there is one, and selects another value from it.  Coalesce combines Select with NotNull.  Example:

    Maybe<Category> parentCategory = Maybe.Value(product)
        .Coalesce(x => x.Category)
        .Select(x => x.Parent);

 

Where, Unless – Where will also be familiar if you’ve done LINQ development before.  It takes a predicate function (Func<T, bool>) as an argument that is called with the value of the Maybe<T>, if there is a value, and if the function returns true it returns the original value.  If the function returns false, it returns NoValue.  Unless works roughly the same, except it flips the result of the predicate function you provide: if the function returns false it returns the original value, and if the function returns true it returns NoValue.  Example:

    Maybe<string> message = Maybe.Value("Hello, World!")
        .Where(x => x.StartsWith("H"))
        .Unless(x => x.EndsWith("?"));

 

Or, Join – Or and Join allow you to combine two Maybe<T>s.  Or will return the value of the first Maybe<T>, unless it has no value, and then it will return the value of the second.  Join returns a Maybe that combines the two Maybes – either as a Maybe<Tuple<T, U>> or you can give it a function that takes the two values and combines them into a custom type.  Join returns NoValue if either of the two provided Maybes does not have a value. Example:

    Maybe<CustomerDto> cachedDto = dictionary.TryGetValue(id);
    Maybe<CustomerDto> retrievedDto = Maybe.Using(() => ConnectToDb(), conn => conn.GetCustomer(id));

    Maybe<StockQuote> latestQuote = Maybe.Value(() => stockService.GetQuote("LNKD"));

    // Prefer the cached customer, fall back to the database, then pair it with the quote.
    Maybe<CustomerQuote> customerQuote = cachedDto.Or(retrievedDto)
        .Join(latestQuote, (dto, quote) => new CustomerQuote(dto.Name, quote.Price));

    if (customerQuote.HasValue)
    {
        CustomerQuote q = customerQuote.Value;
        Console.WriteLine("Quote for LinkedIn to {0} is: {1}", q.Name, q.Price);
    }
    else
    {
        Console.WriteLine("Unable to retrieve quote.");
    }
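For readers who want to see the rules for Or and Join spelled out, here is a hypothetical, eager sketch of the two operators over a minimal Maybe<T>.  It only models the value / no-value cases and uses made-up sample data.

```csharp
using System;

// Hypothetical minimal Maybe<T>, included so this sketch compiles on its own.
public struct Maybe<T>
{
    private readonly T _value;
    private readonly bool _hasValue;
    public Maybe(T value) { _value = value; _hasValue = true; }
    public static Maybe<T> NoValue { get { return default(Maybe<T>); } }
    public bool HasValue { get { return _hasValue; } }
    public T Value
    {
        get
        {
            if (!_hasValue) throw new InvalidOperationException("No value.");
            return _value;
        }
    }
}

public static class MaybeCombinators
{
    // The first value wins; fall back to the second when the first is empty.
    public static Maybe<T> Or<T>(this Maybe<T> first, Maybe<T> second)
    {
        return first.HasValue ? first : second;
    }

    // Both inputs must have a value, otherwise the result is empty.
    public static Maybe<TResult> Join<T, U, TResult>(
        this Maybe<T> left, Maybe<U> right, Func<T, U, TResult> combine)
    {
        return left.HasValue && right.HasValue
            ? new Maybe<TResult>(combine(left.Value, right.Value))
            : Maybe<TResult>.NoValue;
    }
}

public static class CombinatorDemo
{
    public static void Main()
    {
        Maybe<string> cached = Maybe<string>.NoValue;
        Maybe<string> fetched = new Maybe<string>("Ada");
        Maybe<int> price = new Maybe<int>(94);

        Maybe<string> line = cached.Or(fetched)
            .Join(price, (name, p) => name + ": " + p);
        Console.WriteLine(line.Value);        // Ada: 94

        Maybe<string> none = cached.Join(price, (name, p) => name + ": " + p);
        Console.WriteLine(none.HasValue);     // False
    }
}
```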

 

With – Executes an action against a selected value when evaluated. Example:

    Maybe<string> text = Maybe.Value("Hello, World!")
        .With(x => x.Length, l => Console.WriteLine("Length of string: {0}", l));

 

When – Executes an action or replaces a value, on evaluation, when the predicate is true. Example:

    Maybe<int> nearestOddNumberRoundingUp = Maybe.Value(() => Console.ReadLine())
        .Select(x => int.Parse(x))
        .When(x => x % 2 == 0, x => x + 1);

 

OnValue, OnNoValue, OnException – Executes an action or replaces a value, on evaluation, when the method’s condition is true. Example:

    Maybe<int> fortyTwo = Maybe<int>.NoValue
        .OnNoValue(() => { throw new InvalidOperationException(); })
        .OnException(ex => Maybe.Value(42))
        .OnValue(x => Console.WriteLine(x));

 

ThrowOn, ThrowOnNoValue, ThrowOnException – Generally, the Maybe<T> operators prevent exceptions that are thrown from escaping the Maybe<T>, and instead treat the exception roughly like NoValue.  ThrowOn, ThrowOnNoValue, and ThrowOnException allow exceptions to immediately escape and bubble up to be handled by standard .NET exception handling. Example:

    Maybe<int> neverHaveValue = Maybe.Value("Hello, World!")
        .Select(x => int.Parse(x))
        .ThrowOnException();

 

Run, RunAsync – Since most operators don’t execute until the Maybe<T> is evaluated, Run and RunAsync force evaluation while still returning a Maybe<T> for further processing.  Both allow you to pass in an Action<T> to immediately execute some work on the value (if there is one).  RunAsync makes use of the Task Parallel Library to evaluate the Maybe<T> by default on the thread pool.  It optionally takes in a CancellationToken, TaskCreationOptions, or TaskScheduler to provide additional control over how it runs asynchronously.

    Maybe<int> writeToConsoleImmediately = Maybe.Value(() => Console.ReadLine())
        .Select(x => int.Parse(x))
        .OnValue(x => Console.WriteLine("You entered number: {0}.", x))
        .OnException(ex => Console.WriteLine("You did not enter a number."))
        .Run();

 

Synchronize – Although Maybe<T> only evaluates once, just like Lazy<T>, it is not thread-safe by default.  This is by design because there is overhead in thread synchronization.  If two or more threads try to evaluate the Maybe<T> at the same time, evaluation will execute multiple times.  To prevent this, use the Synchronize method. This is useful when evaluation is very expensive; for example, querying a database or doing some CPU intensive work.  Example:

    Maybe<BigInteger> largePrime = Maybe.Value(() => GetLargePrimeNumber())
        .Synchronize();
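The general technique behind "evaluate at most once, even under contention" can be sketched with a lock around a cached result.  This is a generic illustration and an assumption on my part about the approach; the Synchronize helper below wraps a plain Func<T> and is not the library's operator.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SynchronizeDemo
{
    public static int Evaluations;

    // Wraps a factory so that concurrent callers evaluate it at most once.
    public static Func<T> Synchronize<T>(Func<T> factory)
    {
        object gate = new object();
        bool evaluated = false;
        T cached = default(T);

        return () =>
        {
            lock (gate) // only one thread at a time may check or fill the cache
            {
                if (!evaluated)
                {
                    cached = factory();
                    evaluated = true;
                }
                return cached;
            }
        };
    }

    public static void Main()
    {
        Func<int> once = Synchronize(() =>
        {
            Interlocked.Increment(ref Evaluations);
            Thread.Sleep(50); // stand-in for an expensive computation
            return 42;
        });

        // Eight callers race to read the value.
        Parallel.For(0, 8, i => { once(); });

        Console.WriteLine(Evaluations); // 1
    }
}
```

The lock is taken on every read, which is the overhead mentioned above and the reason synchronization is opt-in.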

 

Cast, OfType – Cast and OfType allow you to cast the value (if there is one) to another type.  OfType tries to safely cast the value; if it can’t, it returns NoValue.  Cast tries to directly cast the value; if it fails, it results in a Maybe<T> that contains a cast exception. Example:

    object obj = "Hello, World!";
    Maybe<string> text = Maybe.Value(obj)
        .Cast<string>();
 

Shedding Maybe<T>

At some point you are going to need to get at the underlying value.  There are a few ways of doing this.  All of the methods below evaluate the Maybe<T> immediately.

ToNullable – When you are dealing with Maybe<T> where T is a value type, ToNullable will convert the Maybe<T> into a Nullable<T>. If there is a value, it simply returns the value typed as Nullable<T>. If there is no value, it returns the null value typed as Nullable<T>.  Example:

    Nullable<int> value = Maybe<int>.NoValue
        .ToNullable();

 

Assign – Assign will output the value (if there is one) through a reference parameter. This is useful if, after calling Assign, you want to execute other Maybe<T> operators.

    int value = 0;
    Maybe<int> number = Maybe.Value(() => Console.ReadLine())
        .Select(x => int.Parse(x))
        .OnException(42)
        .Assign(ref value);

 

Return – If the Maybe<T> has a value, Return simply returns the value.  If there is no value, Return returns the type’s default value (e.g. 0 for an integer).  Alternatively, you can provide a default value it should return if the Maybe<T> contains no value.  If the Maybe<T> contains an exception, the exception is thrown.

    int value = Maybe.Value("27")
        .Select(x => int.Parse(x))
        .Return(42);

 

HasValue / Value – The most obvious way to get the value out of the Maybe<T> is to use the Value property.  If there is no value, the Value property throws an InvalidOperationException.  If the Maybe<T> already contains an exception, it re-throws the exception.  Only if there is a value does the Value property succeed.  I recommend you check the HasValue property before using the Value property.
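The rules for ToNullable and Return described in this section can be summarized in a short, hypothetical sketch.  The struct below is a simplified, eager stand-in that only stores a value, nothing, or an exception; the extension methods follow the behavior described above but are not the library's code.

```csharp
using System;

// Hypothetical, eager stand-in: holds a value, nothing, or an exception.
public struct Maybe<T>
{
    private readonly T _value;
    private readonly bool _hasValue;
    private readonly Exception _exception;

    public Maybe(T value) { _value = value; _hasValue = true; _exception = null; }
    public Maybe(Exception exception) { _value = default(T); _hasValue = false; _exception = exception; }

    public static Maybe<T> NoValue { get { return default(Maybe<T>); } }
    public bool HasValue { get { return _hasValue; } }
    public Exception Exception { get { return _exception; } }
    public T Value
    {
        get
        {
            if (_exception != null) throw _exception;
            if (!_hasValue) throw new InvalidOperationException("No value.");
            return _value;
        }
    }
}

public static class MaybeExits
{
    // Only makes sense for value types, hence the struct constraint.
    public static T? ToNullable<T>(this Maybe<T> source) where T : struct
    {
        return source.HasValue ? source.Value : (T?)null;
    }

    // Value if present, fallback if absent, throw if an exception was captured.
    public static T Return<T>(this Maybe<T> source, T fallback)
    {
        if (source.Exception != null) throw source.Exception;
        return source.HasValue ? source.Value : fallback;
    }

    public static T Return<T>(this Maybe<T> source)
    {
        return source.Return(default(T));
    }
}

public static class ExitDemo
{
    public static void Main()
    {
        Console.WriteLine(new Maybe<int>(27).Return(42));                  // 27
        Console.WriteLine(Maybe<int>.NoValue.Return(42));                  // 42
        Console.WriteLine(Maybe<int>.NoValue.Return());                    // 0
        Console.WriteLine(Maybe<int>.NoValue.ToNullable().HasValue);       // False

        try
        {
            new Maybe<int>(new InvalidOperationException("db down")).Return(42);
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);                                 // db down
        }
    }
}
```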

Call To Action

Many of the methods that I’ve discussed have multiple overloads to handle common use cases.  I’ve really been working hard on my Maybe<T> implementation and I think I have something really useful.  Some of my co-workers have made use of it, and it has been proving very useful in making us more productive and precise in our development efforts.

I want to know what you think of it.  Is it useful?  How could you use it?  How would you improve the implementation?  If you want to help answer those questions, download the iSynaptic.Commons library off of NuGet.  Feel free to pull down the source from Github.  You’ll notice that I’ve used Maybe<T> within the library to implement other features, so with that and the tests, there should be some decent examples.  I’m eager to know what you think.  Comment on this post if you have some feedback.

In the near future I hope to post some more on this subject, specifically more detail on all the different Maybe<T> operators.

Comments (2) -

May 23. 2011 10:38

Dilip

Very nice! I do remember Wes Dyer coming up with something similar (Maybe monad) in C#. It was a blog post from a couple of years ago that I can't seem to find right now.

Dilip

May 23. 2011 19:09

Jordan

Yes, I remember that article: blogs.msdn.com/.../the-marvels-of-monads.aspx

It was one of the resources that I used when I was learning about monads.  He did create a basic implementation of the Maybe monad, but I've really tried to make a production quality one (not to say that Wes couldn't write one, he just didn't in that post).  Plus, the operators are what really starts to make the Maybe monad shine (IMHO).

Jordan
