jordan.terrell
Just trying to make sense of things...

Library Stability and Semantic Versioning

Wednesday, 1 June 2011 08:54 by jordan.terrell

One of the challenges of incubating a new open source software project is figuring out when to make it public.  If you look back at previous posts, you’ll see that I struggled with that a bit.  It took well over a year after first talking about my Commons library to even make the source available.  The availability of NuGet has made it supremely easy to publish bits to the world.  Almost too easy.

So, I’ve been giving some thought to how I want to manage releases of the Commons library.  As with most fledgling open source projects, the first steps are unstable and unpredictable.  This is often indicated by a “0.x.x” version number, stating that this project is not stable.  This is the case with the Commons library.  That said, it is “fairly” stable.  Much of the code contained in the library has been there in its current form for some time.  The areas that have some measure of flux are the implementations of Maybe<T> and Exodata.

Maybe<T> has in the last few days gone through some breaking changes, primarily because of feedback from Brian Beckman himself.  It’s been an absolute thrill to grab his attention and collaborate on this implementation.  After writing such a lengthy post, I had hoped that Maybe<T> would stay mostly static, with little tweaks and bug fixes, but Brian suggested some changes that have dramatically improved the implementation of Maybe<T>.  I anticipate that there will be some more changes in this area as I continue to collaborate with him.  I will be putting a post together in the near future outlining these changes and details of the process we went through to get there.  That said, the code is available if you’d like to inspect the changes to date, and I will probably be releasing a NuGet package soon to reflect these changes.

Exodata has been static for a little while, but I’ve got some ideas to experiment with before I slap a v1.0 label on the bits.  I haven’t yet blogged about what Exodata is or how to use it, so as expected there hasn’t been feedback driving change in this area.  Expect this to change – I’m already putting a post together to talk about Exodata.

Library stability, or rather instability, can often scare people away from using open source bits when they are in their infancy.  I will do my best to communicate breaking changes when they are coming during the pre-v1.0 time period.  However, once I hit the v1.0 mark, the rules of Semantic Versioning will be adhered to.  In fact, they are being adhered to now, seeing that they allow for breaking changes prior to 1.0.

I want this library to be something immensely useful for developers who appreciate a functional approach to developing software.  I too am a consumer of this library and will expect stability from it in the future.  I will hold the Commons library to the same standards as we would expect from any other library.

Maybe – The Uh, Stuff That Dreams Are Made Of

Sunday, 22 May 2011 17:31 by jordan.terrell

UPDATE 12/10/2011: I’ve continued development of the Maybe<T> structure and accompanying operators.  As a result, some of the information in this post is no longer accurate.  That said, the information is still useful as it conveys many of the concepts and practical applications of the Maybe monad.  A future post will highlight the changes in my implementation and some of the lessons I’ve learned since writing this post.



Explorers (1985) film - one of my favorite childhood movies.

As I said, I’ve been having a blast learning more about programming in the functional style.  In fact, over the past 18 months I’ve rewired my brain to think functionally-oriented first and object-oriented second – not something I expected to happen.

Something that I came across was the concept of Monads.  Monads are everywhere in .NET now and they are a foundational concept in many of the really useful libraries and APIs that are coming out of Microsoft (LINQ, Task Parallel Library (TPL), Reactive Extensions (Rx), etc.).  I strongly recommend you start learning about Monads. That said, I’m not going to even try to explain Monads today.  I’m going to talk very practically about the Maybe monad, why you would use it, and my implementation of this pattern.

IEnumerable<T> - It’s Just A Function

How do you feel about the IEnumerable<T> interface?  Well, you may appreciate the fact that LINQ to Objects heavily relies on this interface and enables you to do some really cool things in just a few lines of code.  Many a programmer has taken a portion of a program that has numerous nested foreach loops with nested if statements and turned them into a really concise and readable LINQ statement. Thank you functional programming.

If you think about it, IEnumerable<T> is really just a function.  Yes, I know, it’s an interface.  But please suspend reality for a few moments and think of it like a function.  What does a function do?  It takes some input, and returns some output.  It represents a computation – the ability to compute some value in the future.  Many programming languages allow functions to return at most a single value.  What if we want to return more than one value?  We create data structures that allow us to return multiple values as a single atomic value, for example, an array.  However, arrays require that you compute all the values in the array before you can return it from a function.  What if you wanted a function that calculates all prime numbers?  If we use an array, it would never return because we would need an array of infinite size.  Fortunately, we have IEnumerable<T> and iterators (that enable deferred execution) to help us with this.  So we can write a method, IEnumerable<int> GetPrimes(), and think of the object it returns (which implements IEnumerable<int>) as a function that can compute zero or more prime numbers at some point in the future when we enumerate over it (i.e. “execute” the function).
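To make the GetPrimes() idea concrete, here is a minimal sketch of how such an iterator might look.  The class name, the IsPrime helper, and the simple trial-division test are my own assumptions for illustration; only the IEnumerable<int> GetPrimes() signature comes from the text above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrimesSketch
{
    // Lazily yields primes; nothing in this body runs until a caller enumerates.
    // (In practice the sequence is bounded by the range of int.)
    public static IEnumerable<int> GetPrimes()
    {
        for (int candidate = 2; ; candidate++)
        {
            if (IsPrime(candidate))
                yield return candidate;
        }
    }

    // Hypothetical helper: plain trial division, good enough for a sketch.
    private static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; (long)d * d <= n; d++)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

    public static void Demo()
    {
        // Take(5) stops enumerating after five values, so this loop terminates.
        foreach (int p in GetPrimes().Take(5))
            Console.WriteLine(p);
    }
}
```

Calling GetPrimes() on its own returns immediately; the work only happens as the caller pulls values, which is what lets an "infinite" result be returned at all.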

Part of that last sentence is what I want you to focus on - “compute zero or more”.  Looking at IEnumerable<T> generically, we could say that it represents the ability to compute zero or more values regardless of what type T is.  We can represent computations that potentially need to return more than one value.  That in and of itself is useful.  However, it becomes extremely powerful when you combine that interface with LINQ’s Standard Query Operators (e.g. Select, Where, GroupBy, Aggregate, etc).  Now we are able to express the intent of a fairly complex piece of logic in just a few lines of code.  We are expressing what we want the program to do and not the monotonous detail of how we want to do it. We are programming declaratively and with composition.  It is more maintainable because it is more readable and thus easier to reason about.  Plus, because we aren’t spelling out all the details, framework developers and compiler designers are able to make some intelligent decisions about how to implement the programmer’s intent, for example, running parts of the program in parallel (e.g. PLINQ).  No doubt you will agree there are many benefits to programming this way.
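As a small illustration of the "what, not how" point, the sketch below writes the same logic twice: once with nested loops and once with Standard Query Operators.  The class, the method names, and the "words longer than four characters" rule are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeclarativeSketch
{
    // Imperative version: spells out how to walk and filter the data.
    public static List<string> LongWordsImperative(IEnumerable<string> lines)
    {
        var result = new List<string>();
        foreach (string line in lines)
        {
            foreach (string word in line.Split(' '))
            {
                if (word.Length > 4)
                    result.Add(word.ToUpperInvariant());
            }
        }
        return result;
    }

    // Declarative version: states what is wanted by composing query operators.
    public static List<string> LongWordsDeclarative(IEnumerable<string> lines)
    {
        return lines
            .SelectMany(line => line.Split(' '))
            .Where(word => word.Length > 4)
            .Select(word => word.ToUpperInvariant())
            .ToList();
    }
}
```

Both methods return the same list; the second reads as a description of the result rather than a recipe for building it.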

Zero or One

But what if we don’t want to compute zero or more values?  What if we want to compute zero or one value?  A classic example is reading from a dictionary.  Dictionary<TKey, TValue> has a TryGetValue() method that attempts to retrieve a value from the dictionary by key, and, if the value is not in the dictionary, it returns false.  Unfortunately because you need to return two bits of information (1. was the value in the dictionary, and 2. if so, what is the value), TryGetValue() uses awkward output arguments to return the value if it was in the dictionary.  Put in different words, we need a function (TryGetValue) to compute (lookup) zero or one value from the dictionary based on some key.
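For reference, this is roughly what the built-in pattern looks like at a call site.  The class and the Describe method are hypothetical wrappers added only so the snippet stands alone; Dictionary<TKey, TValue>.TryGetValue itself is the real framework method.

```csharp
using System;
using System.Collections.Generic;

public static class TryGetValueSketch
{
    public static string Describe(Dictionary<string, int> dictionary, string key)
    {
        // The variable has to be declared first so it can be passed as an out argument.
        int value;

        // The bool return answers "was it there?"; the out argument carries the value.
        if (dictionary.TryGetValue(key, out value))
            return key + " = " + value;

        return key + " is not in the dictionary";
    }
}
```

The two pieces of information travel through two different channels, which is the awkwardness described above.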

Another example might be retrieving data from a database.  Perhaps you have some function that returns a single CustomerDto from the database when you give the function a customer Id.  Well, we know we have to handle the scenario when there is no record in the database for the Id provided.  Often we just return null and that may be a perfectly logical way to represent zero customers with that customer Id.  However, we now have a “special” reference to a CustomerDto object – the null reference.  How many times has the dreaded NullReferenceException visited you?  To protect ourselves against that, we sprinkle in “null checks” all over the place.  It would be nice if we didn’t have to do that.  Nullable<T> is nice because it communicates a little better that “this value could be null” by generally requiring you to use the Value property (and hopefully you check the HasValue property before you use it).  Unfortunately, by design Nullable<T> does not work with reference types. That is because it is trying to model a null value (which reference types already have) and not the absence of a value.  Subtle difference, I know, but go with it for now.

In both of the examples above, we are conceptually trying to represent some computation (a function) that can compute (return) zero or one value.  IEnumerable<T> gives us the ability to represent zero or more values, but that is not what we want.  We need some type that represents zero or one value.

Introducing Maybe<T>

That type is Maybe<T>.  It works with all .NET types; value and reference types.  Here are some simple examples:

    Maybe<int> number = Maybe.Value(42);
    Maybe<string> text = Maybe.Value("Hello, World!");

    if (number.HasValue) { Console.WriteLine(number.Value); }
    if (text.HasValue) { Console.WriteLine(text.Value); }

    number = Maybe<int>.NoValue;
    text = Maybe<string>.NoValue;

"The uh, stuff that dreams are made of." - Quote from Maltese Falcon (never seen it) that was referenced in Explorers.

First, we see a simple integer value (42) placed into a Maybe<T> and assigned to the variable, “number”.  Next, we assigned another constant to a variable, “text”, but this time with a string, demonstrating that this works with both value and reference types.  Then, we write the contents of both variables out to the console, but only if they have a value (which we know they do in this case).  Finally, we re-assign the two variables to not have a value (NoValue), just to demonstrate how to compute the absence of a value.

At this point you might be thinking, “Big deal. I can just use null reference types and Nullable<T> to do the same thing.”  You would probably be right. However, many developers who have spent time with the Maybe monad pattern have found it to be a much more elegant solution.  For one, in my implementation, you will never get a null reference exception using Maybe<T> - it is implemented as a value type (struct) and as such cannot be null.  Second, the way to represent the absence of a value (zero) for any type, regardless of whether it is a value or reference type, is the same when you are using Maybe<T>.  Consistency is important for code readability and maintainability.  Finally, having a variable of type Maybe<T> communicates that it is possible for it to not have a value (‘no T for you…maybe’).

Having a variable that is a reference type, for example a string, doesn’t tell you whether or not you should expect it to contain the null reference.  If you are being a defensive programmer, you might check to see if it is null before using it - just in case.  Also, when you have a method that returns a reference type, the return type doesn’t tell you whether or not the method can or will return a null.  By using Maybe<T>, we can use the type system to clearly indicate that the method might not be able to compute (return) a value.  A great example of this is one I mentioned earlier – using dictionaries:

    var dictionary = new Dictionary<string, int>();
    Maybe<int> number = dictionary.TryGetValue("Foo");


I created an extension method, TryGetValue, that extends IDictionary<TKey, TValue> and takes only one argument - the key to use to look up the value.  We know this lookup could fail.  The item we are looking for may not be in the dictionary and it would be better if we didn’t throw an exception if we can’t look up the value.  That is why the TryGetValue extension method I created returns a Maybe<TValue> - if the value cannot be found in the dictionary it just returns Maybe<TValue>.NoValue.  The nice thing about using the Maybe<T> in an extension method is that it works with any type. It doesn’t matter if TValue is a value or reference type.  You don’t have to create an extension method for value types, and another one for reference types.
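To show the shape of such an extension method without depending on the library, here is a rough sketch built on a deliberately tiny stand-in type.  SimpleMaybe<T> and TryGetMaybe are names I invented for this example; they are not the library's Maybe<T> or its TryGetValue extension, and the real implementation may differ considerably.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for Maybe<T>; not the library's implementation.
public struct SimpleMaybe<T>
{
    private readonly T _value;
    private readonly bool _hasValue;

    public SimpleMaybe(T value)
    {
        _value = value;
        _hasValue = true;
    }

    // default(SimpleMaybe<T>) has _hasValue == false, so it doubles as "no value".
    public static SimpleMaybe<T> NoValue { get { return default(SimpleMaybe<T>); } }

    public bool HasValue { get { return _hasValue; } }

    public T Value
    {
        get
        {
            if (!_hasValue) throw new InvalidOperationException("No value.");
            return _value;
        }
    }
}

public static class DictionaryMaybeExtensions
{
    // Wraps the built-in two-argument TryGetValue so callers never see the out argument.
    public static SimpleMaybe<TValue> TryGetMaybe<TKey, TValue>(
        this IDictionary<TKey, TValue> dictionary, TKey key)
    {
        TValue value;
        return dictionary.TryGetValue(key, out value)
            ? new SimpleMaybe<TValue>(value)
            : SimpleMaybe<TValue>.NoValue;
    }
}
```

Because the stand-in is a struct, a variable of that type can never be null, and the one generic method covers both value-type and reference-type dictionaries.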

Did you notice we don’t have the awkward output arguments in the last example?  Maybe<T> is almost worth it just for that.  However, if you are still on the fence, I just might be able to convince you that Maybe<T> is “the uh, stuff that dreams are made of”.  Well, maybe just that it is really cool and useful.

Maybe Is Lazy and Doesn’t Forget

One nuance when using IEnumerable<T> is that you need to realize that the values it returns often are not computed until you enumerate over it, also known as deferred execution.  Maybe<T> works the same way.  It often does not compute its value or the absence of a value until you try to use the HasValue, Value, or Exception (more on this in a moment) properties. A symptom of deferred execution is when you write some code with IEnumerable<T> or Maybe<T> expecting some side effect to occur and nothing happens.  Most of the Maybe<T> operators are lazy and if you want to force evaluation you can call Run() or RunAsync() (operators are discussed below).

Now, so far you’ve only seen examples that do not use deferred execution.  Here is an (albeit contrived) example of deferred execution using Maybe<T>:

    Maybe<int> number = Maybe.Value(() => 42);

 

The Value method lets you pass in a Func<T> to calculate its value. In this example it only returns 42.  However, it could return something that is expensive to compute.  The function you provide will not get executed until something uses the HasValue, Value, or Exception properties – just like a deferred IEnumerable<T> does not execute until you call MoveNext() on the enumerator.

However, unlike IEnumerable<T>, once you use one of those properties your function’s results are cached and the function is never called again.  The Maybe<T> remembers its value.  It behaves more like Lazy<T> in this case.  In the vast majority of scenarios where you would use Maybe<T> this is the behavior that you want.  If you want to re-compute the value, build up the Maybe<T> again.
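The difference between "deferred and re-executed" and "deferred and cached" can be seen with framework types alone.  The sketch below uses an iterator and Lazy<T> as analogies; the class name and the call counter are assumptions for the example, and Maybe<T> itself is not involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSketch
{
    // Counts how many times the "expensive" work actually runs.
    public static int Calls;

    private static int Expensive()
    {
        Calls++;
        return 42;
    }

    // Iterator method: the body does not run until the sequence is enumerated,
    // and it runs again on every new enumeration.
    public static IEnumerable<int> DeferredSequence()
    {
        yield return Expensive();
    }

    // Lazy<T> also defers the work, but caches the result after the first use.
    public static Lazy<int> CachedValue()
    {
        return new Lazy<int>(() => Expensive());
    }
}
```

Enumerating DeferredSequence() twice runs Expensive() twice, while reading CachedValue().Value twice on the same instance runs it once, which is the behavior the text attributes to Maybe<T>.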

Exceptions ≈ NoValue

Sometimes when you are trying to compute some value bad stuff happens.  Exceptions are thrown.  One way to look at an exception in this scenario is that it is roughly equal to NoValue.  Example: You tried to get a row count from a database table, but the database is down and an exception is thrown.  You were not able to compute the row count because the database is not available.  Maybe<T> has an Exception property that allows you to capture the exception that prevented you from computing the row count. For example:

    Maybe<int> constant = new Maybe<int>(new InvalidOperationException());
    Maybe<int> lazy = Maybe.Value<int>(() => { throw new InvalidOperationException(); });

 

In both cases, the HasValue property is equal to false and the Exception property contains the exception.  In the interest of full disclosure, this feature is not part of the traditional implementation of the Maybe monad.  It is actually part of the Error monad.  That said, these monadic patterns are so complementary that it is useful to combine them into a single implementation.

Maybe<T> Entry Points

There are a few ways to create instances of Maybe<T>.  First, you can use constructors that take in a T, a Func<T>, or an Exception.  You can also use the Maybe.Value method overloads that take in a T or a Func<T> – this often allows the compiler to infer what type T is.  There are a few other ways to create a Maybe<T>.

Maybe.NotNull – This creates a Maybe<T> that treats null values as NoValue.  This is very helpful later when you are using other operators.  It has support for reference types and Nullable<T>.  NotNull can be used both as an entry point and as an operator (explained below).

Maybe.Using – This creates a Maybe<T> that can dispose of a resource after a value has been computed using the resource.  A classic example would be executing some scalar query against a database.  You want to ensure that the connection to the database is disposed after the scalar query is executed. Using can be used both as an entry point and as an operator.  Here is an example of using both NotNull and Using together:

    Maybe<int> rowCount = Maybe
        .NotNull(connectionString)
        .Using(cns => ConnectToDb(cns), conn => conn.GetRowCount());
 

Maybe<T> Operators

Remember how LINQ Standard Query Operators (e.g. Select, Where, GroupBy, etc) make the IEnumerable<T> interface so much more powerful?  Well, the same is true of Maybe<T>.  I’ve developed a number of really powerful Standard Maybe Operators that make Maybe<T> just as powerful as IEnumerable<T>.  Keep in mind, all of these operators understand that sometimes there is no value to operate on.  Just like calling Select and Where against an empty IEnumerable<T> results in another empty IEnumerable<T>, calling Select and Where against a Maybe<T> that doesn’t have a value results in another Maybe<T> that doesn’t have a value.

I won’t be able to fully explain each operator in this post. In fact, I’m going to intentionally gloss over them quite a bit, but I have unit tests covering them so you can look there for examples.  Additionally, the examples below are written to make them easier to understand and not always for conciseness. I won’t use the var keyword, and sometimes I will create temporary variables to help comprehension.  I may end up dedicating a few blog posts to expanding on how each of these operators behaves in detail, but for now this should give you an idea of what is possible.  Here is the current list of the Standard Maybe Operators:

NotNull, Using – These were explained above.  They both can be used as an operator and an entry point method.

Select, Coalesce – Select should be fairly obvious if you’ve done any LINQ development.  It takes the value, if there is one, and selects another value from it.  Coalesce combines Select with NotNull.  Example:

    Maybe<Category> parentCategory = Maybe.Value(product)
        .Coalesce(x => x.Category)
        .Select(x => x.Parent);

 

Where, Unless – Where will also be familiar if you’ve done LINQ development before.  It takes a predicate function (Func<T, bool>) as an argument that is called with the value of the Maybe<T>, if there is a value, and if the function returns true it returns the original value.  If the function returns false, it returns NoValue.  Unless works roughly the same, except it flips the result of the predicate function you provide.  It takes a function (Func<T, bool>) as an argument that is called with the value of the Maybe<T>, if there is a value, and if the function returns false it returns the original value. If the function returns true, it returns NoValue.  Example:

    Maybe<string> message = Maybe.Value("Hello, World!")
        .Where(x => x.StartsWith("H"))
        .Unless(x => x.EndsWith("?"));

 

Or, Join – Or and Join allow you to combine two Maybe<T>s.  Or will return the value of the first Maybe<T>, unless it has no value, and then it will return the value of the second.  Join returns a Maybe that combines the two Maybes – either as a Maybe<Tuple<T, U>> or you can give it a function that takes the two values and combines them into a custom type.  Join returns NoValue if either of the two provided Maybes does not have a value. Example:

    Maybe<CustomerDto> cachedDto = dictionary.TryGetValue(id);
    Maybe<CustomerDto> retrievedDto = Maybe.Using(() => ConnectToDb(), conn => conn.GetCustomer(id));

    Maybe<StockQuote> latestQuote = Maybe.Value(() => stockService.GetQuote("LNKD"));

    Maybe<CustomerQuote> customerQuote = cachedDto.Or(retrievedDto)
        .Join(latestQuote, (dto, quote) => new CustomerQuote(dto.Name, quote.Price));

    if (customerQuote.HasValue)
    {
        CustomerQuote q = customerQuote.Value;
        Console.WriteLine("Quote for LinkedIn to {0} is: {1}", q.Name, q.Price);
    }
    else
    {
        Console.WriteLine("Unable to retrieve quote.");
    }

 

With – Executes an action against a selected value when evaluated. Example:

    Maybe<string> text = Maybe.Value("Hello, World!")
        .With(x => x.Length, l => Console.WriteLine("Length of string: {0}", l));

 

When – Executes an action or replaces a value, on evaluation, when predicate is true. Example:

    Maybe<int> nearestOddNumberRoundingUp = Maybe.Value(() => Console.ReadLine())
        .Select(x => int.Parse(x))
        .When(x => x % 2 == 0, x => x + 1);

 

OnValue, OnNoValue, OnException – Executes an action or replaces a value, on evaluation, when the method’s condition is true. Example:

    Maybe<int> fortyTwo = Maybe<int>.NoValue
        .OnNoValue(() => { throw new InvalidOperationException(); })
        .OnException(ex => Maybe.Value(42))
        .OnValue(x => Console.WriteLine(x));

 

ThrowOn, ThrowOnNoValue, ThrowOnException – Generally, the Maybe<T> operators prevent exceptions that are thrown from escaping the Maybe<T>, and instead treat the exception roughly like NoValue.  ThrowOn, ThrowOnNoValue, and ThrowOnException allow exceptions to immediately escape and bubble up to be handled by standard .NET exception handling. Example:

    Maybe<int> neverHaveValue = Maybe.Value("Hello, World!")
        .Select(x => int.Parse(x))
        .ThrowOnException();

 

Run, RunAsync – Since most operators don’t execute until the Maybe<T> is evaluated, Run and RunAsync force evaluation while still returning a Maybe<T> for further processing.  Both allow you to pass in an Action<T> to immediately execute some work on the value (if there is one).  RunAsync makes use of the Task Parallel Library to evaluate the Maybe<T> by default on the thread pool.  It optionally takes in a CancellationToken, TaskCreationOptions, or TaskScheduler to provide additional control over how it runs asynchronously.

    Maybe<int> writeToConsoleImmediately = Maybe.Value(() => Console.ReadLine())
        .Select(x => int.Parse(x))
        .OnValue(x => Console.WriteLine("You entered number: {0}.", x))
        .OnException(ex => Console.WriteLine("You did not enter a number."))
        .Run();

 

Synchronize – Although Maybe<T> only evaluates once, just like Lazy<T>, it is not thread-safe by default.  This is by design because there is overhead in thread synchronization.  If two or more threads try to evaluate the Maybe<T> at the same time, evaluation will execute multiple times.  To prevent this, use the Synchronize method. This is useful when evaluation is very expensive; for example, querying a database or doing some CPU intensive work.  Example:

    Maybe<BigInteger> largePrime = Maybe.Value(() => GetLargePrimeNumber())
        .Synchronize();
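As a framework-only analogy for what synchronized evaluation buys you, the sketch below uses Lazy<T> with LazyThreadSafetyMode.ExecutionAndPublication.  This is not how Synchronize is implemented (I make no claim about that); the class name and counter are assumptions for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SynchronizeSketch
{
    // Counts how many times the factory delegate actually runs.
    public static int Evaluations;

    public static int EvaluateOnceAcrossThreads()
    {
        Evaluations = 0;

        // ExecutionAndPublication takes a lock so the factory runs at most once,
        // even when several threads ask for Value at the same time.
        var lazy = new Lazy<int>(
            () => { Interlocked.Increment(ref Evaluations); return 42; },
            LazyThreadSafetyMode.ExecutionAndPublication);

        // Read the value from several iterations that may run on different threads.
        Parallel.For(0, 8, i => { int unused = lazy.Value; });

        return Evaluations;
    }
}
```

The lock is the overhead mentioned above: it is only worth paying when the evaluation is expensive and may really be contended.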

 

Cast, OfType – Cast and OfType allow you to cast the value (if there is one) to another type.  OfType tries to safely cast the value; if it can’t, it returns NoValue.  Cast tries to directly cast the value; if it fails, it results in a Maybe<T> that contains a cast exception. Example:

    object obj = "Hello, World!";
    Maybe<string> text = Maybe.Value(obj)
        .Cast<string>();
 

Shedding Maybe<T>

At some point you are going to need to get at the underlying value.  There are a few ways of doing this.  All of the methods below evaluate the Maybe<T> immediately.

ToNullable – When you are dealing with Maybe<T> where T is a value type, ToNullable will convert the Maybe<T> into a Nullable<T>. If there is a value, it simply returns the value typed as Nullable<T>. If there is no value, it returns the null value typed as Nullable<T>.  Example:

    Nullable<int> value = Maybe<int>.NoValue
        .ToNullable();

 

Assign – Assign will output the value (if there is one) through a reference parameter. This is useful if, after calling Assign, you want to execute other Maybe<T> operators.

    int value = 0;
    Maybe<int> number = Maybe.Value(() => Console.ReadLine())
        .Select(x => int.Parse(x))
        .OnException(42)
        .Assign(ref value);

 

Return – If the Maybe<T> has a value, Return simply returns the value.  If there is no value, Return returns the type’s default value (e.g. 0 for an integer).  Alternatively, you can provide a default value it should return if the Maybe<T> contains no value.  If the Maybe<T> contains an exception, the exception is thrown.

    int value = Maybe.Value("27")
        .Select(x => int.Parse(x))
        .Return(42);

 

HasValue / Value – The most obvious way to get the value out of the Maybe<T> is to use the Value property.  If there is no value, the Value property throws an InvalidOperationException.  If the Maybe<T> already contains an exception, it re-throws the exception.  Only if there is a value does the Value property succeed.  I recommend you check the HasValue property before using the Value property.

Call To Action

Many of the methods that I’ve discussed have multiple overloads to handle common use cases.  I’ve really been working hard on my Maybe<T> implementation and I think I have something really useful.  Some of my co-workers have made use of it, and it has been proving very useful in making us more productive and precise in our development efforts.

I want to know what you think of it.  Is it useful?  How could you use it?  How would you improve the implementation?  If you want to help answer those questions, download the iSynaptic.Commons library off of NuGet.  Feel free to pull down the source from Github.  You’ll notice that I’ve used Maybe<T> within the library to implement other features, so with that and the tests, there should be some decent examples.  I’m eager to know what you think.  Comment on this post if you have some feedback.

In the near future I hope to post some more on this subject, specifically more detail on all the different Maybe<T> operators.

So Where Was I?

Thursday, 19 May 2011 21:16 by jordan.terrell

Like I said, never underestimate the power of a well written regular expression!

Yes, I know – I said that over a year ago.  Quite obvious that I haven’t been blogging for a while.  Just needed a break and I felt like I had run out of things to say.  Well, I’m back and I’ve got some good things to talk about.

The last year has been crazy for me.  A lot in my personal life and a lot in my professional life.  I changed employers in May of 2010 and that has been going unbelievably well.  In many respects, I’ve found as close to my dream job as possible, without having to work an insane number of hours or move to the west coast.

My employer is supportive of employees creating and contributing to open source projects, so I have made some wonderful strides on my iSynaptic projects.  The Commons library has really been going through a refinement period and has some real gems.  You can find it in the NuGet gallery.  In the past year, I’ve been learning a ton about programming languages in general and a lot about functional programming specifically.  A common theme in my learning about functional programming has been Monads.  At some point in the future I just might dare to write a blog post on Monads, but it would likely be more heavily focused on the resources I used to learn about and apply Monads.  Speaking of application, the Commons library has an implementation of the Maybe monad, Maybe<T>.  This concept has changed the way that I think about and write code.  I will be dedicating at least one blog post to the pattern and my implementation, and I also plan on giving a talk at the next Twin Cities Code Camp on it (if they’ll have me!).

One other piece of functionality that I am particularly happy with is Exodata.  The idea came from an extremely talented co-worker, but the implementation is all mine.  I’ve been using the tagline “IoC for Data”, mostly to grab the attention of others; however, I think it is so much more than that.  This too will be the subject of a blog post or two, and perhaps a Code Camp talk as well.

One other thing – I got to develop a major feature in NUnit - Action Attributes!  It was an idea that I had been sitting on for a while and even contemplated writing a testing framework in order to implement.  Fortunately, when I pitched the idea to the NUnit team they seemed to like the idea and it is targeted for the 2.6 release of NUnit.

So stay tuned! Here are some things that I plan to talk about:

  • Maybe<T>
  • Exodata
  • NUnit Action Attributes
  • Command Query Responsibility Segregation (CQRS)
  • Event Sourcing
  • Messaging
  • Task Parallel Library (TPL) and TPL Dataflow
  • Reactive Framework
  • Distributed Source Control
  • General Functional Programming Concepts
Categories:   .NET | Programming

Never Underestimate A Well Written Regular Expression

Tuesday, 6 April 2010 11:36 by jordan.terrell

A couple of weeks ago, Kirill Osenkov posted an interview question that got the attention of a few .NET developers, myself included.  Like a moth to a flame, we were all eager to present a solution to this interview question:

In a given .NET string, assume there are line breaks in standard \r\n form (basically Environment.NewLine).

Write a method that inserts a space between two consecutive line breaks to separate any two line breaks from each other.

Roughly 20 answers were given in the comments to Kirill’s post, some with subtle differences, some with completely different approaches.  Some were even written in F#.

When I first saw the interview question, I very quickly came to my answer:

    string output = Regex.Replace(input, @"(\r\n)(?=\r\n)", "$1 ");

My answer uses Regular Expressions, which is a concise language to search text, sometimes in complex ways.  I first became aware of Regular Expressions in 2004, and I was immediately enamored with them.  I had always written complex search functions using operations like IndexOf() or directly accessing character arrays.  Text searching always seemed slow to me, but I later realized that it was just my poorly written code.  Very quickly I dove into learning the Regular Expression language (at least the dialect in .NET), and found many uses for it.
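For readers less familiar with the syntax, here is the same one-liner wrapped in a method with each piece of the pattern annotated.  The class and method names are hypothetical; the pattern and replacement string are exactly the ones shown above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineBreakSketch
{
    public static string SeparateLineBreaks(string input)
    {
        // (\r\n)    captures one line break as group 1
        // (?=\r\n)  lookahead: only match when another line break follows,
        //           without consuming it, so runs of three or more still work
        // "$1 "     puts the captured line break back, followed by a space
        return Regex.Replace(input, @"(\r\n)(?=\r\n)", "$1 ");
    }
}
```

Because the lookahead does not consume the second line break, that second break is still available to start the next match, which is why a run of three breaks ends up with a space in each gap.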

I’ve found since then that many developers are either unaware or fearful of Regular Expressions.  I’ll admit, some expressions that I’ve seen look very cryptic and intimidating.  But they are very powerful.  Plus they have the benefit of usually being very fast (although you can write slow ones).

During the discussion in the comments of Kirill’s post, it became obvious that performance is something to be considered in such a routine.  As a result, Rik Hemsley commented that he had created a benchmarking test bed to run each suggested solution.  Here are the results:

[Image: benchmark results for each suggested solution]

My solution came out as the best performing.  I say this, not to gloat - because in reality I’m just using what Microsoft wrote for us to use. I say it because I wanted to show that knowing about and using Regular Expressions is important when you need to parse text.  I’m sure that someone could come up with a better performing solution, but for a one-liner, Regular Expressions are hard to beat.

If you are interested in learning about Regular Expressions in .NET, the MSDN documentation is pretty good.  Plus there are books that you can read on them as well as sites that have examples.

State Machine ViewModel

Friday, 19 March 2010 08:09 by jordan.terrell

Jason Bock recently wrote a post on turning WPF Binding errors into exceptions.  When I first started to get into WPF development, I too found that the binding system was significantly better (i.e. actually worked) than Microsoft’s previous attempts to do UI binding.  That said, I also found, as Jason did, that WPF’s binding system was less than helpful when you fat-fingered a binding.  I did learn that there were many ways that you could get debugging output from the binding system, but I never took it to the level that Jason did and turned the binding errors into exceptions.  I’m sure I’ll find his binding extension useful in the future.

Sometimes Invisible Binding Errors are Okay

However, I did make a comment on Jason’s post that later on I found a use for not having exceptions thrown on data binding errors.  There is a pattern that I’ve used on one or two WPF applications where I host a collection of “screens”, which are just UserControls, in a single Window, and the visibility of the UserControls is driven by the DataContext of the Window.  The DataContext is both a ViewModel in the Model-View-ViewModel pattern, and an implementation of the State pattern.

The difference is that, in a traditional State pattern implementation, you typically have a base class or interface that defines all of the input or operations that all the states need to respond to.  However, with a state-based ViewModel and WPF’s dynamic binding system you will see this is not necessary.

Let’s start with a simple scenario, a wizard-like, step-by-step WPF application for taking simple feedback comments from users.  Let’s say the list of states is: AskForComment, CaptureName, CaptureEmail, CaptureText, and Thanks.  Clicking the next button will transition from one state to the next, changing the DataContext of the Window to the new state.  As the state changes, the WPF binding system will be notified of this.  It will attempt to rebind the visibility of all the UserControls to the new DataContext, which is the new state, and depending on the state, different UserControls will be visible.  The UserControls that are not visible will have binding errors, but since they are not visible and thus not in use, the errors can be ignored.  The user can happily enter information into the newly visible control, unaware of any errors.
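A rough sketch of what such states might look like follows.  Only the five state names come from the scenario above; the properties, the Next() method, and the WizardHost class (a plain stand-in for the Window, so the sketch runs without WPF) are assumptions made for illustration and are not taken from the sample application.  Note that no base class or interface ties the states together, and that the host looks up Next by name at runtime, much as the binding system resolves property paths by name.

```csharp
using System;

// Hypothetical states: no shared base class or interface declares their members.
public sealed class AskForComment
{
    public object Next() { return new CaptureName(); }
}

public sealed class CaptureName
{
    public string Name { get; set; }
    public object Next() { return new CaptureEmail { Name = Name }; }
}

public sealed class CaptureEmail
{
    public string Name { get; set; }
    public string Email { get; set; }
    public object Next() { return new CaptureText { Name = Name, Email = Email }; }
}

public sealed class CaptureText
{
    public string Name { get; set; }
    public string Email { get; set; }
    public string Text { get; set; }
    public object Next() { return new Thanks { Name = Name }; }
}

// The final state has no Next() at all.
public sealed class Thanks
{
    public string Name { get; set; }
}

// Stand-in for the Window: in WPF, assigning DataContext is what triggers rebinding.
public sealed class WizardHost
{
    public object DataContext { get; private set; }

    public WizardHost() { DataContext = new AskForComment(); }

    // Resolve Next() by name on whatever the current state is.
    // A state without the member is skipped instead of causing an exception.
    public void Next()
    {
        var method = DataContext.GetType().GetMethod("Next");
        if (method != null)
        {
            DataContext = method.Invoke(DataContext, null);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var host = new WizardHost();
        for (int i = 0; i < 5; i++)
        {
            // Prints AskForComment, CaptureName, CaptureEmail, CaptureText, Thanks.
            Console.WriteLine(host.DataContext.GetType().Name);
            host.Next();
        }
    }
}
```

Calling Next() on the Thanks state does nothing, which mirrors the idea of the post: a member that is missing on the current state is quietly ignored.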

I’ve created a sample application that shows one way this could be implemented.  If you look at the MainWindow.xaml, you will see five UserControls that represent the different views as the states change.  Each of them has its Visibility property bound to the current state through a StateVisibilityConverter.  Inside each UserControl, various elements are bound to properties on the DataContext that may or may not exist, depending on what state you are in – but again, that doesn’t matter – if the properties don’t exist, the UserControl should not be visible.
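For readers without the sample at hand, a converter along these lines could look roughly like the sketch below.  This is an assumption about its shape, not the code from the sample application: here the state a UserControl belongs to is passed as a ConverterParameter holding the state's type name, and the XAML usage shown in the comment is likewise hypothetical.  IValueConverter and Visibility are the standard WPF types.

```csharp
using System;
using System.Globalization;
using System.Windows;
using System.Windows.Data;

// Hypothetical converter; the sample application's StateVisibilityConverter may differ.
// Assumed XAML usage on each UserControl:
//   Visibility="{Binding Converter={StaticResource StateVisibility},
//                        ConverterParameter=CaptureName}"
public sealed class StateVisibilityConverter : IValueConverter
{
    public object Convert(object value, Type targetType, object parameter, CultureInfo culture)
    {
        // value is the Window's DataContext (the current state);
        // parameter names the state this UserControl belongs to.
        bool isCurrentState = value != null
            && string.Equals(value.GetType().Name, parameter as string, StringComparison.Ordinal);

        return isCurrentState ? Visibility.Visible : Visibility.Collapsed;
    }

    public object ConvertBack(object value, Type targetType, object parameter, CultureInfo culture)
    {
        // Visibility never flows back into the state, so this direction is unsupported.
        throw new NotSupportedException();
    }
}
```

Matching on the type name keeps the converter ignorant of the individual state classes, which fits the idea that the states share no common base class.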

The key takeaway is that sometimes you can use the lack of exceptions on WPF binding errors to your advantage, and I would venture a guess that Microsoft decided not to throw exceptions on binding errors to enable scenarios like this.

Taking a Break

Thursday, 28 January 2010 14:22 by jordan.terrell

Just wanted to let everyone know I’m taking a short break from blogging right now (if you haven’t already guessed).  However, I will return!

Categories:   General

“M” and Oslo’s Future

Wednesday, 11 November 2009 11:37 by jordan.terrell

If you’re at all interested in Oslo, you may be looking forward to the 2009 PDC to see the direction it is going to take.  As of yesterday, we got a small preview of that direction – and to be honest, without having all the nitty-gritty details that I hope will come out of the PDC, I’m concerned and disappointed – and I’m not the only one who feels this way.

“M” Interacting with the Database – Please Don’t!

I understand that using DSLs and modeling is an excellent way to capture and manipulate data that can be used by applications.  However, this quote from Doug’s post is what concerns me (emphasis is my doing):

Time after time we heard that “M” would make interacting with the database easier

The “M” language and tooling should have absolutely nothing to do with interacting with the database.  The fact that Microsoft has heard “time after time” that people would like to use “M” to interact with the database strikes me as a problem with many people not understanding what I believe “M” was envisioned to do and should be used for.  “M” (MGrammar, MGraph, MSchema) and its supporting tooling should be about the definition and runtime representation of models and languages used to create and manipulate instances of models.  It is my strong opinion that this functionality should have no direct dependency on databases or database interaction.  The core foundational value I saw in Oslo was a shared platform providing:

  • A DSL definition language (MGrammar)
  • A lowest common denominator representation of a model (MGraph)
  • Model schema definition and validation (MSchema)
  • Tooling (Intellipad, m.exe, possible VS integration, etc…)

What is so sadly ironic is that Microsoft recognized this from the beginning.  During many presentations at the 2008 PDC, the bond between developers and text and text editors was mentioned.  Microsoft knew they needed to have a first class story to tell when it came to text, and that was how “M” was born.

The Repository and Quadrant

My opinion of Repository is that it is a mistake to try to tackle Repository with the first release of Oslo.  Even during the 2008 PDC presentations it felt like the Repository was a solution in search of a problem.  I understand the value in being able to use models to define runtime execution characteristics of an application (e.g. HTML, XAML, WCF Service Descriptions, etc), but how many applications have you seen that execute their models from a SQL Server database?!?!? There might be a small class of applications where it makes sense to store and execute a model from a database, but my guess is that more often than not a model would either be stored in or transformed into something that looks nothing like a database.  Perhaps it would be embedded into an application redistributable as code or an embedded resource or persisted as a file.  Perhaps it is never persisted.  Regardless, if the model is to be persisted, that should be a separate responsibility.  Repository is a “nice to have”, but honestly I can’t see using it much, if at all.

As for Quadrant, I don’t feel versed enough in Quadrant’s capabilities to voice a strong opinion.  I do see value in having a common tool for visualizing and manipulating models.  However, I would spend my “development dollars” less on Quadrant for Oslo’s first release, and more on the “M” language and tooling.

Concerned, but Hopeful

Douglas has made it clear that there is more information coming about the “M” and DSL story.  I for one hope that “M” can stand alone from Repository, Quadrant, SQL Server, and anything to do with databases.  If this is not the case, I hope the Oslo team hears loud and clear that it should make some fundamental changes.

However, if “M” does stand alone, this post and the comments from others should help to keep it that way – even when “time after time” people associate “M” with database interaction.

Categories:   Programming | .NET

Scott Chacon – Git Documentation Master

Friday, 30 October 2009 08:48 by jordan.terrell

As I’ve continued my investigation into Git, a name kept appearing whenever I found excellent Git documentation.  That name is Scott Chacon.  This is a person who favors concepts over commands, and I immensely appreciate his efforts in demystifying Git.  I’ve found three resources that he, to one extent or another, has authored – all of which are great.

Of the three resources, for someone starting out I would recommend the Git Internals PDF.  It’s concise and simple to understand, and I’ve already seen one person who was more or less anti-Git read it in an hour and become pro-Git.  The other two resources are excellent as well.

I have to hand it to Scott, he sure knows his Git.  He also knows how best to teach it.

Categories:   Programming

iSynaptic.Commons Lives

Friday, 30 October 2009 08:35 by jordan.terrell

The last time I talked about iSynaptic.Commons was in July of 2008.  Basically I said that I was putting the release on hold.  Well, I’ve finally released it under the MS-PL license.

My reason for releasing it is by no means because it is complete.  It is nowhere near there.  I’ve got a lot of work yet to do on it, and I’ve got many things that I want to completely change or remove.  So my recommendation is that you don’t use the code in a project you intend to keep stable, because it is going to be a moving target for now (i.e. there will be breaking changes).  This is especially true of the Text.Parsing namespace and the Xml namespace.  Those were very naive and simplistic, and I will be replacing them.

However, my primary reason for releasing it is to get feedback.  I don’t expect much at first, but I want it to be something I can talk about in this blog, and solicit direct feedback about specific parts of the framework.

Finally, I talked about iSynaptic.SolutionBuild and iSynaptic.Modeling in my previous iSynaptic.Commons post.  SolutionBuild exists and is in partially working order.  I will, at some point, be working on moving this into a public repository.  I may even change SolutionBuild to be based on PowerShell, and possibly psake.  As for the Modeling project, I’ve re-envisioned it under a different project I’m calling iSynaptic.Core.  Currently Core doesn’t exist (other than the repository being stubbed out), but what I wanted to accomplish with the Modeling project will be wrapped into the Core project.  Core will depend on Commons (Commons being a lower-level framework).  Likely, some of what is in Commons will be pushed into Core – I want to keep Commons lightweight, but useful.  Besides rolling Modeling into Core, the details of what Core will be are something for a future post.  I don’t necessarily want to talk about it until I have something to show.

Initially, my activity on all of these initiatives will be light.  I have numerous projects that are work related that I need to complete, and other non-development personal obligations as well.  This is just me taking the first step…

Fowler’s DSL Book Milestone

Wednesday, 21 October 2009 14:08 by jordan.terrell

Martin Fowler just updated his roadmap for his upcoming DSL book.  I can’t wait to get it!

Categories:   Programming