Programming – Page 10 – Jay R. Wren

Fluent Interfaces

I like to embrace Languages Inside of Languages. Jeff says “they are actually an ugly hack, and a terrible substitute for true language integration.”

Jeff totally misses the point. His first example does indeed replace a one-line regular expression with ten lines of objects, methods, and named enumerations. Jeff asks if this is progress.

It depends on your goal. What if you make a typo in the one-line regular expression? You will never know until you notice that things don’t work. My goal is to leverage the languages at my disposal to maximum benefit.

A fluent interface gets me a certain degree of type safety. This means I can

catch errors at compile time instead of debug time.
write fewer unit tests to get the same amount of coverage.
utilize tools such as intellisense when writing to an API instead of strings.

Sure, LINQ is awesome. VB.NET’s XML Literals are awesome too. Jeff is ignoring what a regular expression fluent interface brings to the table: catching errors in your code.

Next Jeff attacks Subsonic’s fluent interface. It looks a lot to me like what I have used, and loved, in the API output from NHibernate Query Generator.

Jeff’s example is great. Let’s look at it again:

SELECT * from Customers WHERE Country = "USA"
ORDER BY CompanyName

Is approximately the SQL generated by Subsonic’s fluent interface, given this code:

CustomerCollection c = new CustomerCollection();
c.Where(Customer.Columns.Country, "USA");
c.OrderByAsc(Customer.Columns.CompanyName);
c.Load();

Jeff says this is harder to write. Jeff is wrong.

By using native C# code instead of embedding a SQL query in some string, we gain

type safety
use of code completion features like intellisense or intellassist
ability to pass around queries and partial queries without modifying them with StringBuilder

That last point should be obvious to anyone who has used DetachedCriteria in Hibernate/NHibernate. If you aren’t sure what I mean, go look at DetachedCriteria or better yet go look at the code that NHQG creates.
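To make that last point concrete, here is a minimal sketch of a query object that can be handed around and refined without any string surgery. The names `Query`, `CustomerColumns`, and `ToSql` are hypothetical, invented for this illustration; this is not the actual API of Subsonic, NHibernate, or NHQG.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical column constants; a code generator would normally emit these.
public static class CustomerColumns
{
    public const string Country = "Country";
    public const string CompanyName = "CompanyName";
}

// Hypothetical immutable query: every call returns a new Query,
// so a partial query can be shared and extended safely.
public sealed class Query
{
    private readonly string table;
    private readonly List<string> filters;
    private readonly string orderBy;

    public Query(string table) : this(table, new List<string>(), null) { }

    private Query(string table, List<string> filters, string orderBy)
    {
        this.table = table;
        this.filters = filters;
        this.orderBy = orderBy;
    }

    public Query Where(string column, string value)
    {
        var copy = new List<string>(filters);
        // Illustration only: real code should use parameters, not quoting.
        copy.Add(column + " = '" + value.Replace("'", "''") + "'");
        return new Query(table, copy, orderBy);
    }

    public Query OrderByAsc(string column)
    {
        return new Query(table, filters, column);
    }

    public string ToSql()
    {
        string sql = "SELECT * FROM " + table;
        if (filters.Count > 0)
            sql += " WHERE " + string.Join(" AND ", filters.ToArray());
        if (orderBy != null)
            sql += " ORDER BY " + orderBy;
        return sql;
    }
}

public static class Program
{
    // Receives a partial query from somewhere else and refines it.
    public static Query SortedByCompany(Query q)
    {
        return q.OrderByAsc(CustomerColumns.CompanyName);
    }

    public static void Main()
    {
        Query usCustomers = new Query("Customers")
            .Where(CustomerColumns.Country, "USA");

        Console.WriteLine(SortedByCompany(usCustomers).ToSql());
        // The partial query is unchanged and can be reused elsewhere.
        Console.WriteLine(usCustomers.ToSql());
    }
}
```

A misspelled `CustomerColumns.Contry` fails at compile time, while the same typo inside a SQL string would only show up when the query runs.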

So, what do you do as a developer? I encourage you to listen to neither Jeff nor me. Make your own decision given the circumstances. I definitely prefer to leverage the compiler and type safety wherever possible. However, I also write Perl when a task at hand can best be accomplished with that tool. Jeff seems to be overlooking some serious benefits to Fluent Interfaces. Please don’t overlook them yourself.

Learning F#: Terminology

I like knowing what to call things. I learned a couple of new definitions.

When you bind a new value to an existing value name, this is called outscoping.

let funName a =
    let funName n =
        a+n
    funName 5

funName 1

funName with parameter n outscopes funName with parameter a.

Tuples have a special case for the 2-tuple. Dustin Campbell touched on this, but it seems to me that anytime we know a tuple is a 2-tuple it would be beneficial to refer to that tuple as a pair. A pair is a 2-tuple. Dustin showed that pairs are so special that the fst and snd functions can be used on them. These two functions work on pairs, not all tuples.

open System

let maxProjectileDistanceVector = (Math.Sqrt(2.0)/2.0, Math.Sqrt(2.0)/2.0)

maxProjectileDistanceVector is a pair. It’s a tuple too, but a pair is more specific.

Order of Operation Matters Thanks to Rounding Error

My previous post was supposed to be followed up much quicker than this. Oh Well.

If that was The Good Stuff Is Hidden, then this is The Hidden Good Stuff Can Hurt You.

Joe Landman of “thanks for the awesome meeting space for MDLUG at SGI” fame had an excellent post on floating point arithmetic.

His post is what gave me the idea to do the same thing with Enumerable methods in the previous post.

I’d like to show that Joe’s assertion that programmers should worry about rounding error is true for .NET developers especially in this world of forthcoming parallel LINQ.

Let’s write our own Range method which takes start, end, and increment parameters. You could use Enumerable.Range(start,count) and Enumerable.Reverse(this source). But reversing an Enumerable so big doesn’t sound like a good idea to me.

My brain thinks of Range as start & end, not start & count. I blame Python. This means the Range implementation is a little long because I try to detect infinite ranges.

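public IEnumerable<int> Range(int start, int end, int increment)
{
    int i = start;
    // Counting up: start must not be past end, and the step must be positive.
    if (start.CompareTo(end) <= 0 && increment > 0)
        while (i.CompareTo(end) <= 0)//(i <= end)
        {
            yield return i;
            i = i + increment;
        }
    // Counting down: start must not be before end, and the step must be negative.
    else if (start.CompareTo(end) >= 0 && increment < 0)
        while (i.CompareTo(end) >= 0)//(i >= end)
        {
            yield return i;
            i = i + increment;
        }
    // Anything else would never reach end, so reject it.
    else
        throw new ArgumentOutOfRangeException();
}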
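One subtlety with the method above: because it uses `yield return`, C# compiles it into an iterator, and none of its body runs until the caller starts enumerating. The `ArgumentOutOfRangeException` therefore surfaces late, at the first `foreach` rather than at the call. A common way around this, sketched here, is to split the work into an eager validating wrapper and a private iterator. The `Ranges` class and `RangeIterator` are hypothetical names of mine, and the `long` counter is an extra precaution against overflow that the original does not have.

```csharp
using System;
using System.Collections.Generic;

public static class Ranges
{
    // Validates immediately, so bad arguments fail at the call site.
    public static IEnumerable<int> Range(int start, int end, int increment)
    {
        bool ascending = start <= end && increment > 0;
        bool descending = start >= end && increment < 0;
        if (!ascending && !descending)
            throw new ArgumentOutOfRangeException("increment",
                "increment must move start toward end");
        return RangeIterator(start, end, increment);
    }

    // Runs lazily, only once enumeration begins.
    private static IEnumerable<int> RangeIterator(int start, int end, int increment)
    {
        // A long counter avoids int overflow when end is near int.MaxValue
        // or int.MinValue.
        for (long i = start; increment > 0 ? i <= end : i >= end; i += increment)
            yield return (int)i;
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (int i in Ranges.Range(10, 1, -3))
            Console.Write(i + " ");
        Console.WriteLine();

        try
        {
            Ranges.Range(1, 10, -1);  // throws here, without enumerating
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("rejected before enumeration");
        }
    }
}
```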

Now let’s use this method to sum the exact same values twice: once from start to end, and again in the opposite order, from end to start.

float x = 0f, y = 0f;
int N = 100000000;
x = Range(1, N, 1).Aggregate(x, (a, b) => a + 1 / (float)b);
y = Range(N, 1, -1).Aggregate(y, (a, b) => a + 1 / (float)b);
Console.WriteLine("[increasing] sum = {0}", x);
Console.WriteLine("[decreasing] sum = {0}", y);

This outputs the following:

[increasing] sum = 15.40368
[decreasing] sum = 18.80792

Not what you would expect, is it? But before we go there…

I find the F# version of this to be easier and more readable.

let si = {1..100000000}
let f a b = a+1.0f/b
let i = Seq.fold f 0.0f (Seq.map (fun a-> (float32 a)) si)
let sd= StandardRanges.int 100000000 (-1) 1 //{100000000..1}
let d= Seq.fold f 0.0f (Seq.map (fun a-> (float32 a)) sd)
printf "[increasing] sum = %f\n" i
printf "[decreasing] sum = %f\n" d

This outputs the same:

[increasing] sum = 15.40368
[decreasing] sum = 18.80792

Now the question… why aren’t these numbers the same? Both are the result of adding exactly the same numbers.

Well if you read Joe Landman’s post that I linked above, then you already know. Order of operations matters! Floating point rounding error accumulates and it adds up big here. See Joe’s post for more.
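The effect is easy to reproduce on a small scale. The sketch below is my own illustration, not something taken from Joe’s post: it shows that `float` addition is not associative, and that once a running sum is large enough, adding a small term changes nothing. The `FloatOrder` class is a hypothetical name, and the explicit `(float)` casts are there to force every intermediate result back to 32 bits on runtimes that might otherwise keep extra precision.

```csharp
using System;

public static class FloatOrder
{
    // Sums left to right, rounding to float after every addition.
    public static float SumInOrder(float[] values)
    {
        float total = 0f;
        foreach (float v in values)
            total = (float)(total + v);
        return total;
    }
}

public static class Program
{
    public static void Main()
    {
        // Floats near 100,000,000 are 8 apart, so a lone 1 is rounded away
        // when it is added to the big value first.
        float[] smallAddedToBig = { 1e8f, 1f, -1e8f };
        float[] smallAddedLast = { 1e8f, -1e8f, 1f };

        Console.WriteLine(FloatOrder.SumInOrder(smallAddedToBig)); // 0
        Console.WriteLine(FloatOrder.SumInOrder(smallAddedLast));  // 1

        // Above 2^24 consecutive floats are 2 apart, so adding 1 is lost.
        float big = 16777216f;
        Console.WriteLine((float)(big + 1f) == big);               // True
    }
}
```

The same thing happens in the big sums above: in the increasing run the total grows early, so the later tiny 1/b terms fall below the precision of the running total and vanish, while the decreasing run adds the tiny terms while the total is still small.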

When I read this, the thought that PLINQ might be easy quickly left my mind. Results changing just because the order is different!

What is worse, let’s say you try to do the same thing in F#, but rather than using integers and casting like we did in C#, we just use F#’s StandardRanges library to generate floating point values for us.

let n = 100000000.0f
printf "[increasing] sum = %f\n" (Seq.fold (fun a b -> a+1.0f/b) 0.0f (StandardRanges.generate 0.0f (+) 1.0f 1.0f n))
printf "[decreasing] sum = %f\n" (Seq.fold (fun a b -> a+1.0f/b) 0.0f (StandardRanges.float32 n -1.0f 1.0f))

The problem here is that the floating point numbers are represented slightly differently than in the integer-cast version above. The result is disappointingly different:

[increasing] sum = 15.40368
[decreasing] sum = Infinity

Infinity? Maybe I’ll have to look into this some more.

Some interesting things to note:

StandardRanges had some issues in 1.9.2.9. Thanks to Dustin Campbell for trying the same code, which ran fine on his 1.9.3.7, and making me upgrade.
F#’s Seq is just an alias for IEnumerable<> but it is augmented with more things.
Examining the source in prim-types.fs was awesome. Thanks for the source, Microsoft.
Looks like prim-types.fs and StandardRanges have changed a bit from 1.9.2.9 to 1.9.3.7 and there is some float specialty stuff in there now. I need to investigate.
float in F# is double in C#. float32 in F# is float in C#. Or, if you remember your SATs: C#float:F#float32::C#double:F#float

Wondering about using a 64-bit type instead of a 32-bit type? Scroll up and click my link to Joe Landman’s post. It doesn’t make this issue go away. We need to actually think about what we are doing as programmers *gasp*.

Winning the big CodeMash prize

About five minutes after I recorded with Jeff and Chris, I walked past Dustin in the hall and went and sat down next to Steven. About two minutes later Dustin called me and said “Jeff is looking for you, you won.” I said huh?

Turns out it won Jeff Blankenburg’s GIANT CODEMASH BLOGGING CONTEST OF COOL.

The Ogio bag custom embroidered with the Visual Studio 2008 logo is the nicest backpack I’ve ever used. It fits a Dell Latitude D820 and a Compaq nc6400 with plenty of room for Sennheiser HD25SP headphones, power adapters, USB flash drives, pens, mice, apples, oranges, Cheez-Its, and grape soda.

The fleece jacket means that I’m a hero I guess. I’m not sure whose. Maybe my daughter?

The backpack was LOADED with all the stuff that Jeff said and

Windows Home Server
Windows Server 2003 Enterprise
Expression Studio

I have so much new software I need to buy a new computer on which to run it all!

Thanks Jeff!

CodeMash Interview

One of the last things I did at CodeMash before winning the big prize was tag along with Jeff McWherter to interview with Chris Woodruff for the CodeMash podcast.

I spouted off some nonsense about Orson Scott Card’s Ender series and leading software developers. Give this a listen for a real laugh.

As Dustin Campbell already pointed out, there are a ton of great interviews there.

with Dustin himself.
Microsoft Regional Director, C# MVP and Jedi Master Bill Wagner
Sara Ford, CodePlex PM and author of the ultra-popular Visual Studio 2008 Tip of the Day blog series

We do get into some good discussion on “the community” some of which I posted about previously.

CodeMash 2008 is in the can

What an amazing time! I lost count of the number of deep conversations I had that actually helped me reformulate and solidify my thoughts on countless things.

Things that stand out are: predicting one of Joe O’Brien‘s answers during the pre-conference experts panel. I don’t remember the question, nor the answer. What I do remember is seeing Joe grab for the mic, and thinking “I’ll bet Joe is going to say this!”. I leaned over to Steven Bak, who I was standing next to and said so. It turned out, I was right!

When I talked to Joe the next night over drinks, we had some excellent conversation about the differences between the java, ruby and .NET communities. As a member of the .NET community – but one who follows some of the java and python happenings – this conversation with Joe and Michael Letterle really helped me solidify and rethink the .net community and some of the “alt.net” stuff.

My current thinking. The Java and most other language communities are led by the open source leaders. Contrast this to the .NET world where the community is largely led by Microsoft. IMO this alt.net movement is largely an acknowledgement of this difference. The solution is easy. We need more non-Microsoft community leadership. What do I mean by this? After all, Microsoft already has programs in place to encourage this kind of activity. Basically these are the MVPs and RDs across the US and the world. These are great programs but they are limited.

In the Java world there are no Sun-endorsed community leadership programs (are there?). In the Ruby and Python world there is no corporation, or there are many of them, but it’s the community, not some giant entity like MS or Sun, that shapes the future of those languages.

So far I haven’t really mentioned any problems. Much of the discussion on the various alt.net mailing lists seems to suggest that some problems exist. I don’t think that they do. There is plenty of room for a healthy MS-led community to coexist separately, but not independently, from the growing MS-independent community. It already exists. It already happened. When Roy Osherove first posted his “HOT and NOT” list in the context of alt.net, it could almost be mapped to a list of Microsoft-endorsed technologies and a list of “other”.

The very splendid thing about this situation is that there is plenty of room for both. So, I’ve given up caring. I’ll encourage anyone to talk about and promote anything which they feel works for them and works for others. Anything which makes us better as developers is a good thing!

Getting back to CodeMash, this is why I was very happy to be able to present CastleProject at CodeMash. I apologize to anyone who saw my talk. I thought I could pull off talking about all of Castle as a whole. Maybe one could do it, but I could not. My talk was disorganized and I rambled a bit. That said, I still think that looking at CastleProject as a whole is VERY useful.

Even if you don’t use ActiveRecord, MonoRail or Windsor (I like to call these the big three), there is probably something in Castle which you would find useful. DictionaryAdapter alone is something that any .NET developer would benefit from at least knowing about. Being able to toggle between log4net and NLog without recompiling is a single compelling reason to use castle’s logging library. I’ve used both and IMO each one has its areas of excellence, so I do see a need to be able to toggle between the two.

My brain is still whirling from all the CodeMash conversations. If I can formulate any more thoughts into anything more coherent than the above, I’ll try to do that. No camera this year. I decided against photos. I promise that no matter what anyone tells you, there was more to CodeMash than this:

The Castle talk is shaping up and I’m excited for CodeMash

I previously mentioned it here. With the time at hand it bears mentioning that CodeMash is going to be a rocking good time.

I can’t wait to hear

Dustin Campbell talk about F# right after
Dianne Marsh talks about Scala. Talk about a functional and object oriented good time!
Bill Wagner is going to be talking about implementing IQueryProvider in a talk by the same name prefixed by “LinqTo<T>”. Holy crap! I don’t know how you can fit that into a ~70min talk! I mean, I have seen my fair share of LINQ talks to the point I am tired of hearing about it. I feel like I know the topic well enough now, but a talk on making your own LINQ to <your name here>? THAT IS AWESOME!
I know I won’t be able to make it to all of ’em but the language buff in me wants to see the Groovy, and JRuby talks too.

There is way too much good content and the fact that there are five tracks means I only get to see one fifth of the total content. That is assuming I go to something every time slot, which I already know to NOT be the case.

Diving into some of the edge cases for this CodeMash CastleProject presentation has been a lot of fun. I hear about NHibernate talks the most. IMNSHO Castle’s ActiveRecord is the best way to get started using NH. I hear about any Castle talk going on at user groups or whatever even more rarely. Those are usually just MonoRail or ActiveRecord. I’m excited to dive into Windsor, Dynamic Proxy(not too deep), and even some components and services and especially validation. Those oft overlooked Castle areas are my friends.

Post-CodeMash I plan on diving into F#. I didn’t want to do it, but Dustin Campbell convinced me when he showed me F#’s pipeline operator, “|>”. As a long-time Linux and Unix shell user I do tend to think in pipes. This put me over the edge on my desire to learn F#. So in the tradition of learning a new language every year, 2008 will be the year of F# for me. (Deep inside I know it should be Erlang, so I may never forgive Dustin for what he has done to me. He has ruined my 2008 and it is still early January.)

See you at CodeMash

ASP.NET Dynamic Data is scaffolding

Hey SaJee, ASP.NET Dynamic Data CTP is cool. I agree.

It may be spelled “Dynamic Data”, but it’s pronounced scaffolding.

If you have never used Rails, I highly recommend giving it a try and using its scaffolding and then trying ASP.NET ~~Dynamic Data~~Scaffolding and coming to your own conclusions. 🙂

By the way, MonoRail has had Scaffolding a number of different ways in the past. For a while, it was entirely reflective, like Rails 1. Then that fell out of favor and a generator was used instead, like Rails 2. The current state of things (I think) is that a different generator than the original is now used, and the reflective scaffolding has been resurrected through refactoring.

Least Squares Regression with help of LINQ

I received requirements from a customer that said something about Least-Squares Regression something or another. Having not looked at anything but the most elementary statistics since around 2001, I realized that I needed to refresh myself.

I did some background reading and played with LINEST in Excel (that is a whole ball of insanity : http://support.microsoft.com/kb/828533 ). Tonight I decided to implement something in C#. Even if I don’t use it for this particular client, I will have a better understanding of what I am doing.

First, I asked my wife for help. As a biologist and geneticist she is more trained in statistics than I am. She didn’t know off the top of her head, but she pulled out her book which I then dove into.

I had the crazy idea of testing my implementation using the examples straight out of Introduction to the Practice of Statistics.

I started writing Test First.

[TestFixture]
public class LeastSquaresTests
{
    readonly double[] nonExerciseActivityIncrease = { -94, -57, -29, 135, 143, 151, 245, 
                            355, 392, 473, 486, 535, 571, 580, 620, 690 };
    readonly double[] fatGain = { 4.2, 3.0, 3.7, 2.7, 3.2, 3.6, 2.4, 
                   1.3, 3.8, 1.7, 1.6, 2.2, 1.0, 0.4, 2.3, 1.1 };
    private IEnumerable<Point> getPoints()
    {
        return from i in Enumerable.Range(0, nonExerciseActivityIncrease.Length )
               select new Point(nonExerciseActivityIncrease[i], fatGain[i]);
    }
    [Test]
    public void XYTest()
    {
        var result = getPoints().GetLeastSquares();
        Assert.AreEqual(324.8, nonExerciseActivityIncrease.Average(), 0.1);
        Assert.AreEqual(324.8, result.Xmean, 0.1);
        Assert.AreEqual(257.66, result.XstdDev, 0.01);
        Assert.AreEqual(2.388, result.Ymean, 0.001);
        Assert.AreEqual(1.1389, result.YstdDev, 0.0001);
        Assert.AreEqual(-0.7786, result.Correlation, 0.0001);
        Assert.AreEqual(-0.00344, result.Slope, 0.00001);
        Assert.AreEqual(3.505, result.Intercept, 0.001);
    }
}

My code wouldn’t compile. This is what Test First is all about. Then I implemented things.

public static class LeastSquares
{
    public static LeastSquaresResponse GetLeastSquares(this IEnumerable<System.Windows.Point> points)
    {
        var xMean = (from p in points select p.X).Average();
        var yMean = (from p in points select p.Y).Average();
        var xVariance = (from p in points select Math.Pow(p.X - xMean, 2.0)).Sum() / (points.Count() - 1);
        var yVariance = (from p in points select Math.Pow(p.Y - yMean, 2.0)).Sum() / (points.Count() - 1);
        var xStdDev = Math.Sqrt(xVariance);
        var yStdDev = Math.Sqrt(yVariance);

        var correlation = (from p in points select ((p.X - xMean) / xStdDev) * ((p.Y - yMean) / yStdDev)).Sum() / (points.Count() - 1);

        var slope = correlation * yStdDev / xStdDev;
        var intercept = yMean - slope * xMean;
 
        return new LeastSquaresResponse(xMean, 
                   yMean, 
                   xVariance,
                   yVariance, 
                   xStdDev, 
                   yStdDev, 
                   correlation, 
                   slope, 
                   intercept);
    }
}

I find the LINQ syntax to be very readable for these types of math operations. I definitely like aggregation better than using for-loops.
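For reference, the quantities the code above computes can be written out as follows. This is my transcription of what the implementation does, using the sample (n − 1) form of the standard deviation. The symbols are my own labels: n is the number of points, x̄ and ȳ are the means, s_x and s_y the standard deviations, r the correlation, b the slope, and a the intercept.

```latex
\begin{aligned}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \\
s_x &= \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2}, \qquad
s_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2} \\
r &= \frac{1}{n-1}\sum_{i=1}^{n}
     \left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right) \\
b &= r\,\frac{s_y}{s_x}, \qquad a = \bar{y} - b\,\bar{x}
\end{aligned}
```

Plugging in the rounded values from the test gives b = −0.7786 × 1.1389 / 257.66 ≈ −0.00344, and then a = 2.388 − (−0.00344)(324.8) ≈ 3.505, which matches the slope and intercept the test expects.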

I declared a type LeastSquaresResponse which returns everything that is calculated when Least-Squares is computed. This is really just for testing sake. I wanted to be able to test all the values.

For completeness, I’ve included the definition of LeastSquaresResponse. Maybe next year I will do Analysis of Variance (ANOVA), but probably not.

public class LeastSquaresResponse
{
    private readonly double xmean;
    public double Xmean
    {
        get { return xmean; }
    }
    private readonly double ymean;
    public double Ymean
    {
        get { return ymean; }
    }
    private readonly double xvariance;
    public double Xvariance
    {
        get { return xvariance; }
    }
    private readonly double yvariance;
    public double Yvariance
    {
        get { return yvariance; }
    }
    private readonly double xstdDev;
    public double XstdDev
    {
        get { return xstdDev; }
    }
    private readonly double ystdDev;
    public double YstdDev
    {
        get { return ystdDev; }
    }
    private readonly double correlation;
    public double Correlation
    {
        get { return correlation; }
    }
    private readonly double slope;
    public double Slope
    {
        get { return slope; }
    }
    private readonly double intercept;
    public double Intercept
    {
        get { return intercept; }
    }
 
    /// <summary>
    /// Initializes a new instance of the LeastSquaresResponse class.
    /// </summary>
    /// <param name="xmean"></param>
    /// <param name="ymean"></param>
    /// <param name="xvariance"></param>
    /// <param name="yvariance"></param>
    /// <param name="xstdDev"></param>
    /// <param name="ystdDev"></param>
    /// <param name="correlation"></param>
    /// <param name="slope"></param>
    /// <param name="intercept"></param>
    public LeastSquaresResponse(double xmean, double ymean, double xvariance, double yvariance, double xstdDev, double ystdDev, double correlation, double slope, double intercept)
    {
        this.xmean = xmean;
        this.ymean = ymean;
        this.xvariance = xvariance;
        this.yvariance = yvariance;
        this.xstdDev = xstdDev;
        this.ystdDev = ystdDev;
        this.correlation = correlation;
        this.slope = slope;
        this.intercept = intercept;
    }
}

LINQ Syntax Cheat Sheet

Someone over at ASP.NET Resources made up a System.Linq.Enumerable extension methods cheat sheet and called it a LINQ Standard Query Operators Cheat Sheet.

I like it, but I find Intellisense does the trick for me on IEnumerable extension methods. I do tend to forget the LINQ syntax.

I made up a LINQ Syntax Cheat Sheet.