December 2007 – Jay R. Wren – lazy dawg evarlast

ASP.NET Dynamic Data is scaffolding

Hey SaJee, ASP.NET Dynamic Data CTP is cool. I agree.

It may be spelled “Dynamic Data”, but it’s pronounced scaffolding.

If you have never used Rails, I highly recommend giving it a try and using its scaffolding, then trying ASP.NET ~~Dynamic Data~~ Scaffolding and coming to your own conclusions. 🙂

By the way, MonoRail has had Scaffolding a number of different ways in the past. For a while, it was entirely reflective, like Rails 1. Then that fell out of favor and a generator was used instead, like Rails 2. The current state of things (I think) is that a different generator than the original is now used, and the reflective scaffolding has been resurrected through refactoring.

Book Lists

Ben Carey made me do it.

http://22books.com/lists/show/92/Life

I have to say I’m a little surprised at my own list of books that changed my life. I’m also disappointed by how short it is, but I guess I just haven’t read that many life changing things.

Not life changing, but definitely entertaining are John Scalzi’s Old Man’s War and the sequel The Ghost Brigades. I finally read The Ghost Brigades last week and IMO it is every bit as good as the first.

Currently, I am reading Orson Scott Card’s Shadow of the Hegemon, Napoleon Hill’s Think and Grow Rich and Marc Weissbluth’s Healthy Sleep Habits, Happy Child (I swear I’ll get this back to Dianne sooner or later.)

Least Squares Regression with help of LINQ

I received requirements from a customer that said something about Least-Squares Regression something or another. Having not looked at anything but the most elementary statistics since around 2001, I realized that I needed to refresh myself.

I did some background reading and played with LINEST in Excel (which is a whole ball of insanity: http://support.microsoft.com/kb/828533). Tonight I decided to implement something in C#. Even if I don’t use it for this particular client, I will have a better understanding of what I am doing.

First, I asked my wife for help. As a biologist and geneticist she is more trained in statistics than I am. She didn’t know off the top of her head, but she pulled out her book which I then dove into.

I had the crazy idea of testing my implementation using the examples straight out of Introduction to the Practice of Statistics.

I started writing Test First.

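using System.Collections.Generic;
using System.Linq;
using System.Windows;   // Point
using NUnit.Framework;

// Data and expected values are taken from the textbook example.
[TestFixture]
public class LeastSquaresTests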
{
    readonly double[] nonExerciseActivityIncrease = { -94, -57, -29, 135, 143, 151, 245, 
                            355, 392, 473, 486, 535, 571, 580, 620, 690 };
    readonly double[] fatGain = { 4.2, 3.0, 3.7, 2.7, 3.2, 3.6, 2.4, 
                   1.3, 3.8, 1.7, 1.6, 2.2, 1.0, 0.4, 2.3, 1.1 };
    private IEnumerable<Point> getPoints()
    {
        return from i in Enumerable.Range(0, nonExerciseActivityIncrease.Length )
               select new Point(nonExerciseActivityIncrease[i], fatGain[i]);
    }
    [Test]
    public void XYTest()
    {
        var result = getPoints().GetLeastSquares();
        Assert.AreEqual(324.8, nonExerciseActivityIncrease.Average(), 0.1);
        Assert.AreEqual(324.8, result.Xmean, 0.1);
        Assert.AreEqual(257.66, result.XstdDev, 0.01);
        Assert.AreEqual(2.388, result.Ymean, 0.001);
        Assert.AreEqual(1.1389, result.YstdDev, 0.0001);
        Assert.AreEqual(-0.7786, result.Correlation, 0.0001);
        Assert.AreEqual(-0.00344, result.Slope, 0.00001);
        Assert.AreEqual(3.505, result.Intercept, 0.001);
    }
}

My code wouldn’t compile. This is what Test First is all about. Then I implemented things.

using System;
using System.Collections.Generic;
using System.Linq;

public static class LeastSquares
{
    public static LeastSquaresResponse GetLeastSquares(this IEnumerable<System.Windows.Point> points)
    {
        var xMean = (from p in points select p.X).Average();
        var yMean = (from p in points select p.Y).Average();
        // Sample variance: divide by n - 1 rather than n.
        var xVariance = (from p in points select Math.Pow(p.X - xMean, 2.0)).Sum() / (points.Count() - 1);
        var yVariance = (from p in points select Math.Pow(p.Y - yMean, 2.0)).Sum() / (points.Count() - 1);
        var xStdDev = Math.Sqrt(xVariance);
        var yStdDev = Math.Sqrt(yVariance);
 
        // Correlation: sum of products of the standardized x and y values, over n - 1.
        var correlation = (from p in points select ((p.X - xMean) / xStdDev) * ((p.Y - yMean) / yStdDev)).Sum() / (points.Count() - 1);
 
        // Regression line: y = intercept + slope * x.
        var slope = correlation * yStdDev / xStdDev;
        var intercept = yMean - slope * xMean;
 
        return new LeastSquaresResponse(xMean, 
                   yMean, 
                   xVariance,
                   yVariance, 
                   xStdDev, 
                   yStdDev, 
                   correlation, 
                   slope, 
                   intercept);
    }
}
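
For reference, the quantities computed above correspond to the formulas below. This is a sketch in my own notation: n for the number of points, and b_1 and b_0 for slope and intercept are symbol names I am assuming here, not ones taken from the code or the textbook.

```latex
\begin{align*}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i, &
\bar{y} &= \frac{1}{n}\sum_{i=1}^{n} y_i \\
s_x^2 &= \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2, &
s_y^2 &= \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2 \\
r &= \frac{1}{n-1}\sum_{i=1}^{n}
     \left(\frac{x_i - \bar{x}}{s_x}\right)
     \left(\frac{y_i - \bar{y}}{s_y}\right) \\
b_1 &= r\,\frac{s_y}{s_x}, &
b_0 &= \bar{y} - b_1\,\bar{x}
\end{align*}
```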

I find the LINQ syntax to be very readable for these types of math operations. I definitely like aggregation better than using for-loops.

I declared a type LeastSquaresResponse which returns everything that is calculated when Least-Squares is computed. This is really just for testing’s sake. I wanted to be able to test all the values.

For completeness, I’ve included the definition of LeastSquaresResponse. Maybe next year I will do Analysis of Variance (ANOVA), but probably not.

public class LeastSquaresResponse
{
    private readonly double xmean;
    public double Xmean
    {
        get { return xmean; }
    }
    private readonly double ymean;
    public double Ymean
    {
        get { return ymean; }
    }
    private readonly double xvariance;
    public double Xvariance
    {
        get { return xvariance; }
    }
    private readonly double yvariance;
    public double Yvariance
    {
        get { return yvariance; }
    }
    private readonly double xstdDev;
    public double XstdDev
    {
        get { return xstdDev; }
    }
    private readonly double ystdDev;
    public double YstdDev
    {
        get { return ystdDev; }
    }
    private readonly double correlation;
    public double Correlation
    {
        get { return correlation; }
    }
    private readonly double slope;
    public double Slope
    {
        get { return slope; }
    }
    private readonly double intercept;
    public double Intercept
    {
        get { return intercept; }
    }
 
    /// <summary>
    /// Initializes a new instance of the LeastSquaresResponse class.
    /// </summary>
    /// <param name="xmean"></param>
    /// <param name="ymean"></param>
    /// <param name="xvariance"></param>
    /// <param name="yvariance"></param>
    /// <param name="xstdDev"></param>
    /// <param name="ystdDev"></param>
    /// <param name="correlation"></param>
    /// <param name="slope"></param>
    /// <param name="intercept"></param>
    public LeastSquaresResponse(double xmean, double ymean, double xvariance, double yvariance, double xstdDev, double ystdDev, double correlation, double slope, double intercept)
    {
        this.xmean = xmean;
        this.ymean = ymean;
        this.xvariance = xvariance;
        this.yvariance = yvariance;
        this.xstdDev = xstdDev;
        this.ystdDev = ystdDev;
        this.correlation = correlation;
        this.slope = slope;
        this.intercept = intercept;
    }
}

LINQ Syntax Cheat Sheet

Someone over at ASP.NET Resources made up a System.Linq.Enumerable extension methods cheat sheet and called it a LINQ Standard Query Operators Cheat Sheet.

I like it, but I find Intellisense does the trick for me on IEnumerable extension methods. What I do tend to forget is the LINQ query syntax.

I made up a LINQ Syntax Cheat Sheet.

VS2k8 and VS2k5 so close… so far away…

Expanding on some thoughts which I put in reply to http://igloocoder.com/archive/2007/12/12/opening-vs2k5-.net-2.0-projects-in-vs2k8.aspx

The issue isn’t that Visual Studio 2008 (VS2k8) has “sweet compiler sugar that allows you to use the new language features in .NET 2.0, 3.0 or 3.5”.

VS2k8 is ALWAYS using the C# 3.0 compiler. PERIOD. done.* C# 3.0 code ALWAYS compiles down to IL which will run on the 2.0 runtime (there is no newer runtime).

So the “2.0, 3.0, 3.5” combo box is really just a filter for what assemblies you reference. Have you referenced WPF? Then studio (msbuild) will warn you that this is not a 2.0 target, and to use 3.0. Trying to use LINQ or something else from System.Core? You will get a warning to use 3.5.

I’ve done this, and I was (like the igloo coder) a little surprised at what I saw. Then I went and clicked “3.0” and things worked. But it was just an assembly reference that did it. It is nice if you have to deploy to Windows 2000 servers or workstations which only have 2.0 runtime and cannot move to 3.0, or if you target XP machines which can’t install 3.0 or 3.5. But it isn’t the ultimate in 2005 to 2008 migration for which some of us were hoping.

What is a bummer is that all the nice language features such as auto properties and use of ‘var’ will not ever work in VS2k5. So even though the project format is portable, it’s a bit useless. This bit me too. I had left a project set at 2.0, and I just started coding in 2008. That damn ‘var’ implicit typing just saves on keystrokes so much that I use the heck out of it. The next day I opened up the 2k5 solution, which opened the same project, and now my project didn’t compile. OOPS. Luckily I hadn’t defined any new types using the nice property syntax. I would have cried given all the extra typing I had to do. My fingers hurt from this rant.
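
To make that concrete, here is a small sketch of code that the C# 3.0 compiler should accept while the project is set to 2.0, since it references nothing beyond the core library, but that the VS2k5 compiler rejects. The Customer and TargetingDemo names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    // C# 3.0 auto-implemented property; C# 2.0 needs an explicit backing field.
    public string Name { get; set; }
}

public static class TargetingDemo
{
    public static List<string> Names(IEnumerable<Customer> customers)
    {
        // 'var' is resolved at compile time; the emitted IL uses List<string> directly.
        var names = new List<string>();
        foreach (var c in customers)
        {
            names.Add(c.Name);
        }
        return names;
    }

    public static void Main()
    {
        // Object and collection initializers are C# 3.0 syntax as well.
        var customers = new List<Customer> { new Customer { Name = "Ada" } };
        Console.WriteLine(Names(customers)[0]);
    }
}
```

None of these features need a newer runtime, which is why the 2.0 setting does not stop them.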

* Ok, so ASP.NET can be told which compiler to use, but arguably that isn’t VS doing the compile then, it is ASP.NET doing the compile. Maybe there is an MSBuild target which I can specify which will let me use the old 2.0 compiler so that my use of ext methods, var and property syntax is caught?
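
One candidate answer, as far as I can tell, is the compiler’s language version setting rather than a separate target. A sketch, assuming the C# project file honors a LangVersion property and that ISO-2 is the value that restricts the compiler to C# 2.0 syntax:

```xml
<!-- Assumption: placed in the .csproj; ISO-2 asks csc to accept only C# 2.0 syntax. -->
<PropertyGroup>
  <LangVersion>ISO-2</LangVersion>
</PropertyGroup>
```

With that in place I would expect ‘var’, auto properties and extension methods to be flagged as errors in VS2k8 too, though I have not checked every feature.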

David Siegel, will you be my new lover?

Rick Harding just linked me to this post that pretty much sums up the way I feel about Mono.

http://blog.davebsd.com/2007/12/08/early-criticism-of-gnome-do/

I don’t make the choices that David makes regarding free software. I’ve been using Linux for over 12 years now. I like it. It does a lot of things really great. I run Windows Vista on my laptop. I like it. It does a lot of things really great.

I write code in C# and boo for the .NET platform because I feel that .NET is the best development platform in existence today. This means that in the Linux world, Mono is the best development platform in existence today. My choices are purely personal based on my understanding of the underlying technology. If something were to come out which gave me compelling reason to change my mind, I would develop using that other thing.

That other thing is not Ruby, Python, D or any variants. That other thing doesn’t run on the JVM.

I’m going to go double check some boo code which I run using Mono now. Thanks.

The Good Stuff Is Hidden

Intellisense is great, but it is no substitute for reading documentation. I’m always upset when I hear developers say that their primary source of learning is looking at what intellisense offers them.

There are a few methods which can be considered part of LINQ, which aren’t extension methods. This means that unless you actually type out Enumerable, you will never see them in intellisense. The nature of extension methods means that you won’t ever type out Enumerable, so these methods are less discoverable than others.

System.Linq.Enumerable is where the library implementation of LINQ to Objects lives. Its Empty, Range and Repeat methods are handy tools.

Empty is useful for those cases when you don’t want a null object, but instantiating a new collection is wasteful. For example, creating a new empty List of string to pass to a LINQ expression when some other data source may return null or empty. I’m thinking of combining LINQ expressions here. The anti-wasteful programmer might say “well, just use a new empty string array”. And that programmer would be right. This would use less memory than a new List. But the really nice part about Empty is that its underlying implementation is just a generic singleton based on type. That means instead of 1000 or 100,000 “new string[0]” in your code, eating up a tiny bit of RAM (ok, it’s pretty small here, I know, but doesn’t every little bit help?) you have only one empty string array that all cases can share.
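
Here is a minimal sketch of that null-replacement idea. LoadNames and OrEmpty are hypothetical helpers invented for the example, not part of the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmptyDemo
{
    // Hypothetical data source that may hand back null instead of an empty sequence.
    public static IEnumerable<string> LoadNames(bool available)
    {
        return available ? new[] { "Ada", "Linus" } : null;
    }

    // Swap null for the shared empty sequence so later operators never see null.
    public static IEnumerable<string> OrEmpty(IEnumerable<string> source)
    {
        return source ?? Enumerable.Empty<string>();
    }

    public static void Main()
    {
        // Combining two sources, one of which returned null.
        var all = OrEmpty(LoadNames(true)).Concat(OrEmpty(LoadNames(false)));
        Console.WriteLine(all.Count());

        // Compares the instances returned by two separate calls to Empty.
        Console.WriteLine(ReferenceEquals(Enumerable.Empty<string>(), Enumerable.Empty<string>()));
    }
}
```

The second line of output is the point of the paragraph above: it shows whether repeated calls hand back one cached object instead of allocating a new one each time.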

Range should be familiar to anyone coming from Python. It returns an enumerable run of consecutive integers; note that its arguments are a start value and a count, not a start and an end. IMO it isn’t as nice as Python’s, but I can always write my own. Still not clear? Here is how you would write a summation of 1/i for i from one to one hundred million, where each step adds 1/i to the running total.

var x = Enumerable.Range(1, 100000000).Aggregate(0f, (a, b) => (float)(a + 1 / (float)b));

That is: H(n) = 1/1 + 1/2 + ... + 1/n, with n = 100,000,000.
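
A smaller sketch of the same idea, easier to check by hand. RangeDemo and Harmonic are names I made up for the example. It accumulates in double rather than float because, as I understand it, a single-precision running total stops absorbing very small terms once it has grown large enough.

```csharp
using System;
using System.Linq;

public static class RangeDemo
{
    // Sum of 1/i for i = 1..n, accumulated in double to limit rounding error.
    public static double Harmonic(int n)
    {
        return Enumerable.Range(1, n).Aggregate(0.0, (sum, i) => sum + 1.0 / i);
    }

    public static void Main()
    {
        // The second argument is a count, not an end value: this yields 5, 6, 7.
        var three = Enumerable.Range(5, 3).Select(i => i.ToString()).ToArray();
        Console.WriteLine(string.Join(", ", three));

        // 1 + 1/2 + 1/3 + 1/4
        Console.WriteLine(Harmonic(4));
    }
}
```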

Finally, Repeat is similar to Range: it gives an enumerable which yields the same object/value a given number of times.

Example of using repeat to make 5 hello worlds:

var fiveWorlds = Enumerable.Repeat("Hello World", 5);
Console.WriteLine(fiveWorlds.Aggregate((a, b) => a + "\n" + b));