Welcome to WindowsClient.net | Sign in | Join

June 2008 - Posts

So I was reading Angry Hacker's 10 reasons why SQL 2008 is going to rock and one of the reasons was compression, and one of the things about compression was that:

For instance, length of the varchar will be stored in 3 bits.

That bothered me. 3 bits represent a value from 0 to 7. How can that store all the possible lengths of a varchar, up to 8192? I kind of had the problem running in background all last night (at the same time as learning Ruby and watching the BBC's Bill Gates documentary; who says men can't multitask?). Finally, this morning, I came up with a theory: it's a lookup table.

Looking at the database I primarily work with, we only actually use a handful of different varchar sizes. 80% of the text columns are varchar(254), due to a limitation of a legacy development system's ODBC implementation. There are a few 50s where people probably just forgot to change the default in the table designer. For some reason there are a lot of 40s. And there are a bunch of isolated odd lengths, which I'll probably do something about if purchasing approve my request for VSTS Database Edition. Point is, >95% of the columns are covered by the top 7 values. So those 7 values could be held in an int[7] somewhere, and the actual metadata stored as the index to that table. The few extraordinary values are stored as regular ints (you use the remaining 3-bit value to indicate that happening).

Of course, that's just my database. I don't know what other people's databases are like. But Microsoft do. Microsoft have got the Customer Experience Improvement Program. I'd always wondered about the kind of information that they collected; I'm guessing that distributions of data types and lengths is in there somewhere. They've got enough information about enough usage patterns to be able to do stuff like this. So now I'm thinking about adding my own CEIP to my company's applications, just to see if there are any obvious optimisations waiting to stare me in the face.

Disclaimer: as I say, this is purely guesswork. But if anyone from the SQL team reads this, maybe you could let me know whether I'm warm or freezing?

Jared Parsons posted about using enums as a simplified version of the adapter pattern. In his code, each time the SomeAction method is called, it runs a switch against the private enum value and calls on to the relevant method.

Since C# 3.0 came along, with its lambdas and functional-style abilities, I've gotten used to thinking of methods as variables, which presents another way to do this sort of thing. Rather than having a public method, you can expose a delegate through a public property, and set the delegate according to the required behaviour. Using this method, Jared's example could look like this:

class Example
{
    public Example(Kind kind)
    {
        switch (kind)
        {
            case Kind.Kind1:
                this.SomeAction = ActionForKind1;
                break;
            case Kind.Kind2:
                this.SomeAction = ActionForKind2;
                break;
            case Kind.Kind3:
                this.SomeAction = ActionForKind3;
                break;
            default:
                throw new InvalidOperationException("Invalid Kind");
        }
    }

    public Func<int> SomeAction { get; private set; }

    private int ActionForKind1() { return -1; }

    private int ActionForKind2() { return 0; }

    private int ActionForKind3() { return 1; }
}

I think that's just as readable. This technique can be used for more dynamic behaviour switching. Next time, I'll post a very simple technique for creating a delegate wrapper for long-running deterministic functions which caches in memory. Once you've got that, you can use this delegate property technique for switching between the cached and non-cached versions.

 
Page view counter