lessCode.net
Wednesday, March 30, 2011

PROCESSOR_ARCHITECTURE and Visual Studio Debugging

For years I’ve labored under the impression that the PROCESSOR_ARCHITECTURE environment variable was a reliable indication of the bitness of the current process. I’d long since absorbed the fact that in a 64-bit process this variable yields AMD64 even when the machine has an Intel processor (because AMD invented the 64-bit extensions), and I’ve written several components over the years that rely on PROCESSOR_ARCHITECTURE without ever giving it much thought. I mean, this has to be the simplest and most reliable way to do it, right?

However, a colleague recently asked me if I had any idea why he couldn’t debug an assembly in Visual Studio. He was getting a BadImageFormatException, a surefire indicator that something was trying to load a 64-bit binary into a 32-bit process (or vice-versa). The code he was trying to debug in this case was an add-in to Excel, so he’d configured the project in Visual Studio to “Start external program”, and pointed to Excel, but Excel wouldn’t start when hitting F5, and the logs revealed the exception. Starting Excel directly with the add-in configured worked just fine.

As it turned out, we were using PROCESSOR_ARCHITECTURE in this assembly to determine which version to load (x86 or x64) of a lower-level native dependency. In production, this would work just fine, but when debugging from Visual Studio, the x64 version of the native component was being loaded, which was not making the x86 Excel process very happy at all.
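The pattern in that component boiled down to something like the sketch below. The folder layout, DLL name, and loader class here are hypothetical, just to illustrate the idea:

using System;
using System.IO;
using System.Runtime.InteropServices;

static class NativeLoader {
    [DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern IntPtr LoadLibrary(string fileName);

    // Pick the x86 or x64 copy of a native dependency based on
    // PROCESSOR_ARCHITECTURE, the approach that backfired under the debugger.
    public static void LoadNativeDependency(string baseDirectory) {
        string arch = Environment.GetEnvironmentVariable("PROCESSOR_ARCHITECTURE");
        string subFolder = string.Equals(arch, "AMD64", StringComparison.OrdinalIgnoreCase) ? "x64" : "x86";
        string path = Path.Combine(baseDirectory, subFolder, "NativeDependency.dll");

        if (LoadLibrary(path) == IntPtr.Zero) {
            throw new InvalidOperationException("Failed to load " + path);
        }
    }
}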

To simplify the diagnosis, I created a new console application in Visual Studio, set the project to debug into C:\Windows\SYSWOW64\cmd.exe (the 32-bit command prompt), and hit F5. Sure enough, from the resulting shell, echo %PROCESSOR_ARCHITECTURE% was yielding AMD64, in a 32-bit process! Something was clearly not quite right with this picture.

As it transpires, when debugging this way in Visual Studio, the external program starts in an environment configured for the wider of two bitnesses: that of the startup project’s current build platform, and that of the executable being launched. In my simple console app, if I set the platform to x86 instead of the default Any CPU, the resulting command prompt yielded x86 instead of AMD64 for PROCESSOR_ARCHITECTURE. But if I instead debugged into C:\Windows\System32\cmd.exe, PROCESSOR_ARCHITECTURE yielded AMD64 regardless of the project’s current build platform.

Since the code in question here was in a managed assembly, I switched the PROCESSOR_ARCHITECTURE check to instead consider the value of IntPtr.Size (4 indicates x86, 8 indicates x64). Hopefully when we go to .NET 4.0 we’ll be able to take advantage of Environment.Is64BitProcess and Environment.Is64BitOperatingSystem for this kind of thing, at least from managed code.
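Reduced to its essence, the replacement check is just this (a minimal sketch):

using System;

static class ProcessBitness {
    // The size of a pointer reflects the bitness of the current process,
    // independent of what the environment claims.
    public static bool Is64Bit {
        get { return IntPtr.Size == 8; }   // 4 = x86, 8 = x64
    }

    // On .NET 4.0 and later the framework exposes this directly:
    //   Environment.Is64BitProcess
    //   Environment.Is64BitOperatingSystem
}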

Sunday, February 6, 2011

Windows Home Server RIP?

I’ve been running an HP MediaSmart Windows Home Server for a while now, and I’ve actually found it quite useful. There are many PCs in my house (laptops, desktops, home theater PC and media center), and they’ve all been hooked up to the home server, saving me a bunch of pain and hassle.

The PCs back up every day – the whole drive. I can restore a PC from bare metal very quickly (I’ve used this feature on two occasions when hard drives have died). I’m told that this integrates with Time Machine on the Mac, too, but I don’t have a Mac.

Anti-virus on all PCs and the server stays up-to-date automatically thanks to the Avast! anti-virus suite, which has a version specifically for WHS.

The server itself hosts network shares for user files that are duplicated across multiple physical drives in the server.

Adding and removing drives on-the-fly is simple.

WHS has a plugin model that allows me to add Amazon S3 offsite storage as an additional backup for the shares.

I have a ton of media stored on the server, and it’s all streamable to the PCs and a few XBox 360 consoles (in Media Center Extender mode).

All of this is available remotely when I’m on the road via a secure web site.

Generally, I think this is one of the best products Microsoft has ever shipped, but several things have happened recently to make me think that the ride is over.

First, Microsoft has decided to kill the Drive Extender technology that’s at the heart of the storage and redundancy engine of WHS. This technology has a checkered history – it was destined to become the storage mechanism for all Windows Server versions, but exhibited some nasty bugs early in its life that caused data corruption. Those problems were fixed, and in my experience the technology has been very solid since then, but apparently more problems appeared during testing with several corporate server software products, and Microsoft has decided to take a different path. It looks like the next version of WHS will have to interoperate with a Drobo-like hardware solution instead.

Second, I was recently “volunteered” to give an internal presentation on Windows 7 Explorer, and during my research I discovered that the new Libraries system in Windows 7 doesn’t fully work with Windows Home Server. Specifically, a Documents library, when asked to “Arrange by Type”, doesn’t include library locations on a Windows Home Server. The location is indexed (WHS Power Pack 3 is installed, Windows Search is installed and running), but the files on the WHS location simply don’t show up in the “Arrange by Type” view (and, curiously, only in this view, it seems). Indexed files on a traditional network share on a standard Windows Server 2003 machine show up just fine in this library view, so I think this is a WHS problem.

Third, I also noticed that files on a regular network share that I’d deleted months ago still appeared in the library in some view modes, but if I tried to open them I got a “File Not Found” error. To be fair to WHS, this appears to be a problem with Windows 7 Libraries.

All of this fails to give me a warm and fuzzy feeling about the future of Windows Home Server.

Wednesday, November 24, 2010

Extending the Gallio Automation Platform

I’ve been doing a lot recently with Gallio, and I have to say I like it. Gallio is an open source project that bills itself as an automation framework, but its most compelling use by far is as a unit testing platform. It supports every unit test framework that I’ve ever heard of (including the test framework that spawned it, MbUnit, which is now just another extension to Gallio), and it has great tooling hooks that provide the flexibility to run tests from many different environments.

Gallio came onto my radar when my company had to choose whether to enhance or replace an in-house testing platform written back in the days when the unit testing tools available were not all that sophisticated. The application had evolved to become quite feature-rich, but the user interfaces (a Windows Forms application and a console variant) needed a makeover, and there were two schools of thought: invest resources in further enhancing and maintaining an internal tool, or revisit the external options currently available to us.

We use NHibernate as a persistence layer, and have been able to benefit greatly from taking advantage of the improvements made to that open-source product, so I was keen to see if we could lean on the wider community again for our new unit testing framework.

I’d had a fair amount of experience with MSTest in a former life, but quickly dismissed it as an option when comparing it against the feature sets available in the latest versions of NUnit and MbUnit. Our in-house framework was clearly influenced by an early version of NUnit, and made significant use of parameterized tests, many of which were quite complex. It seemed that data-driven testing in MSTest hadn’t really progressed beyond the ADO.NET DataSource, whereas frameworks like MbUnit had begun to provide quite powerful, generative capabilities for parameterizing test fixtures and test cases, as the sketch below illustrates.
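For a flavor of what that looks like, here’s a minimal MbUnit-style sketch of a parameterized test (the arithmetic under test is just a stand-in, not one of our real fixtures):

using MbUnit.Framework;

[TestFixture]
public class AdditionTests {
    // Each [Row] generates a separate test case with its own arguments.
    [Test]
    [Row(1, 2, 3)]
    [Row(-1, 1, 0)]
    [Row(0, 0, 0)]
    public void Add_ReturnsExpectedSum(int a, int b, int expected) {
        Assert.AreEqual(expected, a + b);
    }
}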

So one option on the table was to port (standardize, in reality) all of our existing tests to NUnit or MbUnit tests. This would allow us to run the tests in a number of ways, instead of just via the two tools we had built around our own framework. With my developer hat on, I really like the ability to run one or more tests directly from Visual Studio in ReSharper, and for the buildmaster in me, running and tracking those same tests from a continuous integration server like TeamCity is also important. We had neither of these capabilities with our existing platform.

Another factor in our deliberations became the feature set of the UI for our in-house tool. We had some feature requirements here that we would have to take with us going forward, so if we weren’t going to continue to maintain ours, we needed to find a replacement test runner UI that could be extended.

Enter Icarus, itself another extension to Gallio, which provides a great Windows Forms UI that, all by itself, does everything you might need a standard test runner application to do. But that’s only the beginning: adding your own functionality to the UI is actually quite easy to achieve. I was able to add an entire test pane with just a handful of source files and a few dozen lines of code, and we were up and running with a couple of core features we needed from our new test runner (and this was an ElementHost-ed WPF panel, at that).

And that’s not the end of the story. Even if we wanted to use Gallio/Icarus going forward, we were still faced with the prospect of porting all of our existing unit tests to one of the many frameworks supported by Gallio (with NUnit and MbUnit being the two favourites). We really didn’t want to do this, and would probably have lived with a bifurcated testing architecture in which the existing tests stayed with our internal framework and any new tests were built for NUnit or MbUnit. This would have been less than ideal, but it probably would still have been worthwhile in order to avoid maintaining our own tools while watching the third-party tools advance without us.

As it turns out, we didn’t need to make that choice, because adding a whole new test runner framework to Gallio is as easy as extending the Icarus UI. By shamelessly cribbing from the existing Gallio adapters for NUnit and MbUnit, we were able to reuse significant parts of our in-house framework, build a new custom adapter around those, and run all of our existing unit tests alongside new NUnit tests in both the Icarus UI and the Gallio Echo command-line test runners. As an added bonus, since Gallio is also supported by ReSharper, we were now able to run our old tests directly from within Visual Studio, for free, something we had not been able to do with our platform. It took about two days to complete all of the custom adapter work.

I’m quite optimistic that we’ll be able to really enhance our unit testing practices by leveraging Gallio, and without the effort it would take to maintain a lot of complex internal code. The extensibility of Gallio and Icarus is really quite phenomenal – kudos to all those responsible.

Saturday, May 8, 2010

Did the Entity Framework team do this by design?

I’ve been playing around quite a bit recently with Entity Framework, in the context of a WCF RIA Services Silverlight application, and just today stumbled upon a quite elegant solution to a performance issue I had feared would not be easy to solve without writing a bunch of code.

The application in question is based around a SQL Server database which contains a large number of high-resolution images stored in binary form, which are associated with other entities via a foreign key relationship, like this:

[Entity diagram: Thing and Image entities related by a foreign key]

Pretty straightforward. Each Thing can have zero or many Image entities associated with it. Now, let’s say we want to present a list of Thing entities to the user. With a standard WCF RIA Services domain service, we might implement the query method like this:

public IQueryable<Thing> GetThings() {
    return ObjectContext.Things.Include("Images").OrderBy(t => t.Title);
}

 

Unfortunately, this query will perform quite poorly if there are many Things referencing many large Images, because all the Images for all the Things will cross the wire down to the client. When I try this for a database containing a single Thing with four low-resolution Images, Fiddler says the following about the query:

[Fiddler screenshot: the GetThings response includes all of the image data for the four Images]

If we had a large number of Thing entities, and the user never navigated to those entities to view their images, we’d be transferring a lot of images only to discard them, unviewed. If we leave the Include("Images") extension out of the query, we won’t transfer the image data, but then the client will not be aware that there are in fact any Images associated with the Things, and we’d have to make subsequent queries back to the service to retrieve the image data separately.

What we’d like to be able to do is include a collection of the image Ids in the query results that go to the client, but leave out the actual image bytes. Then, we can write a simple HttpHandler that’s capable of pulling a single image out of the database and serving it up as an image resource. At the same time we can also instruct the browser to cache these image resources, which will even further reduce our bandwidth consumption. Here’s what that handler might look like:

using System;
using System.ComponentModel;
using System.Drawing;
using System.Drawing.Imaging;
using System.Linq;
using System.Web;

public class ImageHandler : IHttpHandler {
    #region IHttpHandler Members

    public bool IsReusable {
        get { return true; }
    }

    public void ProcessRequest(HttpContext context) {
        Int32 id;

        if (context.Request.QueryString["id"] != null) {
            id = Convert.ToInt32(context.Request.QueryString["id"]);
        }
        else {
            throw new ArgumentException("No id specified");
        }

        using (Bitmap bmp = ConvertToBitmap(GetImageBytes(id))) {
            if (bmp == null) {
                // No image data stored for this id.
                context.Response.StatusCode = 404;
                return;
            }

            // Ask the browser to cache the image for a month to cut down on
            // repeat requests for the same resource.
            context.Response.Cache.SetValidUntilExpires(true);
            context.Response.Cache.SetExpires(DateTime.Now.AddMonths(1));
            context.Response.Cache.SetCacheability(HttpCacheability.Public);
            bmp.Save(context.Response.OutputStream, ImageFormat.Jpeg);
        }
    }

    #endregion

    private Bitmap ConvertToBitmap(byte[] bmp) {
        if (bmp != null) {
            TypeConverter tc = TypeDescriptor.GetConverter(typeof(Bitmap));
            return (Bitmap) tc.ConvertFrom(bmp);
        }
        return null;
    }

    private byte[] GetImageBytes(Int32 id) {
        // Load just this one image's bytes via Entity Framework on the server.
        using (var entities = new Entities()) {
            return entities.Images.Single(i => i.Id == id).Data;
        }
    }
}
Note that the handler queries Entity Framework on the server side to load the image bytes from the database, given the image Id that comes from the URL’s query string.
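On the Silverlight client, displaying one of these images then just means pointing an Image control at the handler rather than at entity data. Assuming the handler is registered at a relative path like ImageHandler.ashx (the path is an assumption; it depends on how you register the handler), that might look roughly like this:

using System;
using System.Windows.Controls;
using System.Windows.Media.Imaging;

public static class ImageDisplay {
    // Let the handler (and the browser cache) serve the bytes; the client
    // only ever needs the image Id.
    public static void ShowImage(Image imageControl, int imageId) {
        var uri = new Uri("ImageHandler.ashx?id=" + imageId, UriKind.Relative);
        imageControl.Source = new BitmapImage(uri);
    }
}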
 
So, back to the real problem. How do we avoid sending the image bytes back to the client when RIA Services queries for the Things and requests that the Images be included?
 
One way to achieve this would be to remove the Data property from the Image entity in the entity model. This won’t, of course, affect the database, but since there would then be no way to access the image bytes, an Image would consist only of an Id. However, this means we’d have to change our handler’s GetImageBytes method to retrieve the image from the database with lower-level database calls, bypassing Entity Framework.

It seems like there’s no clean way to achieve what we want, but in fact there is. If you look at the settings available for the Data property in the entity designer, you can see that its accessibility can be changed:

 

[Screenshot: the entity designer property window for the Data property, showing its accessibility setting]

By default entity properties are Public, and RIA Services will dutifully serialize Public properties for us. But if we change the accessibility to Internal, RIA Services chooses not to do so, which makes sense. Since the property is Internal, it’s still visible to every class in the same assembly. Therefore, as long as our ImageHandler is part of the same project/application as the entity model and the domain service, it will still have access to the image bytes via Entity Framework, and the code above will work unmodified.
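Conceptually (the generated code is more elaborate, with change-tracking plumbing, so this is only a rough sketch of the shape), the effect on the generated entity boils down to nothing more than an access modifier:

public class Image {
    public int Id { get; set; }

    // Marked Internal in the designer: still reachable from the ImageHandler
    // in the same assembly, but no longer exposed to RIA Services clients.
    internal byte[] Data { get; set; }
}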

After making this small change in the property editor (and regenerating the domain service), on the client side we no longer see a byte[] as a member of the Image entity:

[Screenshot: the regenerated client-side Image entity, which no longer exposes a Data member]

When I now run my single-entity-with-four-images example, Fiddler gives us a much better result:

[Fiddler screenshot: the GetThings response no longer contains the image data]

Friday, January 8, 2010

The Big Bang Development Model

Last summer, on the first hot day we had (there weren’t that many hot days last year in New York), I turned on my air conditioner to find that although the outdoor compressor unit and the indoor air handler both appeared to be working (fans spinning), there was no cold air to be felt anywhere in the house. We bought the house about three years ago, and at that time the outdoor unit was relatively new (maybe five years old). In the non-summer seasons we've spent in the house, we've always made sure to keep the compressor covered up so that rain, leaves and critters don't foul up the works, and I've even opened it up a couple of times to oil the fan and generally clean out whatever crud did accumulate in there, so I was surprised that the thing didn't last longer than eight years or so.

When the HVAC engineer came to check it out, he found that the local fuse that was installed inline with the compressor was not correctly rated (the original installer had chosen a 60A fuse; the compressor was rated at 40A), and that the compressor circuitry had burned out as a result. For the sake of the correct $5 part eight years ago, a new $4000 compressor was now required.

So this winter, when I turned up the thermostat on the first cold day and there was no heat, I readied the checkbook again. This furnace was the original equipment installed when the house was built in the ‘60s, so I felt sure that I’d need a replacement furnace. I was pleasantly surprised when this time the HVAC guy told me that a simple inexpensive part needed to be replaced (the flame sensor that shuts off the gas if the pilot light goes out). This was a great, modular design for a device that ensured that a full refit or replacement wasn’t required when a single component failed.

It occurred to me that I'd seen analogs of these two stories play out on software projects I've been involved with over the years. A single expedient choice (or a confluence of several such choices), each seemingly innocuous at the time, can turn into monstrous, expensive maintenance nightmares. Ward Cunningham originally equated this effect with that of (technical) debt [http://en.wikipedia.org/wiki/Technical_debt].

What makes the situation worse for software projects over the air conditioner analogy is that continuous change over the life of a software project offers more and more opportunity for such bad choices, and each change becomes more and more expensive as the system becomes more brittle, until all change becomes prohibitively expensive and the system is mothballed, its replacement is commissioned and a whole new expensive development project is begun. I think of this as a kind of Big Bang Development Model, and in my experience this has been the standard model for the finance industry.

For an in-house development shop, an argument can be made that this might not be so bad, although I wouldn't be one to make it – its success is highly dependent on your ability to retain the staff who “know where the bodies are buried”, which in turn is directly proportional to remuneration. If you're a vendor, Big Bang should not be an option - you need to hope that there's no other vendor waiting in the wings when your Big Bang happens.

Of course, today we try to mitigate the impact of bad choices with a combination of unit testing, iterative refactoring and abstraction, but all of this requires management vision, discipline, good governance and three or four of the right people (as opposed to several dozens or hundreds of the wrong ones). Those modern software engineering tools are also co-dependent: effective unit testing requires appropriate abstraction; fearless refactoring requires broad, automated testing; sensible abstractions can usually be refactored more easily than inappropriate ones when the need arises to improve them.

I'm not going to "refactor" my new $4000 compressor, but you can bet that I am going to use a $10 cover and a $5 can of oil.