Monday, October 26, 2009 
DevDays Boston photos

I spent last Wednesday at Benaroya Hall, attending the Seattle edition of StackOverflow's traveling DevDays conference. It was well worth $99.

Joel Spolsky, owner of FogCreek Software and co-founder of StackOverflow, opened the conference with a keynote about the dichotomy of power and simplicity. People are happier when not overwhelmed with choices. Many of the choices that software forces users to make are essentially meaningless to the users. However, even though people want simplicity, they also want features and different people use different features. Powerful software sells more copies.

He argues that developers and designers should put in the extra work to make good choices on behalf of the users: don't make users feel bad about themselves. Undo is better than a confirmation dialog. You are not in charge of what your users do.

Scott Hanselman spoke about ASP.NET MVC. We're moving away from ASP.NET to Python, but if we were to use ASP.NET again, MVC would be a compelling feature. His presentation was entertaining, if gimmicky.

Rory Blyth introduced iPhone development, in a tone of snarky ambivalence. He mentioned the Stockholm Syndrome. He stressed that Apple's Design "Guidelines" are effectively laws: violate them and you won't make it into the App Store. Looks like there's a lot of tedious messing around to hook things up in Objective-C. At the very end, he briefly demoed MonoTouch, which seemed a little less tedious.

Cody Lindley introduced jQuery. I've done a lot of work with jQuery, but I still learned a few things. He worked through five facets of jQuery: Find something, do something; Create something, do something; Chaining; Implicit iteration; and jQuery parameters. He has an ebook at jqueryenlightenment.com, which I just picked up.

Daniel Rocha of Nokia talked about the cross-platform Qt (/cute/) toolkit, which runs on Windows, Mac, and Linux. More importantly from Nokia's point of view, it runs on their smartphones. Nokia has changed the licensing of Qt—once very expensive for closed-source apps, it's now free for apps that don't modify the Qt source. Qt is for C++, but there are bindings for other languages, such as Python.

Joel Spolsky came back and treated us to a half-hour demonstration of Fogbugz 7, Evidence-Based Scheduling, and Kiln, their new hosted Mercurial repository. Not terribly interesting to me, but the conference was only $99.

Ted Leung gave us a rather dry Hacker's Introduction to Python from slides rendered unreadable by a poor choice of colors. I've done a lot of Python, so I didn't learn much new. pip is an easy_install replacement that uninstalls; zc.buildout assembles apps from multiple parts; bpython is a fancy REPL.

Dan Sanderson talked about Google App Engine and demoed building apps with Java and with Python. Looked pretty cool and straightforward. We probably won't go that route, since we're pushing data to Amazon's S3, so EC2 makes more sense for us.

Finally, Steve Seitz from the University of Washington gave a cool talk on Modeling the World from Internet Photos. Some of this technology ended up in Photosynth. See Building Rome in a Day for some demos.

posted on Monday, October 26, 2009 7:16:14 AM (Pacific Daylight Time, UTC-07:00) 
#    Comments [0]
Friday, October 16, 2009 
C is for Cookie

Over the last few weeks, I built a PHP application that overlays Approve 71 banners on profile pictures. The actual application is hosted in an iframe and lives on a server in a different domain, eq.dm, than the main server at approvereferendum71.org.

This works fine in most browsers. Then we started getting reports that it wasn't working in IE8 on Win7 RC1. The iframe content was blank.

Poking around, I found the problem with the Fiddler proxy. The landing page on eq.dm was supposed to stick some information into the PHP session, then redirect to a second page at the same site. The second page was in an endless loop, redirecting to itself. In Fiddler, I saw a different PHPSESSID cookie on each response, and no cookie in the requests.

After reading IE 8 only has access to session cookies, I told IE8 to Accept All Cookies and the iframe content appeared. That fixed it for me, but we could hardly ask people to lower their security settings.

I created a P3P file for the second domain, using the IBM P3P Policy Editor. (KB 323752 has more background on P3P and third-party cookies.)

IE now worked at its default security level. Problem solved! Or so I thought.

A day later, we got reports of similar problems with Safari 4 on Mac OS X.

I sniffed the traffic with Wireshark. Same problem: the “third-party” cookie wasn't being accepted by Safari.

Unfortunately, Setting cross-domain cookies in Safari indicated that there was no reasonable workaround.

We overcame the issue by playing some DNS games, which was only possible because we control both servers. The second server is now also acting as a subdomain of the first, at dev.approvereferendum71.org. We used ini_set("session.cookie_domain",".approvereferendum71.org") to scope the iframe cookies. I've tried it in a variety of Windows, Mac, and Linux browsers, and it works in all of them.
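
Why the shared parent domain helps can be sketched with the usual cookie domain-matching rule. This is a simplified Python illustration, not the browsers' actual logic: domain_matches is a hypothetical helper of mine, and real browsers layer more checks on top (public-suffix lists, third-party cookie policies).

```python
def domain_matches(host: str, cookie_domain: str) -> bool:
    """Simplified rule: the host equals the cookie domain or is a subdomain of it."""
    host = host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


# The value passed to session.cookie_domain in the post.
COOKIE_DOMAIN = ".approvereferendum71.org"

for host in ("approvereferendum71.org", "dev.approvereferendum71.org", "eq.dm"):
    print(host, domain_matches(host, COOKIE_DOMAIN))
```

Under this rule the main site and the iframe host fall within one cookie scope, while the old eq.dm host does not, which is presumably why the browsers stopped treating the session cookie as third-party.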

posted on Friday, October 16, 2009 7:15:10 AM (Pacific Daylight Time, UTC-07:00) 
#    Comments [0]
Wednesday, October 14, 2009 
Git logo

In the last few weeks, I've switched over to Git for most of my version-control needs, at home and at work, after putting it on the long finger for months.

We continue to use Subversion at work, but I've recently followed Pavel and Eric's lead in using git-svn. I work locally on my own private branches and git svn dcommit and git svn rebase occasionally. I'm primarily on Windows at work, but I have a Linux box and a Mac Mini too, while at home, I have a MacBook, a Linux netbook, and a Vista desktop. I'm using msysGit, occasionally supplemented by TortoiseGit and QGit. Pavel's on a Mac and Eric's mostly on Ubuntu, so git adoption was easy for them.

When I first tried git-svn under msysGit about a year ago, it didn't work worth a damn. Git-svn works fine now, but it's slow compared to the *nix implementation. The developers say that's due to the fork() emulation of the MSys/Cygwin layer. The rest of msysGit is much faster.

For my home needs, I've had private Subversion repositories at DevjaVu.com and OpenSvn.csie.org. DevjaVu has gone out of business and OpenSvn has been unavailable too often for my liking. It was time to find some new hosting.

I've experimented with private Git repositories at GitHub and ProjectLocker. GitHub is very nice, but charges for private repositories. ProjectLocker provides free private repositories, but is comparatively clunky.

ProjectLocker lets you set up a fresh repository on their server. They tell you how to clone from that, which is great for a new repository. But they don't tell you how to hook it up to an existing local repository. Since I had some difficulty in figuring it out, here's the recipe:

git remote add origin git-foobar@freeN.projectlocker.com:foobar.git
git pull origin master
... merge, local edits and commits ...
git push origin master

I found Git, Xcode and ProjectLocker and Cygwin, SSH and ProjectLocker useful in figuring this out.

posted on Thursday, October 15, 2009 6:56:59 AM (Pacific Daylight Time, UTC-07:00) 
#    Comments [0]
Friday, April 24, 2009 
Sprints

Scrum and Agile revolve around sprints. At my previous employer, I spent two years working in one-week sprints. At my current job, I've spent another two years working in four-week sprints.

Each has its own rhythm. We ran the one-week sprint from Wednesday to the following Tuesday. Wednesday morning, we'd demo the previous week's work and we'd plan, drawing up a series of task cards, measured in hours. With a one-week horizon, you couldn't go very far off track. You can't get a huge amount done in a week either. You need to have a bigger picture in mind that transcends several weeks. We released every couple of months.

On the first Monday of the four-week sprint, we review the sprint backlog and break down the features into finer grained tasks. In the fourth week, we look at the product backlog and prioritize the features to go on to the next sprint's backlog. Features are measured as Small (1 week), Medium (2 weeks), or Large (4 weeks). On the fourth Friday, we have demos. We also estimate our velocity for the next sprint, based on how much we delivered in the current sprint. This determines how much we sign up for at the beginning of the next sprint.

With the four-week sprint, you build up momentum and you have enough time to deliver significant functionality. The planning is harder though.

I prefer the rhythm of a four-week sprint, but I could go back to the shorter one.

Today is the last day of a four-week sprint. We got a lot done, though it came down to the wire yesterday.

posted on Saturday, April 25, 2009 6:55:03 AM (Pacific Daylight Time, UTC-07:00) 
#    Comments [0]
Monday, March 02, 2009 
Worst-case hash table collisions

At lunch today, I told Eric about Hash Attacks: for many hash functions, it's possible to construct a large set of keys that collide. This can be used to cause a Denial of Service as hashtable operations can be induced to take O(n) time instead of O(1).

Crosby and Wallach successfully demonstrated this against a number of applications.

Andrew has a good writeup of Hash Algorithm Attacks.

There are various mitigations suggested. The one that I used when I first became aware of this problem is to add a salt to the hash function.

In other words, change:

unsigned hash(const char* s)
{
    unsigned h = 0;
    while (*s)
        h = h * 101 + (unsigned char) *s++;
    return h;
}

to:

unsigned hash(const char* s)
{
    unsigned h = SALT;
    while (*s)
        h = h * 101 + (unsigned char) *s++;
    return h;
}

where SALT is chosen randomly when the hash table is created or when the process starts. This should be enough to vary the order in which keys are distributed to buckets from run to run.
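
Here is a rough Python rendering of the salted version, as a sketch only. The 32-bit mask, the bucket helper, and the use of secrets.randbits to pick the salt are my assumptions; the post only says the salt is chosen randomly.

```python
import secrets

MASK = 0xFFFFFFFF  # emulate 32-bit unsigned wraparound (assumed width)


def salted_hash(key: bytes, salt: int) -> int:
    """Multiply-by-101 hash from the post, seeded with a salt instead of 0."""
    h = salt & MASK
    for byte in key:
        h = (h * 101 + byte) & MASK
    return h


def bucket(key: bytes, salt: int, nbuckets: int) -> int:
    """Hypothetical helper: map a key to a bucket index."""
    return salted_hash(key, salt) % nbuckets


SALT = secrets.randbits(32)  # chosen once, e.g. at process start
print(bucket(b"example", SALT, 64))
```

One caveat with this particular hash: for keys of equal length n, the salt contributes the same term SALT * 101^n to every key, so two equal-length keys that collide without the salt (b"b!" and b"a\x86", for instance) still collide with any salt. The salt changes which bucket they land in, not whether they share it, so a stronger keyed hash may be needed against a determined attacker.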

posted on Monday, March 02, 2009 8:04:28 AM (Pacific Standard Time, UTC-08:00) 
#    Comments [0]
Thursday, February 26, 2009 
Stack Overflow

I like Stack Overflow, Jeff Atwood's programming Q&A site. It's quickly become a go-to place for all kinds of programming questions. It's certainly easier to find a definitive answer there than trying to wade through a thread in a mailing list archive. The social dynamics seem to be working and a definite community has evolved.

I've been going there more often recently. I browse the hot questions and I often learn something from them.

I'm answering some questions too. I've been doing this for twenty years on Usenet and mailing lists. I might as well get a little credit for it on SO. My reputation is 131 as I write this: I expect it will grow.

posted on Friday, February 27, 2009 3:23:26 AM (Pacific Standard Time, UTC-08:00) 
#    Comments [0]
Wednesday, February 25, 2009 
Interviewing your next boss

Esther Schindler has a post about interviewing your next boss: should a candidate dev manager meet everyone who'll be reporting to them?

Yes. Definitely. If you want a successful, cohesive team, there has to be trust. A manager can make or break a team.

A new manager starts at a disadvantage, relative to a new individual contributor. The new dev is expected to ramp up and have time to build relationships with the team. The new manager has to build the relationships as soon as possible.

If the manager gets to interview with the team before being offered the job, both parties benefit. Why would you want to manage a team that you'd never met? Shouldn't the team have a chance to reject someone who's a bad fit?

I've interviewed “up” twice, once for a dev manager and once for a CTO, at different jobs. But those were the exceptions. Every other time there was a change of manager at any job, I was not consulted.

The two interviews were successful: I'd work for either of them again.

The dev manager went through several hours of interviews with the team, meeting us two or three at a time. We interviewed developer candidates similarly. We asked different questions of the manager, of course. I remember asking him if he had ever laid off or fired someone.

The CTO got to meet the entire engineering team en masse: a dozen or so of us grilled him for an hour and came away impressed.

I think that in either case if the teams had made serious objections, the candidate would not have been hired. Certainly, in both cases, the teams vetoed developer candidates.

It may be traditional, but I think it's a mistake for companies not to have managers be interviewed by their future reports.

posted on Thursday, February 26, 2009 6:43:14 AM (Pacific Standard Time, UTC-08:00) 
#    Comments [0]
Thursday, February 05, 2009 
Bowling Score Sheet

There's a flamefest going on at the moment between Robert "Uncle Bob" Martin and Joel Spolsky over the value of Test-Driven Design and the SOLID principles. I find TDD valuable and I'm reading Martin's Clean Code at present.

Poking around in the links led me to Uncle Bob's Bowling Game Kata, a Powerpoint deck demonstrating using TDD to score a bowling game.

Ron Jeffries has a very ugly OO implementation and a cleaner procedural version of the Bowling Game. Digging around in the archives of his XP Magazine turns up many other ruminations on the Bowling Game.

At Atlas, I was loaned to one group that used the Bowling Game for a pair-programming interview. I found it to be a valuable exercise. It showed us whether the candidate could actually code or not and it gave us a feel for what it would be like to work with them. It gave the candidate a taste of Agile work practices like TDD and pair programming. Of course, in a real pair-programming exercise, I would have been actively making suggestions instead of holding back.

We interviewed four candidates while I was on that team. Two passed, were hired, and worked out. One failed, failed other interviews, and was eliminated. The fourth candidate was very experienced, gave great whiteboard while talking through the exercise at the beginning, and turned out to be completely horrible. He floundered badly and wrote ugly, buggy code. That eliminated him, even though he had done well on the other rounds.

posted on Thursday, February 05, 2009 8:56:04 AM (Pacific Standard Time, UTC-08:00) 
#    Comments [0]
Monday, November 10, 2008 
Distributed/Decentralized Version Control Systems

At work, I've been experimenting with the big three Distributed Version Control Systems, Git, Mercurial, and Bazaar, on Windows over the last ten days.

Pavel and Eric have been singing the praises of Git and git-svn on their Mac and Linux boxes respectively for the last few months. Git allows them to check in small changes locally without perturbing the build. The ease of branching and merging allows them to work in more than one branch at a time at a lower cost than Subversion did. Most of our dev team continue to work in Subversion on Windows boxes. git-svn allows Pavel and Eric to easily interoperate with the Subversion server. Pavel is also a big fan of git-stash: he stacks away in-progress work and switches easily to other patches.

Although I've worked primarily in Python on Linux since the summer, I've been working on our forthcoming mobile client recently. It's ASP.NET-based, hence I'm working on Windows again. I'm in the throes of a major refactoring, extracting the mobile client out of the main webclient and hoisting other code into shared projects, while other developers continue to work on the main webclient and the mobile client.

This seemed like a perfect opportunity to bite the DVCS bullet, since I knew that branching and merging would be less painful with git-svn than with Subversion.

Getting git-svn working on Windows turned out to be a major headache. The Cygwin version of git-svn simply doesn't work for me. And msysGit doesn't currently support git-svn. (Eric has had some success with an older version of msysGit and git-svn, but I found it to be wretchedly slow.) Moreover, Git's integration with Windows is poor. There's nothing like TortoiseSVN to ease developers into using Git.

Having written off Git on Windows for now, it was time to try Bazaar (bzr), which has its own Subversion plugin, bzr-svn. The version of bzr-svn that was available for Windows the week before last was ancient, and promptly crashed. Jelmer, the developer, mailed me yesterday to say that there should be an up-to-date copy of bzr-svn in the brand new 1.9 release of Bazaar. I'll try it at work tomorrow. Windows doesn't seem like an afterthought for Bazaar; indeed, TortoiseBzr offers Explorer integration.

On to Mercurial (hg). Alas, this has the weakest integration with Subversion. There are instructions for doing it by hand (which is what I'm doing). The hgsubversion extension looks promising, but is still immature.

Even so, Mercurial is what I've ended up using for the last week. Partly because it didn't bite me. Partly because I like it best of the three. The Mercurial book takes much of the credit for that. Windows is a first-class client and TortoiseHg offers half-way decent Explorer integration.

I'm not impressed with Git as software engineering; it strikes me as an incoherent mess of C and Perl. The attitude of superiority from some Git proponents is off-putting. I watched Linus Torvalds' Google techtalk about Git on Friday; he came across as a major jerk, repeatedly calling anyone who uses Subversion an idiot. I'd still recommend watching the video: it gives good insight into the social aspects of distributed/decentralized VCSes, how very different they are from traditional centralized VCSes, and how they afford a different way of working.

Watching my compatriot Bryan O'Sullivan's Google techtalk on Mercurial this afternoon was a far more pleasant experience. He talks more about workflow and implementation.

Both Bazaar and Mercurial are written in Python and seem to be fairly well architected. Frankly, if I do have to get my hands dirty in the code (e.g., hgsubversion), I'd much rather hack in Python. I did C/C++ for fifteen years and I'm sick of unmanaged code.

Anyway, Mercurial is where I'm going for now, though I won't categorically rule out Bazaar or Git.

posted on Monday, November 10, 2008 8:19:23 AM (Pacific Standard Time, UTC-08:00) 
#    Comments [2]
Monday, March 10, 2008 
Deadlock in Real Life

Over at Cozi, we've started a new technical blog. I just put my first post up, describing a nasty problem we had late last year.

Here's the summary:

Internet Explorer 6 does not support transparency in PNG images. The best-known solution is to use the DirectX AlphaImageLoader CSS filter. It's less well known that using AlphaImageLoader sometimes leads to a deadlock in IE6. There are two workarounds. Either wait until after the image has been downloaded to apply the filter to the image's style, or use the little-known transparent PNG8 format instead of the filter.

More here.

posted on Monday, March 10, 2008 9:47:32 PM (Pacific Daylight Time, UTC-07:00) 
#    Comments [0]
Wednesday, January 09, 2008 

http://images.amazon.com/images/P/073571410X.01.MZZZZZZZ.jpg

Title: Defensive Design for the Web
Author: 37 Signals
Rating: 3.5 stars out of 5
Publisher: New Riders
Copyright: 2004
ISBN: 073571410X
Pages: 246
Keywords: programming, web
Reading period: 23 December, 2007 - 9 January, 2008

This book contains 40 usability guidelines for websites, ranging from Eliminate the Reset button and disable the Submit button after it's clicked to Give an error message that's noticeable at a glance to Be upfront about item unavailability. The topics include error messages, clear instructions, friendly forms, overcoming missing pages, helpful help, obstacles to conversion, and search.

When I state them that baldly, they sound obvious. But they're not. The 37 Signals guys have amply illustrated each guideline with examples of sites that violated the guideline, and sites that exemplify the guideline. The examples are well chosen and bolster their points.

The book feels padded, however. They could easily have reduced the page count by two-thirds. Indeed, an earlier version of this book is available as a 17-page whitepaper. It was certainly worth the $6 that I paid for it at Half-Price Books, but I think I'd feel cheated if I had spent $25 on it.

The book refers to a companion website, DesignNotFound.com. This site is no longer available, which I find unforgivable. It's such a complete contradiction of the principles they advocate. The Wayback Machine reveals the original site.

posted on Thursday, January 10, 2008 7:59:22 AM (Pacific Standard Time, UTC-08:00) 
#    Comments [0]
Sunday, November 04, 2007 

http://images.amazon.com/images/P/0321509021.01.MZZZZZZZ.jpg

Title: Bulletproof Web Design, second edition
Author: Dan Cederholm
Rating: 4.5 stars out of 5
Publisher: New Riders
Copyright: 2007
ISBN: 0321509021
Pages: 312
Keywords: css, web
Reading period: 10-29 October, 2007

Cederholm clearly explains the CSS techniques required to build a "bulletproof" website: one that is robust in the face of text resizing, window resizing, disabled images, etc, with minimal, semantically correct markup that works across all the major browsers.

Anyone who's serious about building a modern website should read this book.

Cederholm builds up his examples, one step at a time, in a clear manner. For the shorter examples, he tends to show the entire CSS or XHTML again and again, with the latest changes highlighted in orange. I would have preferred him to strip out the unnecessary repetitive material. Otherwise, great book.

posted on Monday, November 05, 2007 5:09:04 AM (Pacific Standard Time, UTC-08:00) 
#    Comments [0]
Thursday, October 25, 2007 

http://www.georgevreilly.com/blog/content/binary/ErEr.png

I've grown fond of the JavaScript || idiom:

function FrobImage(img) {
    var width = img.width || 400;
    var height = img.height || 300;
    // ...
}

FrobImage({height: 100, name: "example.png"});

If img.width exists and it's truthy, then width = img.width; otherwise, width = 400. Here, it will be 400 since the img hash has no width property. More than two alternatives may be used: x = a || b || c || ... || q;

A few weeks ago, while cleaning up the error handling in some batch files, I came across a similar idiom:

 foo.exe bar 123 "some stuff"  || goto :Error

Only if foo.exe fails (exit() returns a non-zero value), is the second clause executed.

Perl's die is typically used in a very similar idiom:

 chdir('/usr/spool/news') || die "Can't cd to spool: $!\n";

though the or keyword seems to be preferred nowadays to ||.

This morning, I came across the ?? operator in C# 2.0, aka the null coalescing operator:

 Customer cust = getCustomer(id) ?? new Customer();

If getCustomer(id) is not null, then that's the value that cust gets; otherwise it's set to new Customer().

All of these idioms are syntactic sugar and all of them are in my toolbox.
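
Python's or behaves much like the JavaScript || above: it returns the first truthy operand, or the last operand if none is truthy. A minimal sketch, where frob_image and the dictionary keys are just my mirror of the FrobImage example:

```python
def frob_image(img: dict) -> tuple:
    """Fall back to defaults when a key is missing or its value is falsy."""
    width = img.get("width") or 400
    height = img.get("height") or 300
    return width, height


print(frob_image({"height": 100, "name": "example.png"}))  # (400, 100)
```

As in JavaScript, the fallback is triggered by any falsy value, not just a missing one: a width of 0 would also become 400. C#'s ?? is stricter, since it only tests for null.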

posted on Thursday, October 25, 2007 7:12:34 AM (Pacific Daylight Time, UTC-07:00) 
#    Comments [0]
Sunday, September 09, 2007 

http://images.amazon.com/images/P/0596529260.01.MZZZZZZZ.jpg

Title: RESTful Web Services
Author: Leonard Richardson, Sam Ruby
Rating: 4.5 stars out of 5
Publisher: O'Reilly
Copyright: 2007
ISBN: 0596529260
Pages: 419
Keywords: programming, web services, REST
Reading period: 22 August-8 September 2007

Anyone who has attempted to build a Web Service has come away scarred by the complexity of all the WS-* standards. Heavyweight standards that in many ways reinvent earlier distributed object technologies like CORBA and DCOM, providing Remote Procedure Calls over HTTP. The promised interoperability hasn't really happened: a web service built with one stack of tools may or may not be consumable by another stack.

A movement has arisen in the last few years, arguing for RESTful Web Services: lighter-weight services built on top of the REST architectural style with simpler tools.

Big Web Services expose algorithms and method calls. ROA (REST-oriented architecture) web services expose data (resources) through the simple, uniform interface of HTTP.

I'm not going to try to explain REST or ROA here. Poke around the book site and the RESTwiki if you want more details.

I think this book is destined to be a minor classic. It explains the REST-oriented architecture very clearly. It works through several plausible examples, building services and clients in a variety of languages (most notably Ruby on Rails). It's not intimately tied to one software stack, which means that the book will still be useful five years from now. In part, that's because the tools support is fairly weak. As far as I can tell, you're reduced to rolling your own ROA web service from scratch in .NET, for example.

I haven't had to dig very deeply into WS-*, fortunately, but I haven't cared for what I've seen. The authors don't spend a lot of time critiquing what they see as the shortcomings of SOAP and the WS-* standards, but I'm not equipped to find fault in what they say. What they do say sounds reasonable to me.

Recommended.

posted on Sunday, September 09, 2007 8:36:05 PM (Pacific Daylight Time, UTC-07:00) 
#    Comments [0]
Saturday, June 09, 2007 

http://www.webdesign.org/img_articles/2262/Multilingual.jpg

This week, I have written code in C#, C++, Managed C++, C, WiX, NAnt, ActionScript, VBScript, JScript, cmd batch, NMake, HTML, XSLT, and Ruby. And I will probably get some Python in before the weekend is over. <boggle/>

posted on Saturday, June 09, 2007 8:35:27 AM (Pacific Daylight Time, UTC-07:00) 
#    Comments [0]
Saturday, February 10, 2007 

http://www.georgevreilly.com/blog/content/binary/R6034.png

I have been cleaning up some issues with the Win64 port of Vim, including the Edit with Vim shell extension not working very well. When I built the shell extension with VS 2005 on x86, I would get the following whenever I right-clicked in Explorer:

Microsoft Visual C++ Runtime Library

Runtime Error!

Program: C:\WINDOWS\Explorer.EXE

R6034

An application has made an attempt to load the C runtime library incorrectly. Please contact the application's support team for more information.

There was no mention of which application was at fault, though it was obvious in this case. I have also seen some mention of verclsid in the error dialog, though not when I took this snapshot.

The underlying problem relates to SxS, Fusion, and all that good stuff. By far the simplest fix was for me to statically link with libcmt.lib, instead of msvcrt.lib, rather than figure out the necessary manifest magic.

posted on Sunday, February 11, 2007 3:37:26 AM (Pacific Standard Time, UTC-08:00) 
#    Comments [0]
Tuesday, February 06, 2007 

http://www.georgevreilly.com/blog/content/binary/printf.png

In my post about Printf Tricks a couple of years ago, I mentioned that "%n is dangerous and disabled by default in Visual Studio 2005."

I got email today from someone who was porting a large codebase to VS 2005. He was getting an assert from %n and he needed a way to get past it. He intends to fix the uses of %n when he has a chance.

I spent several minutes digging around in MSDN and came up with set_printf_count_output. Wikipedia's Format string attack page led me to Exploiting Format String Vulnerabilities, which describes in detail how %n (and %s) may be exploited.

In short, if you have printf(unvalidated_user_input), instead of printf("%s", unvalidated_user_input), then placing %n into unvalidated_user_input can lead to printf writing arbitrary data into memory.

posted on Wednesday, February 07, 2007 7:19:18 AM (Pacific Standard Time, UTC-08:00) 
#    Comments [0]
Friday, September 01, 2006 

My colleague, Greg, and I spent all day debugging a build break in some unit tests that exercise a webservice interface in legacy .NET 1.1 code. Last night, the tests stopped working on our CruiseControl.NET build server. We couldn't understand it. The tests had been working for months. Now we were getting timeouts in SOAP. The tests essentially mock a SOAP service using the soap.inproc transport and a stub implementation that signaled an event to acknowledge a method being called.

The only thing that had changed in the code tree was that another colleague, Pavel, had discovered that two of our .csproj files somehow shared the same GUID, and had repaired that. But that could hardly have any effect on the WSE2 runtime. Could it?

Turns out that it was the cause of the break. NAnt 0.85 rc2 and rc3 silently failed to build the NUnit assembly because of the duplicated GUIDs. The assembly was not getting propagated to the directory where all the other NUnit assemblies are placed. The CC.NET task that ran the tests never noticed the missing assembly because the test was couched in terms of *.NUnit.dll. And we never noticed that the test hadn't been run in months because we have ~20 such NUnit assemblies, and the NUnit summary output goes on for several screens in CC.NET.

Morals of the story

  1. Use NAnt 0.85 rc4, which detects the GUID collision and treats it as a fatal error.

  2. Create .csproj files through the IDE, not by taking an existing file and hacking on it. (At least, that's what we assume happened.)

  3. Assumptions can bite you. We assumed that the code was being run all along, so it took us several hours to draw the connection between Pavel's checkin and the failing NUnit assembly.

  4. Don't mock a webservice by implementing a dummy SoapReceiver, hauling in the WSE runtime and a boatload of non-determinism. (Instead, make fun of its dress sense.) For our newer code, we've been taking an approach like this, using partial classes and Rhino Mocks.

  5. We have also taken to including our test fixtures in the same assemblies as the code they test. I have mixed feelings about this: it offends my sensibilities to have all this test code compiled into production code. But it would certainly have been hard to miss the build break in production code.
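
One way to guard against a test assembly silently vanishing from a wildcard is to compare what the wildcard finds against an explicit list. This is only a sketch in Python rather than anything NAnt or CC.NET provides: missing_assemblies and the assembly names are hypothetical.

```python
import tempfile
from pathlib import Path


def missing_assemblies(directory, expected):
    """Return the expected test assemblies that the *.NUnit.dll glob did not find."""
    found = {p.name for p in Path(directory).glob("*.NUnit.dll")}
    return sorted(set(expected) - found)


# Demo with a throwaway directory containing only one of two expected assemblies.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "Core.NUnit.dll").touch()
    print(missing_assemblies(tmp, ["Core.NUnit.dll", "Soap.NUnit.dll"]))
```

A build step could fail whenever the returned list is non-empty, so a missing assembly breaks the build instead of quietly shrinking the test run.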

posted on Saturday, September 02, 2006 3:17:42 AM (Pacific Daylight Time, UTC-07:00) 
#    Comments [0]
Monday, May 15, 2006 

A few weeks ago, I wrote a C++ routine to parse decimal numbers using the overflow detection principles of SafeInt. I couldn't find anything in the libraries that actually did a good job of checking for overflow.

Briefly, to see if unsigned values A+B overflow, check if (A > MAX_UINT - B). Similarly, A*B will overflow if (A > MAX_UINT / B).

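#include <cctype>   // isdigit
#include <limits>   // numeric_limits
#include <string>
using namespace std;

// Convert a string to an unsigned. Returns 'true' iff conversion is legitimate.
bool StringToUnsigned(
    const string& str,
    unsigned& rUint)
{
    rUint = 0;

    if (str.empty())
        return false;

    for (unsigned i = 0; i < str.length(); ++i)
    {
        // Cast to unsigned char: isdigit() is undefined for negative char values.
        if (!isdigit(static_cast<unsigned char>(str[i])))
            return false;

        // Check for numeric overflow before multiplying by ten.
        if (rUint > numeric_limits<unsigned>::max() / 10)
            return false;
        rUint *= 10;

        // Check for numeric overflow before adding the digit.
        unsigned d = str[i] - '0';
        if (rUint > numeric_limits<unsigned>::max() - d)
            return false;
        rUint += d;
    }

    return true;
}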
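
The same two checks can be tried out quickly in Python, which has no fixed-width integers, by comparing against an explicit limit. This is a sketch under the assumption of a 32-bit unsigned; UINT_MAX and string_to_unsigned are names I made up for the illustration.

```python
UINT_MAX = 0xFFFFFFFF  # assumes a 32-bit unsigned


def string_to_unsigned(text: str):
    """Return the parsed value, or None if text is not an in-range decimal number."""
    if not text:
        return None
    value = 0
    for ch in text:
        if not "0" <= ch <= "9":
            return None
        # A * 10 would overflow if A > UINT_MAX / 10.
        if value > UINT_MAX // 10:
            return None
        value *= 10
        # A + d would overflow if A > UINT_MAX - d.
        d = ord(ch) - ord("0")
        if value > UINT_MAX - d:
            return None
        value += d
    return value


print(string_to_unsigned("4294967295"))  # 4294967295
print(string_to_unsigned("4294967296"))  # None
```

Callers should test the result with "is None" rather than truthiness, since a legitimate result of 0 is falsy.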

While debugging this code, I noticed something interesting. 0xFFFFFFFF divided by ten (0xA) is 0x19999999. This pattern holds for smaller and larger sequences of 0xFF...FF too: 0xFF/10 = 0x19; 0xFFFF/10 = 0x1999; and so on.

I'm not sure how to prove this, but I can prove the closely related result: 0x19...99 * 10 = 0xFF...FA:

10 * N         = 8 * N         + 2 * N
10 * 0x19...99 = 8 * 0x19...99 + 2 * 0x19...99

0x19...99      =   %0001 1001 1001 ... 1001 1001

10 * 0x19...99 =   %1100 1100 1100 ... 1100 1000
                 + %0011 0011 0011 ... 0011 0010
               =   %1111 1111 1111 ... 1111 1010

A mildly curious result of no value, but it amused me.
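
The division result does follow from the product. Since 0x19...99 * 10 = 0xFF...FA = 0xFF...FF - 5, and the remainder 5 is less than 10, integer division of 0xFF...FF by ten gives exactly 0x19...99. A quick Python check over widths of 1 to 16 hex digits (the helper names are mine):

```python
def all_fs(n: int) -> int:
    """0xFF...FF with n hex digits."""
    return int("F" * n, 16)


def one_nines(n: int) -> int:
    """0x19...99 with n hex digits (just 0x1 when n == 1)."""
    return int("1" + "9" * (n - 1), 16)


for n in range(1, 17):
    quotient, remainder = divmod(all_fs(n), 10)
    assert quotient == one_nines(n)
    assert remainder == 5
```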

posted on Tuesday, May 16, 2006 5:24:53 AM (Pacific Daylight Time, UTC-07:00) 
#    Comments [0]
Friday, April 28, 2006 

I'm writing some C++ code at the moment, after months of C#. I'm trying to be very Test First, writing Red tests, then making them turn Green.

I'm also using CppUnit for the first time. It's not as easy as NUnit. You can't just declare your test method with an attribute; you have to declare the test method in a header file, place it inside a macro, and then have the test implementation in a .cpp file. And there's no nunit-gui. I'm using a post-build step to run the tests, which makes it fairly painless.

There was one internal method that I didn't have an explicit test for, although I had tests for methods that called it. The main obstacle was that I didn't have a simple way to check the result, as the method returned a vector of objects. I didn't want to have to construct another vector of expected results.

Then it came to me: I could wrap the vector in a class and write a ToString() method for it (as well as a ToString() for the contained objects), and compare that to a string constant:

 RateList result = creative.GetRates();
CPPUNIT_ASSERT(result.ToString() == "100_4x3:100_16x9|200_16x9|400_4x3:400_16x9");

In retrospect, it should have been obvious. I already have ToString() methods for many of my other objects, and I'm using CPPUNIT_ASSERT(actual.ToString() == expected) in many of my unit tests. The extra step of writing ToString() for the collection blocked my thinking.
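A minimal sketch of the idea, assuming a flat list of rates: the Rate fields, the Add() method, and the '|' separator are all hypothetical, since the real RateList isn't shown here (and its output evidently also groups some entries with ':').

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical element type: a bit rate plus an aspect ratio.
struct Rate
{
    int kbps;
    std::string aspect;

    // Renders the element as, for example, "100_4x3".
    std::string ToString() const
    {
        std::ostringstream os;
        os << kbps << '_' << aspect;
        return os.str();
    }
};

// Thin wrapper around the vector, so the collection gets a ToString() too.
class RateList
{
public:
    void Add(const Rate& rate) { m_rates.push_back(rate); }

    // Joins the elements' ToString() results with '|'.
    std::string ToString() const
    {
        std::string result;
        for (std::vector<Rate>::size_type i = 0; i < m_rates.size(); ++i)
        {
            if (i > 0)
                result += '|';
            result += m_rates[i].ToString();
        }
        return result;
    }

private:
    std::vector<Rate> m_rates;
};
```

A test can then compare the whole collection against a single string constant instead of building a second vector of expected objects.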

posted on Saturday, April 29, 2006 3:22:21 AM (Pacific Daylight Time, UTC-07:00) 
Wednesday, March 01, 2006 

I needed to add some declarative error checking to some XSLT templates recently. Specifically, I wanted to throw an error if my selects yielded an empty string, indicating that the input XML was wrong.

Unfortunately, there seems to be no easy way of doing this in XSLT, nor in XslTransform. The approved way is to validate against an XSD schema, but for various reasons, I didn't want to go to the hassle of creating one.

I found a partial solution using xsl:message with the terminate="yes" attribute. Under XslTransform.Transform(), the following code throws an exception if the XPath expression selects nothing.

 <xsl:if test="not(/some/xpath/expression)">
     <xsl:message terminate="yes">Missing expression</xsl:message>
 </xsl:if>
 <xsl:value-of select="/some/xpath/expression" />

It doesn't do anything, however, in XMLSpy.

The downside, of course, is that you have to maintain the expression in two places, and the template becomes littered with those annoying tests.

posted on Thursday, March 02, 2006 5:43:08 AM (Pacific Standard Time, UTC-08:00) 
Tuesday, January 31, 2006 
posted on Tuesday, January 31, 2006 11:14:08 PM (Pacific Standard Time, UTC-08:00) 
Sunday, January 22, 2006 

I've been trying to make Vim 7 compile with the Microsoft Visual C++ 2003 Toolkit, as a favor to Bram Moolenaar, the primary author of Vim. He wants to be able to use the free compiler as the primary build tool for the Win32 version of Vim.

Oh. My. God.

The VC2003 toolkit may include a full optimizing compiler, but it's certainly far from a complete system for building Windows binaries.

First, I discovered that it came only with the C library headers, but not the Windows headers. That was easily rectified: download the Platform SDK (just the Windows Core SDK subset). This also got me nmake.

At this point, I was able to compile Vim, but not to link it. The linker required cvtres.exe, to link some resources. Some googling showed me that this is included in the .NET Runtime.

The main Vim executable now linked, but the shell extension DLL didn't. I didn't have msvcrt.lib! It took me more detective work to learn that I'd have to install the .NET Framework SDK to get msvcrt.lib. There are several clever hacks out there that generate msvcrt.lib from msvcrt.dll, with the help of link -dump -exports and a sed script, but these do not include the all-important _DllMainCRTStartup@12, the real entrypoint for DLLs linked with msvcrt.

All the necessary steps for getting the downloads are summarized on the Code::Blocks wiki. Code::Blocks is an open-source IDE that can host the VC2003 toolkit, GCC, and a number of other compilers.

So why bother with the VC2003 toolkit, since Visual C++ 2005 Express Edition is freely downloadable?

The main reason is that VC2005 Express is free only for the first year, and Bram wants something that will still be available after November 2006, so that anyone can compile Vim.

I have also ported Vim 7 to compile with VC2005 Express. It was fairly straightforward, after I had added the following

 #if _MSC_VER >= 1400
# define _CRT_SECURE_NO_DEPRECATE
# define _CRT_NONSTDC_NO_DEPRECATE
#endif

to shut up the warnings about deprecated CRT functions. I also had to make it link with libcmt.lib (multithreaded) instead of libc.lib, as the single-threaded static library is gone.

I still need to make sure that everything continues to work with the retail compilers, VC6, VC7.1, and VC8, before passing my changes back to Bram. Sigh.

Update #1: I almost forgot. VC2005 Express also requires the Platform SDK to build Vim.

I'll send the diffs to Bram in about a week. I'm too busy to clean everything up this week.

Update #2 (2006/03/12): I sent updates to Bram a week ago and he's checked them into the Vim7 source tree. Be sure to read src/INSTALLpc.txt, section 1, for details on compiling Vim with VC5-VC8.

Update #3 (2006/04/22): VC2005 Express is now free forever. Vim7 is in beta and will be released soon, and Bram doesn't want to switch compilers at this point.

posted on Sunday, January 22, 2006 9:17:05 AM (Pacific Standard Time, UTC-08:00) 
Saturday, January 14, 2006 

I re-read Scott Hanselman's blog post on using Consolas as the Windows Console Font, and I decided to put together a registry file to make it a little simpler. (You'll have to rename the file to console-font.reg after downloading.)

The registry file includes entries for:

As Scott says:

(I'm afraid I can't distribute Consolas online or provide a download out of abject fear. That said, you can find it in any version of the Longhorn bits.)

Or Office 12, I believe.

Update, 2008/01/15. The Consolas Font Pack is the easiest way to get Consolas, if you don't have Office 2007 or Vista. Technically, you are supposed to have Visual Studio 2005. (I'm guessing that VS 2008 comes with Consolas.)

posted on Sunday, January 15, 2006 12:57:59 AM (Pacific Standard Time, UTC-08:00) 
Thursday, November 17, 2005 

A year ago, I ran into a problem with Skype squatting on port 80, which I had long forgotten about. Today, I ran into one with Skype squatting on port 443.

I was trying to set up SSL on my Windows Server 2003 dev box. My ultimate goal is to experiment with client certs and server certs for SOAP, but that's a story for another time. I was running into all kinds of strange problems, exacerbated by the relatively strange IIS configuration on my machine.

I tried SslDiag. In hindsight, it pointed me towards the underlying problem, but I couldn't see it at the time. I did a lot of digging around on Google. Eventually, a newsgroup thread on ListenOnlyList gave me CurrPorts, which showed me that Skype was listening on port 443. I suppose netstat -anob, TcpView, or Port Reporter would have told me the same thing, though CurrPorts had the friendliest view. WFetch from the IIS 6 Resource Kit Tools was also useful in looking at raw requests and responses.

posted on Thursday, November 17, 2005 9:18:57 PM (Pacific Standard Time, UTC-08:00) 
Thursday, August 11, 2005 

I'm a command-line dinosaur. Vim (Vi IMproved) is my favorite text editor. And I write quite a few little batch files.

Here are a few useful tricks that work with cmd.exe on Windows XP.

Timestamped filename

Sometimes I want to create a file whose name includes the current date and time. By combining the magic %DATE% and %TIME% environment variables with for /f and a little bit of string substitution, I can create that filename.

REM
REM "Tue 06/14/2005" -> "06/14/2005"
REM
for /f "tokens=2" %%i in ("%DATE%") do set MDY=%%i
REM
REM "06/14/2005" -> "2005-06-14"
REM
for /f "delims=/ tokens=1,2,3" %%i in ("%MDY%") do set YMD=%%k-%%i-%%j
REM
REM "16:44:39.72" -> "1644"
REM
for /f "delims=: tokens=1,2" %%i in ("%TIME%") do set HM=%%i%%j
REM
REM " 237" -> "0237" (%TIME% before 10:00:00.00 contains a leading space)
REM
set HM=%HM: =0%

xcopy /yf %1 %YMD%_%HM%.bak

See for /? and set /? for everything that the comments don't explain.

Timing Operations

Sometimes it's useful to time operations.

@setlocal
@if (%_echo%)==() set _echo=off
@echo %_echo%

call :time T1
set T2=%T1%
set Iter=0
@echo T1 = %T1%

:repeat
CostlyOperation.exe

call :time T2
set /A DeltaT=%T2% - %T1%
set /A Iter=%Iter% + 1
set /A Avg=%DeltaT% / %Iter%
@echo DeltaT = %DeltaT%, Avg = %Avg%, Iter = %Iter%, T2 = %T2%
goto :repeat


:time
set TT=%TIME%
for /f "delims=: tokens=1" %%i in ("%TT%") do set hrs=%%i
for /f "delims=: tokens=2" %%i in ("%TT%") do set min=1%%i
for /f "delims=: tokens=3" %%i in ("%TT%") do set sec=1%%i
for /f "delims=. tokens=1" %%i in ("%sec%") do set sec=%%i
set /A %1=3600 * %hrs% + 60 * (%min%-100) + (%sec%-100)
goto :EOF

The :time subroutine calculates the number of seconds that have elapsed today. The business with 100 is to handle the case that min or sec is 08 or 09, which Cmd's expression evaluator considers to be malformed octal.

set /? explains set /A arithmetic. call /? explains subroutine syntax and goto :EOF.

Extending this code so that it works past midnight is left as the proverbial exercise for the reader.

posted on Thursday, August 11, 2005 7:26:57 AM (Pacific Daylight Time, UTC-07:00) 
Thursday, June 02, 2005 

Printf Tricks

It may be old-fashioned, but I still find printf (and sprintf and _vsnprintf) incredibly useful, both for printing debug output and for generating formatted strings.

Here are a few lesser-known formats that I use again and again. See MSDN for the full reference.

%04x - 4-digit hex number with leading zeroes

A quick review of some of the basics.

%x prints an int in hexadecimal.

%4x prints a hex int, right-justified to 4 places. If it's less than 4 digits, it's preceded by spaces. If it's more than 4 digits, you get the full number.

%04x prints a hex int, right-justified to 4 places. If it's less than 4 digits, it's preceded by zeroes. If it's more than 4 digits, you get the full number, but no leading zeroes.

Similarly, %d prints a signed int in decimal, and %u prints an unsigned int in decimal.

Not so similarly, %c prints a character and %s prints a string. For wide (Unicode) strings, prefix with l (ell, or w): %lc and %ls.

Note: For the Unicode variants in Microsoft's C runtime, such as wprintf and friends, %c and %s print wide characters and strings. To force a narrow character or string, no matter which variant, use the h size prefix, and to force a wide one, use the l size prefix; e.g., %hs and %lc.
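The width and zero-padding rules for hex output can be checked with a few one-line wrappers. This is only a sketch: the function names are invented, and it uses the standard std::snprintf (C++11) so that the results can be captured as strings.

```cpp
#include <cstdio>
#include <string>

// Hypothetical wrappers, one per format, each returning the formatted text.

// "%x": plain hexadecimal, no padding.
std::string Hex(unsigned value)
{
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%x", value);
    return buffer;
}

// "%4x": right-justified in 4 places, padded with spaces.
std::string HexWidth4(unsigned value)
{
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%4x", value);
    return buffer;
}

// "%04x": right-justified in 4 places, padded with zeroes.
std::string HexZero4(unsigned value)
{
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%04x", value);
    return buffer;
}
```

For 0x2a, these give "2a", "  2a", and "002a" respectively. A five-digit value such as 0x12345 comes out in full from all three, because the width is a minimum and never truncates.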

%p - pointer

The wrong way to print a pointer is to use %x. The right way is to use %p. It's portable to Win64, as well as to all other operating systems.

Everyone should know this one, but many don't.

%I64d, %I64u, %I64x - 64-bit integers

To print 64-bit numbers (__int64), use the I64 size prefix.

%Iu, %Id, %Ix - ULONG_PTR

ULONG_PTR, LONG_PTR, and DWORD_PTR are numeric types that are as wide as a pointer. In other words, they map to ULONG, LONG, and DWORD respectively on Win32, and ULONGLONG, LONGLONG, and ULONGLONG on Win64.

The I size prefix (capital-i, not lowercase-L) is what you need to print *LONG_PTR on Win32 and Win64.

%*d - runtime width specifier

If you want to calculate the width of a field at runtime, you can use %*. This says the next argument is the width, followed by whatever type you want to print.

For example, the following can be used to print a tree:

void Tree::Print(Node* pNode, int level)
{
    if (NULL != pNode)
    {
        Print(pNode->Left, level+1);
        // Right-justify the key in a field 2 * level characters wide.
        printf("%*d\n", 2 * level, pNode->Key);
        Print(pNode->Right, level+1);
    }
}
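The runtime width specifier can also be exercised on its own. The sketch below uses invented helper names and std::snprintf so that the output is easy to inspect. It also shows a common variation, "%*s" with an empty string, which produces pure indentation ahead of a value.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: right-justifies value in a field whose width
// is supplied at runtime as the argument preceding the value.
std::string RightJustified(int value, int width)
{
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%*d", width, value);
    return buffer;
}

// Hypothetical helper: indents key by 2 * level spaces. "%*s" pads the
// empty string out to the requested width, then "%d" prints the key.
std::string IndentedKey(int key, int level)
{
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%*s%d", 2 * level, "", key);
    return buffer;
}
```

The fixed 64-byte buffers are an assumption that keeps the sketch short; very large widths would simply be truncated by snprintf.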

%.*s - print a substring

With a variable precision, you can print a substring, or print a non-NUL-terminated string, if you know its length. printf("%.*s\n", sublen, str) prints the first sublen characters of str.

[2005/7/19: fixed a typo in previous sentence (%.s -> %.*s). A little elaboration on the syntax: . in a printf format specification is followed by the precision. For strings, the precision specifies the maximum number of characters that will be printed. A precision of * indicates that the precision is the next argument on the stack. If the precision is zero, then nothing is printed. If a string has a precision specification, printing stops after that many characters, so the string need not be NUL-terminated.]
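A short sketch of the variable-precision string format, using an invented helper name and std::snprintf to capture the result. The precision is an upper bound: printing still stops early at a NUL if one comes first.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats at most length characters of data via
// "%.*s". The data need not be NUL-terminated within that length.
std::string FirstChars(const char* data, int length)
{
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.*s", length, data);
    return buffer;
}
```

For example, FirstChars("hello, world", 5) yields "hello", and the same call works on a five-byte array that has no terminator at all.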

%.0d - print nothing for zero

I've occasionally found it useful to suppress output when a number is zero, and %.0d is the way to do it. (If you attempt to print a non-zero number with this zero-precision specifier, it will be printed.) Similarly, %.0s swallows a string.
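Both zero-precision behaviours can be seen in a small sketch. The helper names are invented, and the brackets around %.0s are there only to make the empty result visible.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats value with "%.0d", which produces
// no characters at all when value is zero.
std::string ZeroSuppressed(int value)
{
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%.0d", value);
    return buffer;
}

// Hypothetical helper: "%.0s" consumes its argument but prints none
// of it, so only the surrounding brackets appear.
std::string Swallowed(const char* text)
{
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "[%.0s]", text);
    return buffer;
}
```

ZeroSuppressed(0) is the empty string, while ZeroSuppressed(42) is "42"; Swallowed("anything") is "[]".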

%#x - print a leading 0x

If you want printf to automatically generate 0x before hex numbers, use %#x instead of %x.
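One wrinkle worth knowing, shown in this sketch with an invented helper name: the # flag adds the 0x prefix only to nonzero results, so zero prints as a bare 0.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats value with "%#x". Nonzero values get
// a leading "0x"; zero is printed as just "0".
std::string HexWithPrefix(unsigned value)
{
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%#x", value);
    return buffer;
}
```

HexWithPrefix(255) gives "0xff", while HexWithPrefix(0) gives "0".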

Other tricks

See the documentation for other useful tricks.

Security

Never use a user-supplied string as the format argument: printf(str). Instead, use printf("%s", str). The former is a stack smasher waiting to happen.

%n is dangerous and disabled by default in VS2005.

Don't use sprintf. Use a counted version, _snprintf or _vsnprintf, instead. Better still, use the StrSafe.h functions, StringCchPrintf and StringCchVPrintf, to guarantee that your strings are NUL-terminated.
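A minimal sketch of both rules together. It swaps in the standard std::snprintf (C99/C++11) for _snprintf and the StrSafe.h functions, because the standard version always NUL-terminates when the buffer size is nonzero and is available outside Windows. The function name is invented for illustration.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: copies untrusted text into a fixed-size buffer.
// The text is passed as an argument to "%s", never as the format itself,
// so any '%' characters in it are copied literally.
// Returns false if the output had to be truncated to fit.
bool FormatUntrusted(char* buffer, std::size_t size, const char* untrusted)
{
    // snprintf returns the length the full output would have had.
    int needed = std::snprintf(buffer, size, "%s", untrusted);
    return needed >= 0 && static_cast<std::size_t>(needed) < size;
}
```

With an 8-byte buffer, the input "%n%s" is copied verbatim and the function returns true; a ten-character input is cut to seven characters plus the terminator, and the function returns false so the caller can react.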

[Update: 2008/01/25: See also Printf %n.]

posted on Thursday, June 02, 2005 7:44:26 AM (Pacific Daylight Time, UTC-07:00) 