12 June 2008

The cost of a bug fix

Every fix doesn’t call for a blog post, but this one deserves it. It all started on May 27th, when Jonathon Jongsma found a way to make text disappear in QtWebKit and filed a bug. He and I started working on a fix. We quickly found that WebKitGtk was also affected, but we could not reproduce the problem on the Mac port.

We dove into the code: “grep selection”, GraphicsContext::drawText(), Font::drawText()… but nothing there differed in the Qt or Gtk ports in a way that could explain why the text wasn’t being redrawn when the selection changed.

That’s when I discovered git bisect. Since we had established that the bug wasn’t there when QtWebKit was snapshotted for Qt 4.4.0, I had a known-good place to start. After recompiling QtWebKit some 15 times (yes, it took around 3 work days!), it pointed me to this changeset. Luckily for us, the changeset was related to the bug (text rendering).
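For the curious, what git bisect does boils down to a binary search between a known-good and a known-bad revision. Here is a minimal sketch in Python, not git's actual implementation (which also has to cope with non-linear history). The revision names and the `is_bad` check are made up for illustration; in my case the check was "rebuild QtWebKit and see whether the text disappears".

```python
from typing import Callable, Sequence, Tuple


def first_bad(revisions: Sequence[str],
              is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return the first bad revision and the number of checks performed.

    Assumes revisions are ordered oldest to newest, that revisions[0] is
    known good and revisions[-1] is known bad, and that every revision
    after the culprit is also bad.
    """
    good, bad = 0, len(revisions) - 1
    checks = 0
    # Invariant: revisions[good] is good, revisions[bad] is bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        checks += 1  # in real life, one full rebuild plus a manual test
        if is_bad(revisions[mid]):
            bad = mid
        else:
            good = mid
    return revisions[bad], checks


# Hypothetical history of 1000 revisions where the bug appears at r700.
revisions = [f"r{i}" for i in range(1000)]
culprit, checks = first_bad(revisions, lambda rev: int(rev[1:]) >= 700)
print(culprit, checks)
```

Each check halves the remaining range, so the number of rebuilds grows with the logarithm of the number of revisions rather than with the number itself.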

After some digging into the patch, I contacted the author, Dan Bernstein at Apple, and we looked at it together. In little time, he found a way to reproduce it on the Mac too. This was now a WebKit-wide bug! A few backtraces and some trials later, we came up with this fix. Pretty simple, isn’t it? Barely 16 chars. Yet these 16 chars cost around $1200* in direct labour time, and 3 engineers were involved.

Some will say this could have been prevented with proper tests. As it happens, the affected code path was a special case on the Mac, whereas all the other ports always went through it. Dan has now added a pixel test.

The morals of the story are:

  • bug fixing is costly (haven’t we heard that in school?)
  • you never know when someone will track you down about your patch
  • git is a cool beast (in fact, it just convinced me to use it)

One question remains: how come it took over a month and a half before someone found it? :)

* This number is based on market mean hourly rate since exact rates are unknown

Side note on the WebKit party

It was really cool to get to San Francisco and finally meet other WebKit devs IRL. Kudos for the event!

Comments (20)

  1. 12 June 2008
    glandium said...

    2 other morales for your story:
    - Using git bisect would benefit from smaller changesets, which may happen by itself if development switched to git
    - It would also benefit from the build system being safe without needing to make clean before rebuilding.

  2. 12 June 2008
    Pierre-Luc Beaudoin said...

    Yep, those 2 are also totally true! When I found out the faulty commit was a 1000-line one, I was a bit discouraged. And I was forced to do a clean between each build, since I discovered that it wasn’t rebuilding much (or anything) on each iteration.

  3. 12 June 2008

    [...] The cost of a bug fix – Great write-up by pierlux on the “life” of a bug. This one, in my experience, is not an atypical bug in terms of cost/time necessary to find and fix it. [...]

  4. 13 June 2008
    Benjamin Meyer said...

    I have actually seen this twice, but I was waiting to be able to make a reproduction case before filing a bug.

  5. 13 June 2008
    James said...

    The first time I did what you describe (binary search through a changeset history) in Perforce I was sold. One tip is to do a little research before choosing a changeset to jump to. Often the files involved or the changeset comments can help narrow in on the most likely changesets to try (saving you a few recompiles).

    Once you identify the actual changeset involved and you know the author there is an exhilarating feeling – like a detective must feel when solving a mystery.

  6. 13 June 2008
    John thomas said...

    Wow, that’s a pretty costly fix!

    JT

  7. 13 June 2008
    A passerby said...

    If it took 3 days to compile QtWebKit 15 times, then isn’t this more about the tremendous cost associated with long build times?

  8. 13 June 2008
    Pierre-Luc Beaudoin said...

    Well I was doing something else while the computer was building, so this time could be counted differently, right…

  9. 13 June 2008

    That’s also why the saying goes that the majority of the cost of a software system over its lifetime is from maintenance and support. Fixing (and finding) bugs can easily consume large amounts of time.

  10. 13 June 2008
    bob the builder said...

    MORALS

  11. 13 June 2008
    Peter said...

    Where I work, we do daily builds and save all of the build results. It’s a bit of disk space — but have you priced 1T disks lately? Result: when there’s a puzzling bug, we can do a binary search of the different versions (*). That tracks it down to a single day’s changes, and usually it’s obvious from there.

  12. 13 June 2008
    Freemont said...

    Also interesting is the difference in mindset when you are an employee vs. being self employed. As an employee, a bug like this is a stressful conversation with your boss about schedule slippage. But when you’re independent, these bugs represent a complete project standstill. There’s nothing quite so frustrating as giving away 3 unplanned days to a bug!!!

  13. 13 June 2008
    faster said...

    At $1200 a bug, by 5 bugs, you could purchase a super 16/32/RAID6-SAS core server to do compiles. Besides the fun of having a fast computer, it could save money in the long run.

  14. 14 June 2008
    Pierre-Luc Beaudoin said...

    Actually, I was compiling on my dual core laptop, which isn’t too bad. I even made sure that the build was using both cores… it is a big project after all.

  15. 14 June 2008
    Mark Rowe said...

    Woo, WebKit party! For what it’s worth, we frequently make use of the WebKit nightly builds (http://nightly.webkit.org) for tracking down when regressions were introduced. This narrows the range of revisions to check substantially, requiring only two or three builds to find the specific revision that introduced the problem. That, and building on a Mac Pro helps a lot!

  16. 14 June 2008
    Pierre-Luc Beaudoin said...

    Yeah, but unfortunately there aren’t nightly builds of the Qt and Gtk ports :) There have been many suggestions in this thread; apparently, the way the ports build needs love.

  17. 15 June 2008
    glandium said...

    Pierre-Luc, FWIW, I happen to also have a dual-core laptop and have bisected nasty javascript crashes in the past, and I could build *much* faster than 5 times a day. But that could be because I have 4GB of RAM…

    Also note you can reduce the set of commits to bisect giving a subtree at the beginning of git-bisect, if you suspect the bug is in a specific part of the code.

    As for keeping result of builds, I would suggest the use of ccache, with compression enabled.

  18. 16 June 2008

    [...] fix doesn’t call for a blog post, but this one deserves it”: The cost of a bug fix Spread the [...]

  19. 18 June 2008

    [...] The cost of a bug fix Bajeezus. This reminds me of the poster on the boss-man’s (aka M^2) cube – the cost of fixing defects compared to its placement in the development cycle. It reminds me of an IE7 fix I did in a few hours @ Nationwide that they freaked was going to be huge. (tags: debugging development testing) [...]

  20. 5 July 2008
    Ryan Finnie said...

    I can beat that. This patch, 10 characters, took Jeff Dike and me a solid 2 weeks (basically 9 to 5 days) to bisect, figure out and fix. So in that sense, that bug cost Intel about 12 grand.