Marc Hughes


Home
Blog
Twitter
LinkedIn
GitHub
about
I am a developer from a bit west of Boston.

Debugging an AIR problem

02 Dec 2008

Here's a little narrative describing how we recently debugged an AIR problem.

Last week, we got a call from a customer who couldn't use a piece of one of our applications due to the way their proxy server worked.  Not a big deal, it was a weird setup and was the first complaint like this.  A little investigation and we ended up finding out that HTTP authentication doesn't work on Windows when opened from a secondary window.  Bug reported, workaround found.  No problem.

So we made the change, created a .air bundle, and began testing.  We noticed the application was crashing.  A lot.  Like after using the program for 10 seconds and BLAMN.  Both windows and OSX.

Another hour of investigation and we figured out this was new behavior introduced with AIR 1.5.  Bummer.  Panic sets in about now when we realize a lot of our customers are probably starting to get those "Do you want to upgrade?" messages.

As you can see from the error window, this was no usual uncaught AS3 exception.  The AIR runtime was dying hard and without any good way of us finding out why.

Since the problem occurred while interactin with an HTML control, and since we knew there were some changes that went into that with AIR 1.5, we focussed our concentration on that.  Looking through the code we weren't able to find any good leads so we began pulling parts of the application into a test app.  One by one.  To try and make a much simpler application die.  Two of us spent almost a full day on that with no success.  (We'd later find out that the one piece we didn't pull over was in fact the cause.)  We gave up on that tactic and began looking at other possible issues.

My colleague took the route of making sure the upgrade worked correctly.  After some false starts that made it look like the webkit.dll wasn't upgraded, he decided that everything thing had, in fact, upgraded correctly.

I took the route of trying to decipher the crash logs.  On OSX we got something like this:

Process:         Timeliner XE [6532]

Path:            /Users/ahatcher/Desktop/Timeliner XE.app/Contents/MacOS/Timeliner XE Identifier:      TimelinerXE.3C007187040BC93359A2293A966FB40AE2F485C8.1 Version:         1.0.46 (???) Code Type:       X86 (Native) Parent Process:  launchd [92] Date/Time:       2008-12-01 15:15:27.853 -0500 OS Version:      Mac OS X 10.5.5 (9F33) Report Version:  6 Exception Type:  EXCBADACCESS (SIGBUS) Exception Codes: KERNPROTECTIONFAILURE at 0x00000000000000b8 Crashed Thread:  0 Thread 0 Crashed: 0   WebKit.dylib                         0x22f9db14 WebKitGetAPI + 4023666 1   com.adobe.AIR                     0x01f666ba 0x19c3000 + 5912250 2   com.adobe.AIR                     0x01b6c227 0x19c3000 + 1741351 3   com.adobe.AIR                     0x01b71440 0x19c3000 + 1762368 4   ???                                      0x1a65e507 0 + 442885383 5   ???                                      0x1a65e377 0 + 442884983 6   ???                                      0x1a65e07a 0 + 442884218 7   com.adobe.AIR                     0x01b6030a 0x19c3000 + 1692426 8   com.adobe.AIR                     0x01ee3217 0x19c3000 + 5374487 9   com.adobe.AIR                     0x01ede789 0x19c3000 + 5355401 10  com.adobe.AIR                    0x01eddab3 0x19c3000 + 5352115 11  com.adobe.AIR                    0x01ee2817 0x19c3000 + 5371927 12  com.adobe.AIR                    0x01e78334 0x19c3000 + 4936500 13  com.adobe.AIR                    0x01e977ae 0x19c3000 + 5064622 14  com.adobe.AIR                    0x01ffa459 0x19c3000 + 6517849 15  com.apple.AppKit                 0x922dab8c -[NSWindow sendEvent:] + 6438 16  com.adobe.AIR                    0x01ff5a40 0x19c3000 + 6498880 17  com.apple.AppKit                 0x922a79fd -[NSApplication sendEvent:] + 4713 18  com.adobe.AIR                    0x01f9050e 0x19c3000 + 6083854 19  com.apple.AppKit                 0x92204d0f -[NSApplication run] + 847 20  com.adobe.AIR                    0x019cabbe 0x19c3000 + 31678 21  com.adobe.AIR                    0x019caed6 0x19c3000 + 32470 22  ...359A2293A966FB40AE2F485C8.1  0x000032d9 start + 4097 23  ...359A2293A966FB40AE2F485C8.1  0x000023da start + 258 24  ...359A2293A966FB40AE2F485C8.1  0x00002301 start + 41

You might have noticed, but the stack trace isn't all that readable.  "com.adobe.AIR                     0x01f666ba 0x19c3000 + 5912250" doesn't exactly tell us where in the AIR code it failed, and hell... even if it did... we didn't have the AIR runtime code to look at.  Two useful tidbits.  One was the "WebKit.dylib" line.  We were now fairly positive that the problem was in webkit, which meant it was almost certainly in the HTML control.  Also the "15  com.apple.AppKit                 0x922dab8c -[NSWindow sendEvent:] + 6438" line made me think that it might have something to do with handling a window event.

While reproducing the crash over and over I noticed something that turned out to be significant.  The more impatient I was, the more likely the app was to crash.  Later, I correlated that to moving the mouse a lot more while things were loading whenever I was impatient.  Unfortunately I didn't realize that this was significant until later.

A few more hours of frustration, and nothing.

Then I ran accros Olly.  Olly is an interactive debugger / disassembler.  It's been years since I did a lot of C work, and even longer since I've delved into that level of detail.  But it was worth a shot.  I set up Olly to be the Just In Time Debugger on my windows box.  Ran our application.  And began debugging on the crash.

Boy was I in over my head.  It's a tough tool to use and my assembly is old and outdated.  I fumbled around in it, a lot.  I tried following the stack to see what was causing the problem without luck.  I tried correlating memory addresses to methods in the DLL without luck.  I just wasn't up for this task.

See the screenshot?  Notice the lower right panel.  About the only thing I understand from the memory dump is there is the unicode string "elementFromPoint" which sounded a lot like a javascript DOM method.

It's about that time that I started putting things together.

  • Likely caused by a window event.
  • Happens during high mouse activity. (Hey, mouse activity causes window events!)
  • Occurs within WebKit
  • Might have something to do with elementFromPoint
Once I pulled those things together I remembered a feature that we tried out in the app.  As the user moused over links we wanted to show the link location in our status bar.  Much like the web browser you're probably using right now does.  Eventually, the status bar was cut so we never implemented that.  But we had left in a mouse event handler that would retrieve the URL and dispatch an event.  After removing that the bug went away.  Hooray!

A little more investigation and what I believe is happening was:  When we attempt to call a javascript method on an HTML dom document, and that dom document has been unloaded, the application crashes.  Previous versions of AIR would correctly throw an AS3 based exception that we could catch and handle.

Bug reported.

Lessons learned:

  • When features are cut, completely disable them.
  • When you get an email from Adobe about a prerelease version of AIR available, make sure to try it before it goes GM!