Here's a little narrative describing how we recently debugged an AIR problem.
Last week, we got a call from a customer who couldn't use a piece of one of our applications due to the way their proxy server worked. Not a big deal, it was a weird setup and was the first complaint like this. A little investigation and we ended up finding out that HTTP authentication doesn't work on Windows when opened from a secondary window. Bug reported, workaround found. No problem.
So we made the change, created a .air bundle, and began testing. We noticed the application was crashing. A lot. Like after using the program for 10 seconds and BLAMN. Both windows and OSX.
Another hour of investigation and we figured out this was new behavior introduced with AIR 1.5. Bummer. Panic sets in about now when we realize a lot of our customers are probably starting to get those "Do you want to upgrade?" messages.
As you can see from the error window, this was no usual uncaught AS3 exception. The AIR runtime was dying hard and without any good way of us finding out why.
Since the problem occurred while interactin with an HTML control, and since we knew there were some changes that went into that with AIR 1.5, we focussed our concentration on that. Looking through the code we weren't able to find any good leads so we began pulling parts of the application into a test app. One by one. To try and make a much simpler application die. Two of us spent almost a full day on that with no success. (We'd later find out that the one piece we didn't pull over was in fact the cause.) We gave up on that tactic and began looking at other possible issues.
My colleague took the route of making sure the upgrade worked correctly. After some false starts that made it look like the webkit.dll wasn't upgraded, he decided that everything thing had, in fact, upgraded correctly.
I took the route of trying to decipher the crash logs. On OSX we got something like this:
Process: Timeliner XE You might have noticed, but the stack trace isn't all that readable. "com.adobe.AIR 0x01f666ba 0x19c3000 + 5912250" doesn't exactly tell us where in the AIR code it failed, and hell... even if it did... we didn't have the AIR runtime code to look at. Two useful tidbits. One was the "WebKit.dylib" line. We were now fairly positive that the problem was in webkit, which meant it was almost certainly in the HTML control. Also the "15 com.apple.AppKit 0x922dab8c -[NSWindow sendEvent:] + 6438" line made me think that it might have something to do with handling a window event.
Path: /Users/ahatcher/Desktop/Timeliner XE.app/Contents/MacOS/Timeliner XE Identifier: TimelinerXE.3C007187040BC93359A2293A966FB40AE2F485C8.1 Version: 1.0.46 (???) Code Type: X86 (Native) Parent Process: launchd  Date/Time: 2008-12-01 15:15:27.853 -0500 OS Version: Mac OS X 10.5.5 (9F33) Report Version: 6 Exception Type: EXCBADACCESS (SIGBUS) Exception Codes: KERNPROTECTIONFAILURE at 0x00000000000000b8 Crashed Thread: 0 Thread 0 Crashed: 0 WebKit.dylib 0x22f9db14 WebKitGetAPI + 4023666 1 com.adobe.AIR 0x01f666ba 0x19c3000 + 5912250 2 com.adobe.AIR 0x01b6c227 0x19c3000 + 1741351 3 com.adobe.AIR 0x01b71440 0x19c3000 + 1762368 4 ??? 0x1a65e507 0 + 442885383 5 ??? 0x1a65e377 0 + 442884983 6 ??? 0x1a65e07a 0 + 442884218 7 com.adobe.AIR 0x01b6030a 0x19c3000 + 1692426 8 com.adobe.AIR 0x01ee3217 0x19c3000 + 5374487 9 com.adobe.AIR 0x01ede789 0x19c3000 + 5355401 10 com.adobe.AIR 0x01eddab3 0x19c3000 + 5352115 11 com.adobe.AIR 0x01ee2817 0x19c3000 + 5371927 12 com.adobe.AIR 0x01e78334 0x19c3000 + 4936500 13 com.adobe.AIR 0x01e977ae 0x19c3000 + 5064622 14 com.adobe.AIR 0x01ffa459 0x19c3000 + 6517849 15 com.apple.AppKit 0x922dab8c -[NSWindow sendEvent:] + 6438 16 com.adobe.AIR 0x01ff5a40 0x19c3000 + 6498880 17 com.apple.AppKit 0x922a79fd -[NSApplication sendEvent:] + 4713 18 com.adobe.AIR 0x01f9050e 0x19c3000 + 6083854 19 com.apple.AppKit 0x92204d0f -[NSApplication run] + 847 20 com.adobe.AIR 0x019cabbe 0x19c3000 + 31678 21 com.adobe.AIR 0x019caed6 0x19c3000 + 32470 22 ...359A2293A966FB40AE2F485C8.1 0x000032d9 start + 4097 23 ...359A2293A966FB40AE2F485C8.1 0x000023da start + 258 24 ...359A2293A966FB40AE2F485C8.1 0x00002301 start + 41
While reproducing the crash over and over I noticed something that turned out to be significant. The more impatient I was, the more likely the app was to crash. Later, I correlated that to moving the mouse a lot more while things were loading whenever I was impatient. Unfortunately I didn't realize that this was significant until later.
A few more hours of frustration, and nothing.
Then I ran accros Olly. Olly is an interactive debugger / disassembler. It's been years since I did a lot of C work, and even longer since I've delved into that level of detail. But it was worth a shot. I set up Olly to be the Just In Time Debugger on my windows box. Ran our application. And began debugging on the crash.
Boy was I in over my head. It's a tough tool to use and my assembly is old and outdated. I fumbled around in it, a lot. I tried following the stack to see what was causing the problem without luck. I tried correlating memory addresses to methods in the DLL without luck. I just wasn't up for this task.
It's about that time that I started putting things together.
- Likely caused by a window event.
- Happens during high mouse activity. (Hey, mouse activity causes window events!)
- Occurs within WebKit
- Might have something to do with elementFromPoint
- When features are cut, completely disable them.
- When you get an email from Adobe about a prerelease version of AIR available, make sure to try it before it goes GM!