WinDbg for .NET developers: "2. The Hunt for Loot"

Ideal crash dump analysis should be quite straightforward: we open WinDbg, load the crash dump, load SOS, enter !PrintException and obtain all the information we need. From this perspective we could even use something easier, like Visual Studio. Unfortunately, in the real and cruel world this scenario can be complicated by a bunch of legacy code written ages ago by developers who thought about things such as a quick sale of their product and did not think about anybody who would maintain it in the future. The second part, "2. The Hunt for Loot", is dedicated to commonly made mistakes in exception handling: how they can complicate your life and which techniques you can use to overcome them.

Disclaimer

The title of this post is an ugly lie: the bigger part of the post is dedicated to shy procdump and only a small part to glorious WinDbg. And like everything in this blog, this post can be absolutely cruel and contain terrible things. Some of them may be done in an absolutely incorrect and irrational way and may even destroy your brain. If you have another opinion or know a more logical way to do something relevant to the current topic, I will be very glad if you express your thoughts in comments or pull requests.

Boss Examination

As in the previous post, you can use any real .NET application for the experiments if you want to complicate your life. Otherwise you can use the special demo application prepared for this course.

In an ideal world you can get the most useful crash dump simply:

  • Start procdump, e.g. procdump -ma -e WinDbgCourse.exe
  • Reproduce the crash scenario, e.g. by clicking Basic Null Reference Crash.
  • Get the memory dump and analyze it.
  • Bingo! You know the root cause of the problem.

Custom Error Handled Window

Unfortunately, the universe does not like such simplicity and will throw as many challenges at you as it can. For example, some applications want to be more user-friendly and show a custom error window that hides a bunch of information which is useless (for an end customer, of course). Like this one (you can get it by clicking the Incorrectly Handled Crash button):

[Image: custom error window, /posts/2018/windbg_hunt/custom_window.png]

I am sure that you already suspect something bad... Yes, the application closes and you get nothing:

C:\bin\procdump>procdump.exe -ma -e WinDbgCourse.exe

...

[10:09:36] Exception: 0000071A
[10:09:36] The process has exited.
[10:09:36] Dump count not reached.

Of course, this is a consequence of an incorrect UnhandledException implementation, and if we repeat the same steps with the Correctly Handled Crash button we get another result:

C:\bin\procdump>procdump.exe -ma -e WinDbgCourse.exe

...

[10:09:36] Exception: C0000005.ACCESS_VIOLATION
[10:09:36] Unhandled: C0000005.ACCESS_VIOLATION
[10:09:36] Dump 1 initiated: C:\bin\procdump\WinDbgCourse.exe_180719_100936.dmp
[10:09:36] Dump 1 writing: Estimated dump file size is 247 MB.
[10:09:36] Dump 1 complete: 247 MB written in 0.3 seconds
[10:09:36] Dump count reached.
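
The demo handlers are not reproduced here, so treat the following as a minimal sketch of one common way to get each behaviour, not as the demo's actual code. All names (CrashHandlers, FormatFriendlyMessage, Install) are hypothetical, and console output stands in for the real window. The assumed incorrect variant calls Environment.Exit inside the AppDomain.UnhandledException handler, so the process exits before the exception is reported as unhandled. The assumed correct variant informs the user and simply returns, so the crash stays visible to procdump -e.

```csharp
using System;

public static class CrashHandlers
{
    // Builds the text for a user-friendly error window (hypothetical helper).
    public static string FormatFriendlyMessage(object exceptionObject)
    {
        var ex = exceptionObject as Exception;
        string kind = ex != null ? ex.GetType().Name : "unknown error";
        return "Sorry, something went wrong: " + kind;
    }

    // Assumed incorrect variant: exits inside the handler, so the process
    // ends before the exception is reported as unhandled.
    public static void Incorrect(object sender, UnhandledExceptionEventArgs e)
    {
        Console.Error.WriteLine(FormatFriendlyMessage(e.ExceptionObject));
        Environment.Exit(1);
    }

    // Assumed correct variant: informs the user and returns, so the runtime
    // still terminates the process with an unhandled exception.
    public static void Correct(object sender, UnhandledExceptionEventArgs e)
    {
        Console.Error.WriteLine(FormatFriendlyMessage(e.ExceptionObject));
    }

    // Registers one of the two handlers for the current AppDomain.
    public static void Install(bool correct)
    {
        if (correct)
            AppDomain.CurrentDomain.UnhandledException += Correct;
        else
            AppDomain.CurrentDomain.UnhandledException += Incorrect;
    }
}
```

Calling CrashHandlers.Install(true) at the very start of Main would be enough to try the second variant in a test application.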

Fortunately, not everything is lost: you can still collect a dump and extract some profit even from this window. Run a slightly shorter variant of the command, procdump -ma WinDbgCourse.exe, while the error window is still open (the output will be almost the same for both cases above):

C:\bin\procdump>procdump.exe -ma WinDbgCourse.exe

...

[10:09:36] Dump 1 initiated: C:\bin\procdump\WinDbgCourse.exe_180719_100936.dmp
[10:09:36] Dump 1 writing: Estimated dump file size is 239 MB.
[10:09:36] Dump 1 complete: 239 MB written in 0.3 seconds
[10:09:36] Dump count reached.

Because in the general case you cannot determine what the universe is trying to hide from you, I always recommend using the last variant of the procdump command when you see such a window.

Strange Things

Another interesting trick of the universe is to confront you with WTF cases where strange things happen, but so unreliably that nobody can understand what the f* is going on. In my experience, a major part of such cases is caused by code like this:

try
{
   // ...
}
catch
{
   // Empty catch block... I love it!
}
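
For contrast, here is a minimal sketch of a catch block that at least leaves a trace. The names SafeRunner and Run and the log callback are hypothetical and not part of the demo application. The only point is that the exception is recorded and then rethrown with "throw;", which keeps the original stack trace.

```csharp
using System;

public static class SafeRunner
{
    // Runs an action; on failure records the exception and rethrows it.
    public static void Run(Action action, Action<string> log)
    {
        try
        {
            action();
        }
        catch (Exception ex)
        {
            // Leave a trace instead of silently swallowing the exception.
            log(ex.ToString());

            // "throw;" (not "throw ex;") preserves the original stack trace.
            throw;
        }
    }
}
```

If the exception really must be swallowed, keeping the log call and dropping only the rethrow still leaves something to search for later.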

If the strange things happen while procdump is running (procdump -ma -e WinDbgCourse.exe), you will see something like this (click the Unreliable Message Box button):

C:\bin\procdump>procdump -ma -e WinDbgCourse.exe

...

Press Ctrl-C to end monitoring without terminating the process.

[10:09:36] Exception: E0434352.CLR
[10:09:36] Exception: E0434352.CLR

Hm... it looks like shi... not very useful. But shy procdump also has great power! The output can be made much more beautiful with the arguments -e 1 -f "", which add exception messages to the output:

C:\bin\procdump>procdump -ma -e 1 -f "" WinDbgCourse.exe

...

Press Ctrl-C to end monitoring without terminating the process.

CLR Version: v4.0.30319

[10:09:36] Exception: E0434F4D.System.Exception ("Make somebody life more interesting!")
[10:09:36] Exception: E0434F4D.System.Exception ("Make somebody life more interesting!")
[10:09:36] Exception: E0434F4D.System.Exception ("Make somebody life more interesting!")

It looks like some bastard is jeering at us, but let us not get ahead of ourselves and first collect a memory dump for this issue. For that we use the arguments -e 1 -f "interesting", and procdump will catch the first first-chance exception that contains the string interesting:

C:\bin\procdump>procdump -ma -e 1 -f "interesting" WinDbgCourse.exe

...

Press Ctrl-C to end monitoring without terminating the process.

CLR Version: v4.0.30319

[10:09:36] Exception: E0434F4D.System.Exception ("Make somebody life more interesting!")
[10:09:36] Dump 1 initiated: C:\bin\procdump\WinDbgCourse.exe_180719_100936.dmp
[10:09:36] Dump 1 writing: Estimated dump file size is 220 MB.
[10:09:36] Dump 1 complete: 220 MB written in 0.3 seconds
[10:09:36] Dump count reached.

We have a memory dump and are ready to investigate it! But we will do that in the next chapter.

Incorrect Handling of Nested Exceptions

Here we could beg the universe to stop, but it has another trick: incorrect handling of nested exceptions. Developers in an ideal world always pass the original exception to the constructor of the new exception they throw, which makes it accessible through InnerException. Developers in the real world... can be much more cruel and write code like the handler of the Incorrect Nested Exception button. Sometimes there is a chance to recover the lost exception and sometimes there is not; I will show a trick in the next chapter. Right now I suggest collecting two memory dumps, one after clicking Incorrect Nested Exception and one after clicking Correct Nested Exception. The second one is absolutely necessary because you should understand how much you can simplify someone’s life by handling nested exceptions correctly.
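
The difference between the two buttons can be sketched roughly as follows. This is not the demo's actual handler code: NestedExceptionDemo, ThrowIncorrectly, ThrowCorrectly and FailingOperation are hypothetical names used only to show the two constructor calls side by side.

```csharp
using System;

public static class NestedExceptionDemo
{
    private const string Message =
        "The first exception is not interesting, so we can use another one";

    // Incorrect: the original exception is dropped, InnerException stays null.
    public static void ThrowIncorrectly()
    {
        try
        {
            FailingOperation();
        }
        catch (NullReferenceException)
        {
            throw new Exception(Message);
        }
    }

    // Correct: the original exception is passed to the constructor
    // and becomes available through InnerException.
    public static void ThrowCorrectly()
    {
        try
        {
            FailingOperation();
        }
        catch (NullReferenceException ex)
        {
            throw new Exception(Message, ex);
        }
    }

    // Hypothetical stand-in for the real work that fails.
    private static void FailingOperation()
    {
        object nothing = null;
        nothing.ToString(); // throws NullReferenceException
    }
}
```

The only difference is the second constructor argument, but it decides whether whoever reads the dump later sees the root cause or only the wrapper.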

Live Fast, Die Young

Oh universe, thou art a heartless bitch! The next trick of the universe is "mortality at application start". Sometimes an application dies so young that it has no chance to write any logs. It is a particular thorn in one’s side when we try to understand why Windows cannot start a service application. In these cases the argument -w, which runs procdump in wait mode, can help, and it works quite well in a large number of cases. You can try it by running the demo application in a console shell with the --crash argument:

C:\bin\procdump>procdump -ma -e -w WinDbgCourse.exe

...

Waiting for process named WinDbgCourse.exe...

...

Press Ctrl-C to end monitoring without terminating the process.

[10:09:36] Exception: C0000005.ACCESS_VIOLATION
[10:09:36] Unhandled: C0000005.ACCESS_VIOLATION
[10:09:36] Dump 1 initiated: C:\bin\procdump\WinDbgCourse.exe_180716_100936.dmp
[10:09:36] Dump 1 writing: Estimated dump file size is 157 MB.
[10:09:36] Dump 1 complete: 157 MB written in 0.3 seconds
[10:09:36] Dump count reached.

Unfortunately, some applications want to live even faster, and procdump has no chance to collect a dump at all. You can try it by running the demo application in a console shell with the --fast-crash argument. Occasionally it works, but usually you get something like this:

Waiting for process named WinDbgCourse.exe...

Error debugging process:
Access is denied. (0x00000005, 5)

But even for this case we have an answer: the -w -ma -e -x <target_dir> arguments. It is much more complicated than the previous arguments, but it works because we launch the application directly under procdump:

C:\bin\procdump>procdump.exe -w -ma -e -x C:\bin\procdump C:\bin\windbg-course\WinDbgCourse.exe --fast-crash

...

Press Ctrl-C to end monitoring without terminating the process.

[10:09:36] Exception: 04242420
[10:09:36] Exception: C0000005.ACCESS_VIOLATION
[10:09:36] Unhandled: C0000005.ACCESS_VIOLATION
[10:09:36] Dump 1 initiated: C:\bin\procdump\WinDbgCourse.exe_180719_100936.dmp
[10:09:36] Dump 1 writing: Estimated dump file size is 157 MB.
[10:09:36] Dump 1 complete: 157 MB written in 0.2 seconds
[10:09:36] Dump count reached.

If you have a service application that behaves like this... well, my condolences. You can run procdump in wait mode again and again and try to start the service again and again, until you understand that implementing a --no-daemon mode in your application would be a good idea.
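
A --no-daemon switch does not have to be complicated. Below is a minimal sketch under my own assumptions: Launcher, ShouldRunInConsole and Start are hypothetical names, and the two delegates stand in for whatever your application really does in each mode.

```csharp
using System;
using System.Linq;

public static class Launcher
{
    // Returns true when the process should run as a plain console application.
    public static bool ShouldRunInConsole(string[] args)
    {
        return args.Contains("--no-daemon", StringComparer.OrdinalIgnoreCase);
    }

    // Picks the start-up path based on the command line.
    public static void Start(string[] args, Action runAsConsole, Action runAsService)
    {
        if (ShouldRunInConsole(args))
            runAsConsole();   // same work, but started directly, e.g. under procdump -x
        else
            runAsService();   // regular start through the service control manager
    }
}
```

In a real Windows service the runAsService delegate would typically wrap a call to ServiceBase.Run, while runAsConsole would execute the same start-up logic directly, so the process can be launched under procdump like any other console application.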

Is it... all?

I hope so, and I am sure it is not. So, to prepare for other tricks of the universe, I recommend running procdump -h and reading about all the modes and features procdump supports. Like other command-line tools it is very well self-documented, and that should be enough for a large number of real-life cases. If you want something more modern and with examples, then the Microsoft Docs portal is a good place to continue. At the very least I recommend reading about the -h, -c and -m arguments, which were not described in this post.

Loot Examination

If you attentively followed the plot of the previous chapter, you should have collected quite a large number of dumps:

  1. Post-mortem dump from Basic Null Reference Crash button.
  2. Post-mortem dump from Correctly Handled Crash button.
  3. Almost post-mortem dump from Incorrectly Handled Crash button.
  4. Almost post-mortem dump from Correctly Handled Crash button.
  5. Empty-catch-clause memory dump from Unreliable Message Box button.
  6. Memory dump from Incorrect Nested Exception button.
  7. Memory dump from Correct Nested Exception button.
  8. Post-mortem dump from --crash command.
  9. Post-mortem dump from --fast-crash command.

Almost all of them are quite uninteresting for dump analysis: it is straightforward and can be done with !PrintException and the simple technique from the previous post. The main reason I describe these scenarios in this post is to reveal some useful dump collection techniques which can help you with in-field debugging.

All non-post-mortem dumps (3, 4, 6, 7) have one thing in common: WinDbg does not show the following standard header for a Win32 exception, because execution is still on the CLR side:

** Unhandled exception: C0000005.ACCESS_VIOLATION'

...

This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.

But because !PrintException relies on CLR exception boundaries, it works perfectly here.

For the 7th memory dump, where the nested exception is passed to the constructor, WinDbg offers you a clickable hint (you really should appreciate how easy and cool this is):

0:000> !PrintException
Exception object: 032dad4c
Exception type:   System.Exception
Message:          The first exception is not interesting, so we can use another one
InnerException:   System.NullReferenceException, Use !PrintException c87ea138 to see more.
...

0:000> !PrintException /d 032da9f0
Exception object: 033f1b68
Exception type:   System.NullReferenceException
Message:          Object reference not set to an instance of an object.
InnerException:   <none>

Of course, the SOS extension can contain bugs too, so the address of the exception in the hint and in the actual command are different... but fortunately it still works.

The situation with the 6th dump is much worse. Nested exception handling is implemented incorrectly here, and the outer exception does not contain an inner exception:

0:000> !PrintException
Exception object: 033f1c58
Exception type:   System.Exception
Message:          The first exception is not interesting, so we can use another one
InnerException:   <none>

There is only one universal way to get the inner exception here: reproduce the scenario and use the -e 1 -f "<mask>" arguments. But if the dump is already collected and the customer is not able to reproduce it again (maybe because they are in a rage), there is a hack which usually works, based on a few tricky factors:

  • Almost all exceptions have Exception as a common part of their class name.
  • The inner exception is usually placed quite close to the outer exception on the heap.
  • The inner exception is not referenced by other objects, so it is marked as dead, but because that happened a very short time ago the GC has not had a chance to collect it.

If all the factors align and we are quite lucky, we can find the inner exception on the heap:

0:000> !dumpheap -type Exception -short -dead
032da9f0
032dad4c
033f1b68

0:000> !PrintException 033f1b68
Exception object: 033f1b68
Exception type:   System.NullReferenceException
Message:          Object reference not set to an instance of an object.

Bingo! It is exactly the same exception we have seen before! But you should understand that this is pure luck, and you should not count on it working for you.

Conclusion

I agree that it is unfair to lie in the post title and hide more powerful weapons in future parts... But you should also agree that you may not be ready for the real power of WinDbg, and it could be fatal for you. A good memory dump is sometimes even more important than advanced dump analysis techniques, and even with this arsenal you will be able to solve a lot of interesting in-field debugging puzzles.
