Sunday, August 26, 2012

Windows Debugging

User-mode debugging is supported by the OS through a debug port object.  Notification messages generated by the target process are passed to the debug port, and the debugger process handles them.

For a user-mode debugger to break into another process, it must have sufficient privilege.  That is why a user can only debug their own processes unless they hold full administrator privilege (SeDebugPrivilege).

When an exception occurs, the OS exception dispatching code in ntdll.dll notifies the debugger before any user exception handler runs.  This is called the first chance exception notification.  The debugger can choose to handle or ignore the exception.  If it chooses to ignore it, control passes to the user exception handlers.  If the SEH exception is not handled by the thread in the target process, the debugger is sent another debug event, called the second chance exception notification, to inform it that the exception went unhandled.
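The ordering described above can be sketched as a small simulation.  This is only a toy model in Python, not the real dispatcher in ntdll.dll: the function name dispatch_exception, the callback shapes and the log strings are all made up for illustration.

```python
from typing import Callable, List, Optional


def dispatch_exception(
    code: int,
    debugger: Optional[Callable[[int], bool]],
    seh_handlers: List[Callable[[int], bool]],
    log: List[str],
) -> str:
    """Toy model of the notification order; returns who handled the exception.

    `debugger` and each entry of `seh_handlers` are hypothetical callbacks
    that return True when they handle the exception code.
    """
    # 1. First chance: the debugger is notified before any user handler.
    if debugger is not None:
        log.append("first-chance")
        if debugger(code):
            return "debugger (first chance)"

    # 2. The debugger ignored it, so the user SEH handlers get their turn.
    for handler in seh_handlers:
        if handler(code):
            return "seh handler"

    # 3. Second chance: nobody handled it, so the debugger is told again.
    if debugger is not None:
        log.append("second-chance")
        return "debugger (second chance)"

    return "unhandled"
```

Calling it with a debugger that ignores everything and a handler list that also declines yields both notifications in order, which is the case where a debugger would normally stop the target.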

The default action for a first chance notification is to log it in the debugger window.  The default action for a second chance notification is always to stop the target process, because an unhandled exception is usually a genuine error.

A debugger can freeze the target process at any time, which is referred to as breaking into the debugger.  This is achieved with the DebugBreakProcess API, which internally injects a remote thread into the target process.  This "break-in" thread executes a debug break CPU interrupt instruction (int 3); the OS responds by raising an SEH exception and sending a first chance notification to the debugger.  The debug break exception (0x80000003, or STATUS_BREAKPOINT) causes all threads in the target process to be suspended.  The current thread context in the user-mode debugger after a break-in operation will be this special thread (ntdll!DbgUiRemoteBreakin).  Once in the debugger, one can switch to the thread of interest to inspect its state.

Code breakpoints are also implemented using the int 3 instruction, which the debugger writes directly over the target code in memory (kernel32!WriteProcessMemory).  The debugger keeps track of the original instruction for each code breakpoint so that it can put it back in place of the debug break instruction when the breakpoint is reached and before the user can inspect the target process in the debugger.  This makes the breakpoint transparent to the user.  To reinsert the breakpoint after the user issues g (go) to resume execution, the debugger uses the TF (Trap Flag) in the EFLAGS register, which forces the target thread to execute one instruction at a time.  This single-step flag causes the CPU to raise an interrupt (int 1) after each instruction.  The target process executes the original instruction, and then the debugger gets the chance to handle the resulting SEH exception.  The debugger restores the breakpoint instruction and clears the TF flag to disable single-step mode.
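The bookkeeping in this paragraph can be illustrated with a toy model.  This is a sketch under simplifying assumptions, not how any real debugger is written: a bytearray stands in for the target's code, the ToyDebugger class and its method names are hypothetical, and the trap flag is just a boolean rather than a bit in EFLAGS.  The only real-world constant is 0xCC, the one-byte x86 encoding of int 3.

```python
INT3 = 0xCC  # one-byte x86 opcode for int 3


class ToyDebugger:
    """Models breakpoint bookkeeping only; `memory` stands in for target code."""

    def __init__(self, memory: bytearray) -> None:
        self.memory = memory
        self.saved = {}            # address -> original byte under the breakpoint
        self.pending_rearm = None  # breakpoint to re-insert after one single step
        self.trap_flag = False     # stand-in for TF in EFLAGS

    def set_breakpoint(self, addr: int) -> None:
        # Remember the original byte, then overwrite it with int 3.
        self.saved[addr] = self.memory[addr]
        self.memory[addr] = INT3

    def on_breakpoint_hit(self, addr: int) -> None:
        # Restore the original byte so the user sees unmodified code.
        self.memory[addr] = self.saved[addr]

    def go(self, addr: int) -> None:
        # Resume from a hit breakpoint: single-step over the original
        # instruction first, so the breakpoint can be re-armed afterwards.
        self.pending_rearm = addr
        self.trap_flag = True

    def on_single_step(self) -> None:
        # The original instruction has run; re-insert int 3 and clear TF.
        if self.pending_rearm is not None:
            self.memory[self.pending_rearm] = INT3
            self.pending_rearm = None
        self.trap_flag = False
```

Walking one breakpoint through set_breakpoint, on_breakpoint_hit, go and on_single_step shows the byte at that address changing from the original value to 0xCC, back to the original, and then to 0xCC again.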

In kernel-mode debugging, the communication channel is built lower down in the architectural stack using HAL extensions.  When kernel debugging is enabled via msconfig.exe and the target machine is rebooted, the OS boot loader loads the appropriate debug transport extension module early in the boot sequence (around the time the HAL is being loaded):

kdcom.dll for serial COM cable
kd1394.dll for FireWire (IEEE 1394) cable
kdusb.dll for USB 2.0 debug cable

The target machine periodically polls for a break-in message from the host debugger (nt!KdCheckForDebugBreak).  When the message is received, the OS suspends all processes and enters the break-in state.

Because these modules sit low in the architectural stack, they cannot depend on higher-level OS kernel components, which may not have fully loaded yet or may themselves be the targets of debugging.  The OS kernel periodically asks the transport layer (as part of the clock ISR) to check for break-in packets from the host debugger.  While the target computer is suspended, the break-in loop checks for further commands (e.g. inspect a register) sent by the host kernel debugger.  Another way to enter the break-in loop is for the target machine to hit an exception.

As in the user-mode debugger, breakpoints are inserted by overwriting the target code with int 3.  Once a breakpoint is reached, the target machine enters the break-in send/receive loop, which allows the debugger to put the original byte back at the breakpoint location before entering the break-in state.

If the code has been paged out, the target machine registers the code breakpoint as "owed".  When the code page is later loaded into memory, the page fault handler (nt!MmAccessFault) in the kernel memory manager intervenes and inserts the breakpoint into the global code page at that time.
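The idea of an "owed" breakpoint can be sketched as deferred patching.  The following is a toy model with hypothetical names (ToyKernelTarget, page_in) and an assumed 4 KB page size; it is not the kernel's actual data structure, and page_in merely plays the role that the page fault handler plays in the description above.

```python
INT3 = 0xCC       # one-byte x86 opcode for int 3
PAGE_SIZE = 4096  # assumed page size for this sketch


class ToyKernelTarget:
    """Tracks resident code pages and breakpoints owed to paged-out ones."""

    def __init__(self) -> None:
        self.resident = {}  # page number -> bytearray of page contents
        self.owed = {}      # page number -> set of offsets awaiting int 3

    def set_breakpoint(self, addr: int) -> str:
        page, offset = divmod(addr, PAGE_SIZE)
        if page in self.resident:
            # The code is in memory, so patch it immediately.
            self.resident[page][offset] = INT3
            return "inserted"
        # The code is paged out: record the breakpoint as owed.
        self.owed.setdefault(page, set()).add(offset)
        return "owed"

    def page_in(self, page: int, contents: bytes) -> None:
        # Stand-in for the page fault path: load the page, then apply
        # every breakpoint that was owed to it.
        data = bytearray(contents)
        for offset in self.owed.pop(page, ()):
            data[offset] = INT3
        self.resident[page] = data
```

A breakpoint requested on a non-resident page returns "owed" and only shows up as 0xCC in the page contents after page_in runs for that page.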

Managed code (.NET MSIL) is translated into machine code lazily, when a method is actually invoked.  Therefore, when the debugger wants to insert a breakpoint, it has to wait until the method has been compiled.  The native debug events generated by the OS are not sufficient to support MSIL debugging, as only the CLR knows when a method is compiled or how a class is represented in memory.  As such, the CLR provides a dedicated helper thread, known as the debugger runtime controller (clr!DebuggerRCThread), to interact with the debugger.  Even in the break-in state, the managed target process is not fully frozen, because the helper thread still runs to service commands from the debugger.

A set of COM objects provided by mscordbi.dll forms the interface to the helper thread.  A debugger such as the Visual Studio debugger accepts user commands via its front-end UI (vsdebug.dll) and forwards them to the back end (cpde.dll), which in turn uses mscordbi.dll to communicate with the helper thread in the managed target process.  These COM objects take care of the private interprocess communication channel between the debugger and target processes.  A drawback of this architecture is that it cannot debug crash dumps, as there is no active helper thread running.

The Active Scripting specification defines a language-processing engine, which an Active Scripting host uses whenever a script needs to be interpreted.  Examples of script engines are vbscript.dll and jscript.dll (the MS implementation of JavaScript).  Examples of Active Scripting hosts include IIS (server-side scripts embedded in ASP or ASP.NET pages), IE (client-side script hosting in web pages) and the Windows Script Hosts (cscript.exe or wscript.exe).  The Active Scripting specification also defines a contract (a set of COM interfaces) for debuggers.  Hosts that support debugging (i.e. implement the COM interfaces) are called smart hosts.  A Process Debug Manager (PDM) component (pdm.dll) is shipped with the Visual Studio debugger to insulate script engines from the intricacies of script debugging.  The PDM serves the same purpose as the debugger controller thread in the CLR.  The debugging services in a script engine are usually not exposed by default and must be turned on (e.g. via an IE option).
