Saturday, July 21, 2012

Calling User code from Windows Kernel Mode

Code running in kernel mode has, in theory, unrestricted access to the whole address space, so it could invoke code running in user mode.  However, doing so requires first picking a thread to run the code in, transitioning the CPU back to user mode, and setting up the thread's user-mode context to reflect the call parameters.  Fortunately, calling user-mode code is typically required only by the OS itself; a driver needs it only in the context of a device IOCTL initiated by a user-mode thread.

A standard way for the system to execute code in the context of a given user-mode thread is to send an asynchronous procedure call (APC) to that thread.  This is how thread suspension works in Windows: the kernel sends an APC to the target thread that makes it wait on its internal thread semaphore object, causing it to become suspended.  APCs are also used in many other scenarios, such as I/O completion and thread pool callback routines.
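The queue-then-deliver behavior of an APC can be illustrated with a small toy model.  This is only a conceptual sketch in Python, not Windows code: ToyThread, queue_apc, deliver_pending_apcs and record_io_completion are hypothetical names invented for the illustration.

```python
from collections import deque


class ToyThread:
    """Hypothetical stand-in for a thread that owns a queue of pending APCs."""

    def __init__(self, name):
        self.name = name
        self.apc_queue = deque()
        self.log = []

    def queue_apc(self, routine, *args):
        # The sender only enqueues the routine; nothing runs in the sender's context.
        self.apc_queue.append((routine, args))

    def deliver_pending_apcs(self):
        # Models the point at which the target thread itself runs its queued routines.
        while self.apc_queue:
            routine, args = self.apc_queue.popleft()
            routine(self, *args)


def record_io_completion(thread, nbytes):
    # Hypothetical APC routine: runs "in the context of" the target thread.
    thread.log.append(f"{thread.name}: I/O completed, {nbytes} bytes")


worker = ToyThread("worker")
worker.queue_apc(record_io_completion, 512)
queued_before_delivery = len(worker.apc_queue)  # routine is pending, not yet run
worker.deliver_pending_apcs()
```

The point of the model is that the routine is executed by the target thread when it drains its own queue, not by whoever queued it.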

Calling Windows Kernel Mode from User Code

The most basic way to call kernel-mode code from a user-mode application is via a system call.  This mechanism uses native CPU support to implement the transition.  One drawback is that it relies on a hard-coded table of well-known executive service routines to dispatch the request from the client code to the target kernel routine.  This does not scale to extensions such as drivers.
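A fixed dispatch table of this kind can be sketched as follows.  This is a toy Python model only: the routine names and the service numbers are hypothetical and do not correspond to real Windows system service numbers.

```python
def toy_close(handle):
    # Hypothetical service routine.
    return f"closed {handle}"


def toy_yield():
    # Hypothetical service routine.
    return "yielded"


# The table is fixed when the "kernel" is built; a tuple makes that explicit.
SERVICE_TABLE = (toy_close, toy_yield)


def system_call(service_number, *args):
    # The caller supplies only an index; the table decides which routine runs.
    if not 0 <= service_number < len(SERVICE_TABLE):
        raise ValueError("invalid system service number")
    return SERVICE_TABLE[service_number](*args)
```

Because the table is closed, code loaded later (a driver, in the real system) has no slot of its own, which is the limitation described above.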

For those cases, another mechanism is used: I/O control commands (IOCTL), sent via the generic kernel32!DeviceIoControl API.  The API takes the user-defined IOCTL identifier as one of its parameters, along with a handle to the device object that the request is dispatched to.  The transition to kernel mode is still performed in the NTDLL layer (ntdll!NtDeviceIoControlFile), which internally also uses the system call mechanism.  So IOCTL is a higher-level communication protocol built on top of the system call.
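As an aside on what a user-defined IOCTL identifier looks like: in the Windows headers it is normally built with the CTL_CODE macro, which packs a device type, an access mask, a function number and a buffering method into one 32-bit value.  The Python sketch below mirrors that bit layout; the constant values are the ones from the Windows headers as far as I know, while IOCTL_EXAMPLE and the decode helper are hypothetical names used only for illustration.

```python
FILE_DEVICE_UNKNOWN = 0x22   # device type commonly used by custom drivers
METHOD_BUFFERED = 0          # buffering method
FILE_ANY_ACCESS = 0          # required access


def ctl_code(device_type, function, method, access):
    # Same bit layout as the CTL_CODE macro:
    # device type in bits 16-31, access in 14-15, function in 2-13, method in 0-1.
    return (device_type << 16) | (access << 14) | (function << 2) | method


def decode(code):
    # Hypothetical helper that splits an IOCTL value back into its fields.
    return {
        "device_type": (code >> 16) & 0xFFFF,
        "access": (code >> 14) & 0x3,
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,
    }


# Function numbers from 0x800 upward are the range intended for custom IOCTLs.
IOCTL_EXAMPLE = ctl_code(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS)
```

With these inputs the packed value is 0x222000, and decode recovers the four fields.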

I/O control commands are processed by the I/O manager of the executive, which builds an I/O request packet (IRP) and routes it to the device object requested by the user-mode caller.  The device has an associated device stack that handles its requests.  The IRP filters down the stack, giving each driver a chance to either process or ignore the request.  IRPs are also used by drivers to send requests to other drivers.
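The way a request filters down a device stack can be modeled in a few lines.  This is a deliberately simplified Python sketch, not the real I/O manager: Irp, Driver, send_irp and the driver names are all hypothetical, and real drivers have more options than "handle" or "pass down".

```python
class Irp:
    """Toy request packet: carries the IOCTL code and records who saw it."""

    def __init__(self, ioctl_code):
        self.ioctl_code = ioctl_code
        self.status = "pending"
        self.trace = []


class Driver:
    def __init__(self, name, handled_codes=()):
        self.name = name
        self.handled_codes = set(handled_codes)

    def dispatch(self, irp):
        # Every driver in the stack gets a look at the request.
        irp.trace.append(self.name)
        if irp.ioctl_code in self.handled_codes:
            irp.status = f"completed by {self.name}"
            return True   # processed here; stop passing it down
        return False      # ignored; pass to the next lower driver


def send_irp(device_stack, irp):
    # The stack is ordered top to bottom.
    for driver in device_stack:
        if driver.dispatch(irp):
            return irp
    irp.status = "not supported"
    return irp


stack = [
    Driver("upper_filter"),
    Driver("function_driver", {0x222000}),
    Driver("bus_driver"),
]
irp = send_irp(stack, Irp(0x222000))
```

In this run the upper filter ignores the request, the function driver completes it, and the bus driver never sees it.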

Windows Kernel Layering

Windows is a monolithic system.  Drivers share the same address space as the Windows kernel and have the same unrestricted access to memory and hardware.  Several parts of the OS, such as the NT file system and the TCP/IP stack, are implemented as drivers rather than included in the kernel binary.

Above the HAL layer is the Windows kernel (ntoskrnl).  The kernel implements core low-level OS services such as thread scheduling, multiprocessor synchronization and interrupt/exception dispatching.  It also contains a set of routines that the executive layer uses to expose higher-level semantics to user-mode applications.

The executive is also hosted in the same kernel module (ntoskrnl).  It performs core services such as process/thread management and I/O dispatching.  Other functions include the security reference monitor, the plug and play manager, power management, and the cache and memory managers.  The executive exposes callable functions, called system services, to other components such as drivers and to user mode.  The typical entry point in user mode is ntdll.dll (which contains the instructions that perform the transition to kernel mode).  The executive allows user-mode processes to access kernel objects such as processes, threads and events via an indirection called a handle, which is tracked in a per-process handle table.
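The per-process handle table can be pictured as a small lookup structure owned by each process.  The Python sketch below is a toy model under that assumption: HandleTable and its methods are hypothetical names, and the choice of handle values is arbitrary.

```python
class HandleTable:
    """Toy per-process table mapping opaque handle values to kernel objects."""

    def __init__(self):
        self._entries = {}
        self._next_handle = 4   # arbitrary starting value for the model

    def open(self, kernel_object):
        handle = self._next_handle
        self._next_handle += 4
        self._entries[handle] = kernel_object
        return handle

    def reference(self, handle):
        # User code never holds the object itself, only the handle value.
        if handle not in self._entries:
            raise ValueError("invalid handle")
        return self._entries[handle]

    def close(self, handle):
        if handle not in self._entries:
            raise ValueError("invalid handle")
        del self._entries[handle]


# Two processes, two independent tables.
table_a = HandleTable()
table_b = HandleTable()
h_a = table_a.open("event object X")
h_b = table_b.open("thread object Y")
```

Since each process has its own table, the same numeric handle value can name different objects in different processes, and a handle is meaningless outside the process that owns it.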

The Win32 UI and graphics services are implemented by an extension to the kernel (win32k.sys), which exposes system services to applications via entry points in user32.dll.

Several core facilities of the Windows OS are primarily implemented in user mode instead of kernel mode.

User sessions in Windows represent resource and security boundaries and offer a virtualized view of the keyboard, mouse and display to support multiuser logon on the same OS.  The state that backs these sessions is tracked in a kernel-mode virtual address space called session space.  In user mode, the session manager subsystem process (smss.exe) is used to start and manage these user sessions.

During Windows boot-up, a leader smss.exe instance that is not associated with any session is created.  This leader instance creates a copy of itself for each new session, and each copy then starts winlogon.exe and csrss.exe for its session.  Using multiple smss.exe instances can provide faster logon of multiple users on Windows servers acting as terminal servers.

Winlogon.exe is responsible for managing user logon and logoff.  In particular, this process starts the graphical UI process that displays the logon screen when the user presses Ctrl+Alt+Del, and it displays the desktop after a successful logon.  Each session has its own instance of the winlogon.exe process.

Csrss.exe, the client/server runtime subsystem process, is the user-mode part of the Win32 UI subsystem (whose kernel-mode part is win32k.sys).  Prior to Windows 7, it also ran the UI message loop of console applications.  Each session has its own instance of csrss.exe.

The local security authority subsystem (lsass.exe) is used by winlogon.exe to authenticate user accounts during logon.  It generates a security token object that represents the user's security rights, which is then used to create the explorer process for the user session.  New child processes created from the explorer shell inherit their access tokens from the security token of the initial explorer process.  There is only one instance of lsass.exe, which runs in the noninteractive session (known as session 0).
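The inheritance of the access token from the explorer process to its children can be shown with a minimal model.  This Python sketch is illustrative only: Token, Process and create_child are hypothetical names, the user name is made up, and the model simply hands the child the parent's token contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Toy security token: who the user is and which privileges they hold."""
    user: str
    privileges: frozenset


class Process:
    def __init__(self, name, token):
        self.name = name
        self.token = token

    def create_child(self, name):
        # In this model a child starts out with its parent's token contents.
        return Process(name, self.token)


# Token produced at logon (hypothetical user and privilege set).
logon_token = Token("alice", frozenset({"SeChangeNotifyPrivilege"}))
explorer = Process("explorer.exe", logon_token)
notepad = explorer.create_child("notepad.exe")
```

Every process launched from the shell therefore carries the identity established at logon unless something explicitly supplies a different token.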

Services.exe, the NT service control manager (SCM), runs in session 0.  It is responsible for starting a special class of user-mode processes called Windows services.  These processes are typically used to carry out background tasks that do not require terminal interaction.  They can choose to run with the highest privileges in Windows (the LocalSystem account), so they are often used to perform privileged tasks on behalf of user-mode applications.

Processes running under the LocalSystem account (the most privileged account in Windows) are part of the trusted computing base (TCB) and are able to bypass any check by the security subsystem in the OS.