Saturday, July 20, 2013

Interrupt

An interrupt service routine (ISR), or interrupt handler, is invoked to handle an interrupt event.  In real mode, the first 1 KB of memory (addresses 0x000 to 0x3FF) contains the IVT (interrupt vector table).  In protected mode, the equivalent structure is the IDT (interrupt descriptor table).  Both the IVT and the IDT map interrupts to their ISRs.

In real mode, the IVT stores the logical address of each ISR sequentially.  Each entry is 4 bytes: 2 for the segment selector and 2 for the effective address (offset).  The IVT contains 256 entries.

Under MS-DOS, the BIOS handles interrupts 0-31.  DOS system calls map to interrupts 32-63.  The remaining interrupts, 64-255, are user defined.
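As a rough illustration of the table layout, here is a minimal Python sketch that looks up an ISR pointer in a byte array standing in for the first 1 KB of memory. The function names are made up for this example. It also assumes the usual x86 layout in which the offset occupies the low word of each entry and the segment the high word, a detail the notes above do not spell out.

```python
def ivt_entry_address(vector):
    """Address of the 4-byte IVT entry for an interrupt vector (0..255)."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be in 0..255")
    return vector * 4


def read_isr_pointer(memory, vector):
    """Return (segment, offset) of the ISR registered for a vector.

    Assumption: little-endian words, offset first, then segment.
    """
    addr = ivt_entry_address(vector)
    offset = memory[addr] | (memory[addr + 1] << 8)
    segment = memory[addr + 2] | (memory[addr + 3] << 8)
    return segment, offset
```

For example, vector 0x21 lives at address 0x84, and the last entry (vector 255) starts at 0x3FC, which is why the table ends at 0x3FF.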

There are 3 types of interrupts:
(1) Hardware interrupts (external interrupts) are generated by external devices.  They are either maskable or non-maskable.  Maskable interrupts can be disabled by clearing the IF flag with the CLI instruction.  A non-maskable interrupt (NMI) cannot be ignored and is always handled by the processor.

(2) Software interrupts (internal interrupts) are raised by programs using the INT instruction.  INT takes an integer operand that represents the interrupt vector to invoke.  INT pushes FLAGS, CS and IP onto the stack (saving the state and the return address), clears TF (Trace Flag) and IF (no tracing, and maskable interrupts disabled while the handler executes), then jumps to the ISR, which runs until IRET.

(3) Exceptions are generated when the processor detects an error while executing an instruction.  There are 3 types of exception (faults, traps and aborts), which differ in how the error is reported and whether the instruction can be restarted.  When a fault occurs, the processor reports the exception at the boundary preceding the offending instruction.  In other words, the state is reset to allow the instruction to restart.  Interrupt 0 (divide by zero) is an example of a fault.  When a trap occurs, no instruction restart is possible.  The processor reports it at the boundary preceding the next instruction.  Examples of traps are 3 (breakpoint) and 4 (overflow).  When an abort occurs, the program cannot be restarted.
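The real-mode INT sequence described in (2) can be sketched as a toy simulation. This is only a model under simplifying assumptions: the CPU is a plain dictionary, the stack is a Python list, the IVT is a dictionary from vector to (segment, offset), and `cpu["ip"]` is assumed to already point at the instruction after INT. All names are hypothetical.

```python
TF = 1 << 8  # Trace Flag, bit 8 of FLAGS
IF = 1 << 9  # Interrupt Flag, bit 9 of FLAGS


def software_interrupt(cpu, stack, ivt, vector):
    """Model of INT n: save state, mask interrupts, jump to the ISR."""
    stack.append(cpu["flags"])            # push FLAGS
    cpu["flags"] &= ~(IF | TF) & 0xFFFF   # clear IF and TF
    stack.append(cpu["cs"])               # push CS
    stack.append(cpu["ip"])               # push IP (return address)
    cpu["cs"], cpu["ip"] = ivt[vector]    # far jump to the handler


def iret(cpu, stack):
    """Model of IRET: restore IP, CS and FLAGS in reverse order."""
    cpu["ip"] = stack.pop()
    cpu["cs"] = stack.pop()
    cpu["flags"] = stack.pop()
```

Because FLAGS is pushed before IF and TF are cleared, IRET restores the caller's original interrupt and trace settings.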

In protected mode, the IDT stores an array of 64-bit gate descriptors.  These gate descriptors can be interrupt gates, trap gates or task gates.

Unlike the IVT, the IDT can reside anywhere in the linear address space.  The 32-bit base address of the IDT is stored in the 48-bit IDTR register (bits 16 to 47).  The limit of the IDT (its size in bytes minus 1) is stored in bits 0 to 15.  IDTR is loaded and stored with the LIDT and SIDT instructions.  A reference beyond the IDT limit generates a general-protection exception.  As in real mode, the maximum number of IDT entries is 256.  Entries 0 to 31 are reserved by the IA-32 processor for various interrupts and exceptions.
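A small Python sketch of how the IDTR value could be split and used to locate a gate descriptor. It treats IDTR as a 48-bit integer and assumes the 16-bit field holds the limit, meaning the offset of the last valid byte (table size minus 1). The function names are invented for this example, and the exception is modelled as a Python `IndexError`.

```python
def decode_idtr(idtr):
    """Split a 48-bit IDTR value into (base, limit)."""
    limit = idtr & 0xFFFF               # bits 0..15
    base = (idtr >> 16) & 0xFFFFFFFF    # bits 16..47
    return base, limit


def gate_address(idtr, vector):
    """Linear address of the 8-byte gate descriptor for a vector."""
    base, limit = decode_idtr(idtr)
    offset = vector * 8                 # each gate descriptor is 64 bits
    if offset + 7 > limit:
        # The processor would raise a general-protection exception here.
        raise IndexError("vector lies beyond the IDT limit")
    return base + offset
```

A full table of 256 entries needs a limit of 256 x 8 - 1 = 0x7FF.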

Gate descriptors allow programs to access code segments with different privilege levels.  Gate descriptors are system descriptors (with the S-flag cleared).  The type of a gate descriptor is encoded in the TYPE field.  Gates can be 16-bit or 32-bit.  This allows the system to determine whether the 16-bit or 32-bit variant of the stack push is used.

Call gate descriptors live in the GDT.  Instead of storing a 32-bit base linear address like a code or data segment descriptor, a call gate stores a 16-bit segment selector and a 32-bit offset address.  The segment selector references a code segment in the GDT.  The offset address points to the entry point of the procedure within that segment.  In effect, it is a descriptor in the GDT that points (via the selector) to another descriptor in the GDT, which points to a code segment (to which the offset address is then applied).

Jumping to a new segment through a call gate requires 2 conditions:
(1) CPL of program and RPL of the selector for the call gate <= DPL of the call gate descriptor
(2) CPL of program >= DPL of the destination code segment
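The two conditions can be written as a short predicate. This is a sketch with hypothetical names, where privilege levels are the integers 0 (most privileged) to 3 (least privileged).

```python
def call_gate_allowed(cpl, rpl, gate_dpl, target_dpl):
    """Check both call gate conditions for a transfer through the gate."""
    # (1) Caller and selector must be at least as privileged as the gate.
    gate_ok = max(cpl, rpl) <= gate_dpl
    # (2) The destination must not be less privileged than the caller.
    target_ok = cpl >= target_dpl
    return gate_ok and target_ok
```

For instance, user code (CPL 3) can reach a ring 0 routine through a gate with DPL 3, but not through a gate with DPL 0.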

Interrupt gate and trap gate descriptors behave like call gates, except that they reside in the Interrupt Descriptor Table (IDT).  The segment selector specifies a code segment in the GDT.  The effective address points to the entry point of the service routine in that segment.  So both kinds of descriptor end up referring to the GDT.  The only difference between an interrupt gate and a trap gate is that the processor clears IF in EFLAGS when the handler is accessed through an interrupt gate.  For a trap gate, the IF value is unchanged.

For the security check, the CPL of the program invoking the handler must be less than or equal to the DPL of the gate.  This condition is only checked when the handling routine is invoked by software (e.g. INT).  The DPL of the code segment that the gate's segment selector points to must be less than or equal to the CPL.
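To make the gate layout concrete, here is a hedged Python sketch that unpacks a 64-bit IDT gate descriptor given as an integer. The bit positions and type codes follow the IA-32 layout as I understand it from the Intel manuals (offset split across bits 0-15 and 48-63, selector in bits 16-31, TYPE in bits 40-43, DPL in bits 45-46, P in bit 47); they are not stated in the notes above, and the function names are invented for this example.

```python
GATE_TYPES = {
    0x5: "task gate",
    0x6: "16-bit interrupt gate",
    0x7: "16-bit trap gate",
    0xE: "32-bit interrupt gate",
    0xF: "32-bit trap gate",
}


def decode_gate(desc):
    """Unpack the fields of a 64-bit interrupt or trap gate descriptor."""
    offset_low = desc & 0xFFFF
    offset_high = (desc >> 48) & 0xFFFF
    return {
        "offset": (offset_high << 16) | offset_low,
        "selector": (desc >> 16) & 0xFFFF,
        "type": GATE_TYPES.get((desc >> 40) & 0xF, "other"),
        "dpl": (desc >> 45) & 0x3,
        "present": bool((desc >> 47) & 1),
    }


def clears_if(desc):
    """True when entering the handler through this gate clears IF."""
    return "interrupt gate" in decode_gate(desc)["type"]
```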

Sunday, July 14, 2013

Protection through Segmentation

Checks are performed during logical to linear address translation when segmentation is enabled.

(1) Limit check uses the 20-bit limit field to ensure that a program does not access memory beyond the segment.  The processor also checks the limit field in GDTR to ensure that segment selectors do not reference entries beyond the GDT.

(2) Type check uses the S-flag and Type field to ensure that the proper type is used.  For example, CS can only be loaded with a code segment.  Using the null descriptor to access memory generates a general-protection exception.

(3) Privilege check uses privilege levels.  The Current Privilege Level (CPL) is the RPL in the CS or SS register used by the executing program.  CPL can be changed via a far call or jump instruction.  The privilege check happens when the segment selector associated with a segment descriptor is loaded.  This happens when a program accesses data in another segment or passes control to another code segment.  A privilege violation generates a general-protection exception.

To access data in another data segment, the selector must be loaded into SS or one of the data segment registers (DS, ES, FS, GS).  A selector can only be loaded into CS via instructions such as JMP, CALL, RET, IRET, SYSENTER and SYSEXIT.

To access data in another segment, the DPL of the target segment must be numerically the same as or higher than both the CPL and the RPL.

To load the stack segment register, both DPL of the stack segment and the corresponding RPL must be same as CPL.

When transferring control to nonconforming code, the CPL must be equal to the DPL of the destination segment.  In other words, the privilege level must be equal on both sides of the fence.  In addition, the RPL of the selector for the destination segment must be less than or equal to the CPL.  Nonconforming code cannot be accessed by a program with less privilege.

When transferring control to conforming code, the calling code's CPL must be greater than or equal to the DPL of the destination code.  RPL is not checked in this case.

(4) Restricted instruction check verifies that the program does not use privileged instructions such as LGDT, LIDT, MOV to a control register, HLT (halt the processor), WRMSR (write to a model-specific register), etc.
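The privilege rules listed under check (3) can be summarized as four small predicates. This is only a sketch with hypothetical function names. Privilege levels are integers from 0 (most privileged) to 3, and the control transfer rules cover direct transfers that do not go through a gate.

```python
def can_load_data_segment(cpl, rpl, dpl):
    """Data access: DPL must be numerically >= both CPL and RPL."""
    return max(cpl, rpl) <= dpl


def can_load_stack_segment(cpl, rpl, dpl):
    """Stack segment: DPL and RPL must both equal CPL."""
    return cpl == rpl == dpl


def can_transfer_nonconforming(cpl, rpl, dpl):
    """Nonconforming code: CPL must equal DPL, and RPL <= CPL."""
    return cpl == dpl and rpl <= cpl


def can_transfer_conforming(cpl, dpl):
    """Conforming code: CPL must be >= DPL; RPL is not checked."""
    return cpl >= dpl
```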

Write Protection

Bit 16 of CR0 is the WP (Write Protect) bit. When WP is set, supervisor-mode code cannot write to read-only user pages. This mechanism is used to implement the copy-on-write technique that UNIX uses when creating a process.

PDE and PTE

Both the PDE and the PTE are 32 bits in length.

The high-order 20 bits (12 to 31) contain the base address of the page table or page. The address is expanded to 32 bits implicitly by appending 12 trailing zeros.

The Avail field (bits 9 to 11) is available for OS use.

The Global (G) flag (bit 8) is ignored in a PDE. In a PTE, it helps to keep frequently accessed pages from being flushed out of the TLB.

Bit 7 in a PDE represents the page size. When clear, 4 KB pages are used. In a PTE, the bit selects the Page Attribute Table (PAT) entry.

Bit 6 is clear in a PDE. In a PTE, it indicates whether the page is dirty (written to).

Accessed (bit 5) indicates whether the page or page table has been accessed (read or written).

PCD (bit 4) is the page cache disabled flag. When set, the page or page table will not be cached.

PWT (bit 3) is the page write through flag. When set, page write through is enabled for this page or page table 

U/S (bit 2) indicates whether the page has user (set) or supervisor (clear) privilege

R/W (bit 1) specifies the protection for this page. Set means R/W and clear means R/O for the page the entry points to

P (bit 0) is the Present bit which indicates if the page or page table is loaded in memory currently (set)
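The bit layout above maps directly onto a few masks and shifts. The following is a minimal sketch; the flag labels and the function name are chosen just for this example.

```python
ENTRY_FLAGS = [
    (0, "P"), (1, "R/W"), (2, "U/S"), (3, "PWT"), (4, "PCD"),
    (5, "A"), (6, "D"), (7, "PS/PAT"), (8, "G"),
]


def decode_entry(entry):
    """Decode a 32-bit PDE or PTE into base address, Avail bits and flags."""
    return {
        "base": entry & 0xFFFFF000,    # bits 12..31, low 12 bits implied zero
        "avail": (entry >> 9) & 0x7,   # bits 9..11, free for OS use
        "flags": [name for bit, name in ENTRY_FLAGS if (entry >> bit) & 1],
    }
```

For example, the entry 0x00ABC067 describes a present, writable, user-accessible page at 0x00ABC000 that has been both accessed and written.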

Saturday, July 13, 2013

Protected Mode Paging

Without paging enabled, the linear address (formed by translating a logical address used in program via the segmentation process) is a physical address.

With paging, the linear address goes through another round of translation to form the final physical address.

The high order bits (22 to 31) in the linear address index into a page directory. The physical address of the page directory is stored in CR3 (known as PDBR or page directory base register). As there are 10 bits, the number of entries in the page directory is 1024. Each page directory entry (PDE) contains the physical address of a page table.

Bits 12 to 21 select a particular entry in the page table. Again, as the field is 10 bits long, each page table holds 1024 PTEs. Each PTE stores the physical address of a page in memory. The low 12 bits (0 to 11) are the offset within that 4 KB page. The total memory space that paging with 4 KB pages can map is 1024 x 1024 x 4 KB = 4 GB.

If Physical Address Extension (PAE) is enabled, the physical address space expands to 64 GB. PAE adds another data structure to the address translation process. PAE was introduced with the Pentium Pro and increased the number of address lines of the processor from 32 to 36.
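The two-level walk can be sketched in Python as follows. This is a toy model under stated assumptions: `tables` is a hypothetical dictionary that maps the physical base address of a page directory or page table to a list of 1024 32-bit entries, only 4 KB pages are handled, and a missing Present bit is reported as a Python `LookupError` standing in for a page fault.

```python
def split_linear(addr):
    """Split a 32-bit linear address into (PDE index, PTE index, offset)."""
    return (addr >> 22) & 0x3FF, (addr >> 12) & 0x3FF, addr & 0xFFF


def translate(linear, cr3, tables):
    """Walk the page directory and page table to get a physical address."""
    pde_index, pte_index, offset = split_linear(linear)
    pde = tables[cr3 & 0xFFFFF000][pde_index]
    if not (pde & 1):
        raise LookupError("page fault: page table not present")
    pte = tables[pde & 0xFFFFF000][pte_index]
    if not (pte & 1):
        raise LookupError("page fault: page not present")
    return (pte & 0xFFFFF000) | offset
```

For example, the linear address 0x00403025 splits into directory index 1, table index 3 and page offset 0x025.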

Protected Mode Segmentation

In Real Mode, the segment registers contain a segment selector which, shifted left by 4 bits, gives the base address of the segment.
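For comparison with the protected-mode scheme, real-mode address formation is a single shift and add. The sketch below assumes original 8086 behaviour, where the result wraps at 20 bits; the function name is made up for this example.

```python
def real_mode_physical(segment, offset):
    """Physical address for a real-mode segment:offset pair (8086 wrap)."""
    return ((segment << 4) + offset) & 0xFFFFF
```

Different segment:offset pairs can name the same byte. For instance, 0x1234:0x0010 and 0x1235:0x0000 both resolve to 0x12350.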

In protected mode, the segment selector points to a specific entry in a table.

There are 2 types of descriptor table: the GDT and the LDT. There is only one GDT, shared by all tasks in the entire system. An LDT can be used by one task or one group of tasks. The GDT is located through a special register, GDTR, which is manipulated by privileged instructions executable only by the OS.

Each segment register is paired with an invisible part called the segment cache register, which contains the content of the corresponding 8-byte table entry (called a descriptor) in the GDT or LDT.

The selector is 16 bits in length. The highest 13 bits specify an entry in the descriptor table. In other words, there can be up to 8K entries in a descriptor table. The next bit indicates whether the table is the GDT or the LDT. The last 2 bits indicate the privilege level of the selector.
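The three selector fields can be pulled apart with a shift and two masks. This is a small sketch with an invented function name; it assumes the table indicator bit is set for the LDT and clear for the GDT.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into index, table and RPL."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,                      # bits 3..15
        "table": "LDT" if selector & 0x4 else "GDT",  # bit 2
        "rpl": selector & 0x3,                       # bits 0..1
    }
```

For example, the selector 0x23 refers to entry 4 of the GDT with a requested privilege level of 3.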

The descriptor is 64-bits in length and contains
1. base address of the segment (32-bits)
2. size of the segment (20 bits). The G-flag (1 bit) determines how the size is interpreted (clear means the size is in bytes, from 1 byte to 1 MB; set means the size is in 4 KB increments, from 4 KB to 4 GB)
3. S-flag indicates whether it is a system segment (clear) or an application segment (set). System segment descriptors are used to jump to segments that have higher privilege than the currently executing task
4. Type (4 bits) is used with the S-flag to further define the segment. If the descriptor is an application segment, bit 11 defines whether it is code (set) or data (clear). For a data segment, bits 10/9/8 represent the direction of growth (clear = up and set = down), RO/RW, and whether it has been accessed, respectively. For a code segment, the same 3 bits represent whether the code is conforming (set) or not, Execute-only or Execute/Read, and whether it has been accessed. A non-conforming code segment cannot be accessed by a program that is executing with less privilege (a higher CPL value). In other words, RPL <= CPL and CPL = DPL
5. DPL (descriptor privilege level) 
6. P-flag defines if the segment is currently in memory (set)
7. AVL is a bit available for OS use
8. L-flag defines whether the segment contains 64-bit code. This bit is clear for 32-bit (IA-32) code.
9. D/B flag has a different meaning depending on whether the segment is code, data or stack.
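The fields listed above are scattered across the 64-bit descriptor, so a decoding sketch may help. The bit positions below follow the IA-32 segment descriptor layout as I understand it from the Intel manuals and are not given in the notes themselves. The stored 20-bit value is treated as a limit (size minus 1), and the function name is invented for this example.

```python
def decode_descriptor(desc):
    """Unpack a 64-bit code or data segment descriptor given as an int."""
    limit = (desc & 0xFFFF) | (((desc >> 48) & 0xF) << 16)            # 20 bits
    base = ((desc >> 16) & 0xFFFFFF) | (((desc >> 56) & 0xFF) << 24)  # 32 bits
    granular = bool((desc >> 55) & 1)                                 # G-flag
    seg_type = (desc >> 40) & 0xF
    return {
        "base": base,
        "limit": limit,
        # G clear: byte units. G set: 4 KB units.
        "size": (limit + 1) * 4096 if granular else limit + 1,
        "type": seg_type,
        "is_code": bool(seg_type & 0x8),   # top Type bit: code vs data
        "s": (desc >> 44) & 1,             # 1 = application segment
        "dpl": (desc >> 45) & 0x3,
        "present": bool((desc >> 47) & 1),
        "avl": (desc >> 52) & 1,
        "l": (desc >> 53) & 1,
        "db": (desc >> 54) & 1,
    }
```

For example, 0x00CF9A000000FFFF, a value commonly used for a flat ring 0 code segment, decodes to base 0 and a size of 4 GB.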

The first descriptor entry is always empty and is called the null segment descriptor. A selector pointing to this entry is called a null selector.

There are other types of descriptor in GDT:
1. Task State Segment (TSS)
2. Local Descriptor Table (LDT) 
3. Code, data or stack memory segments to be accessible by multiple tasks
4. Procedure call gates used to control access to privileged programs (e.g. IO routines) by less privileged ones (user)
5. Task gates used to switch to another task

The LDT (Local Descriptor Table) is accessed via the GDT (see 2). The 16-bit segment selector of the LDT is stored in the TSS so that it can be loaded at a task switch. The TSS is a memory area that keeps the context of a task when it is switched out.  It contains the general registers, the segment registers, the LDT selector field, EFLAGS, EIP, ESP, CR3 (Page Directory Address) etc. When a user program (privilege level 3) calls into a more privileged program (level 0 to 2), the processor automatically switches to a new stack. Therefore, the TSS also keeps 3 additional ESP values to record the stack top for each of those levels.

Thursday, July 11, 2013

Real Mode Segmentation

The Real Mode environment is based on the 8086/88 processors.  There are 6 segment registers, 4 general purpose registers, 3 pointer registers, 2 index registers and a flag register.  All registers are 16-bit.

The first 4 segment registers (CS, DS, SS and ES) store segment selectors, each of which is the first half of a logical address. FS and GS came after the 8086/88.

CS stores the base address of the current executing code segment
DS stores the base address of segment storing global data
SS stores the base address of the stack segment
ES stores the base address of segment for string data
FS and GS store the base addresses of 2 more segments for global data

The 3 pointer registers are IP (instruction pointer), SP (stack pointer) and BP (base pointer, used to build stack frames for function calls)

The 4 GPR are

AX = accumulator used for arithmetic functions
BX = base register used as index to address memory indirectly
CX = counter often used in loop
DX = data register used with AX

The 2 index registers are

SI = points to address of source in string operation
DI = points to address of destination in string operation

Real mode uses segmentation to manage memory. A jump operation needs to distinguish whether the jump is within a segment (NEAR) or across segments (FAR).  Several instructions result in a jump.  SHORT and NEAR direct jumps are relocatable, which means that their binary encoding holds a displacement relative to IP rather than a specific address.

INT and IRET are intrinsically far jumps, as both of them involve the segment selectors.

JMP and CALL can be near or far depending on how they are invoked.

JMP SHORT label is a SHORT jump (within -128 to +127 bytes)
JMP FAR PTR label is a FAR direct jump
JMP DX is a NEAR indirect jump
JMP DS:[label] is a FAR direct jump
JMP DWORD PTR [BX] is a FAR indirect jump

CALL label is a NEAR jump
CALL BX is a NEAR indirect jump
CALL DS:[label] is a FAR direct jump
CALL DWORD PTR [BX] is a FAR indirect jump
RET is a NEAR return
RETF is a FAR return