Record: August 2011

Monday, August 29, 2011

Broadband

Most LAN technology uses baseband signalling which mean only one signal is transmitted through the medium at a time. Broadband signaling on the other hand allows multiple signals on the medium simultaneously.

Sunday, August 28, 2011

Layer 1 Devices

Repeaters are the most basic form of forwarding device. They associate with Layer 1 as they do not have the ability to inspect the content of the headers. They simply regenerate the exact copy of the electrical signal. They are primarily used to extend the distance of cable run. They are seldonly used now due to advance of fibre optics that spans great distance.

Hub, or concentrator are simply multiport repeaters. The concept is the same. A signal delivered to any port on the hub is regenerated and forwarded to all ports. Again, there is no examination of frame. Hub is generally deployed in small LAN. When the number of nodes increases, perform degrades significantly as bandwidth are contended.

Ethernet is a baseband medium, meaning only 1 signal will be transmitted at any one time. When 2 nodes attempt to transmit at the same time, collision results. The nodes will reattempt to transmit at a later time. Therefore, hubs and repeaters in a network form a collison domain.

As the size of network grow, the collision domain becomes less efficent. Solution needs to partition the collision domain.

Monday, August 8, 2011

Memory Access

First level cache often has access time between 1 to 3 processor cycles. Access to second level cache requires 20 to 30 cycles and access to memory will needs >100 cycles.

In a multi-core system, each core has memory directly attached to it. Access to the local memory is faster than access to the memory connected to another core.

One approach is to interleave memory, commonly at cache line boundary so that application would access local and remote memory alternatively. On average, application will experience memory latency which is average of the local and remote memory access. This architecture is called uniform memory architecture (UMA) where all processor see uniform memory latency

Another approach is to accept the difference memory region will have different access frequency and make the OS aware of this. This architecture is called cache coherent nonuniform memory architecture (ccNUMA). OS will attempted to assign process to run on the same processor to maximize local memory access.

Cache

Direct mapped cache - each cache line in memory maps directly to a fixed cache line position. If the cache is 4KB in size, every memory location in 4KB apart maps to the same cache line.

N-way associative cache - each cache line in memory maps to a set of N possible location in cache. The location chosen will be determined by some replacement policies (e.g. random or LRU etc).

Fully associative cache - each cache line in memory maps to any position in cache. This is rarely implemented because of its complexity

First level cache access time is 1 to 3 cycles. Second level cache access time is 20 to 30 cycles. Memory access takes more than 100 cycles. The reason for having multiple level of cache because the larger the cach, the longer it takes to find out if an item is stored there.

Saturday, August 6, 2011

Initial RAM Disk in Linux

In the final boot step, the Linux kernel will execute a series of run_init_process calls:

run_init_process("/sbin/init");
run_init_process("/etc/init");
run_init_process("/bin/init");
run_init_process("/bin/sh");
panic("no init found. try passing init= option tp kernel.")

The run_init call execite program in a filesystem already mounted, else the kernel will panic. Linux kernel contains 2 mechanisms to mount an early root file system to perform certain start up functions.
(1) initial RAM disk (initrd) is the legacy method. Support for this must be compiled into the kernel. It is a small, self-contained root file system that usually contains directives to load specific device driveers before the end of boot cycle. For example, Red Hat and Ubbuntu uses initrd to load the EXT3 file system driver before mounting the real root file system.
The bootloader pass the initrd image to the kernel. A common scenario is that the bootloader loads the kernel image into memory and loads the initrd image to another address ranges. Bootloader than passes the load address of initrd to kernel. In other scenario (ARM architecture), the initrd and kernel image are simply concatenated into a single composite image. The address and size of the intird is passed to kernel using the kernel command line. One note is that the initrd image is universally compressed.
When Linux boots, it detects the presence of initrd image. It copies the compressed binary file from the physical location in RAM to proper kernel ramdisk and mounts it as the root file system. It then looks for a special file in the initial ramdisk called linuxrc. This file contains directives (commands)required before mounting the real root file system. After the kernel copies the ramdisk from physcial address into the kernel ramdisk, it releases the physical memory. This is similar to transfer the initrd image from real memory to virtual memory of the kernel.

As part of the Linux boot process, the kernel must locate andmount a root file system. The kernel decides what and where to mount in a funciton called prepare_namespace() in /init/do_mount.c.

If initrd support is enabled in the kernel and the kernel command line is so configured, kernel decompresses the initrd image and copy the content to ramdisk under /dev/ram. At this point, we have a proper file system on a kernel ramdisk. The kernel effectively mounted the ramdisk device as its root file system. Finally, kernel spawn a kernel thread ti execute the linuxrc file. When linuxrc is done, kernel unmounts the initrd and proceed to the final stage of system boot. If the real, root device has a directory called /initrd, Linux will mount initrd file system on this path. Otherwise, the initrd will be discarded.

If the kernel command line contains a root= paramter specifying a ramdisk (e.g. root=/dev/ram0), the previous described behaviour will change. First, the processing of the linuexrc file is skipped. Second, no attempt is made to mount another file system as root. It means the initrd is the only root file system. This is useful if minimal system configuration is intended.

(2) initiramfs is the preferred mechanism for executing early user space programs. It is coneptually simliar to initrd and its purpose is also similar: to load drivers that might be required to mount the real file system.
Initramfs is much easier to use by developer as it is a cpio archive whereas initrd is a gzipped file system image.

init process in Linux

The File System Hierarchy Standard (FHS) establishes a minimum baseline of compatibility among Linux distributions and application programs.

AT the end of kernel initialization, it calls run_init_process() which refer to programs expected to residing in the root file system. If the programs cannot be found, the system will halt at the panic().

It is not sufficient to simply include an executable call init in the system and expect it to boot. You must also satisfy its dependency (unresolved reference and configuration data in external file).

The init process supplied with Linux is very flexible. It implements what is commonly called System V init, from UNIX System V using this schema.

Init spawn other processes under the direction of configuration stored in /etc/inittab. Init execute differently according to runlevel (0-6, S). Runlevel 0 halt the system. Runlevel 6 reboot the system. Associate with each runlevel is a set of startup and shutdown scripts (kept in /etc/rc.d/init.d

When init starts, it reads /etc/inittab. This file contains directives for each runlevel. The sysinit tag indicates the script to be run first. The initdefault tag indicate the default run level. Init then executes the scripts denoted by the runlevel tag l#.

Linux Boot Up

The bootloader takes control once power is applied to the computer. Bootloader is a set of routines designed to do low-level initialization, OS image loading and system diagnosis (dump, test etc). It loads and passes control to the OS.

Using XScale platform as an example, the Bootloader passes control to head.o module label Start, which is a bootstrap loder. The bootstrap loader appended to the kernel image has a primary purpose: to create an environment to decompress and relocate the kernel and pass control to it.

Control is passed to the kernel proper, to a module called head.o (a different module than before) and label Start. In other words, head.o is the kernel entry point. The head.o module performs architecture and often CPU-specific initialization in preparation for the main body of kernel. Still, CPU-specific initialization tasks are kept as generic as possible across the CPU family. Machine-specific tasks are performed elsewhere.

The head.o module checks for valid processor and architecture, creates initial page table entries, enables processor's MMU (Memory Management Unit), establishes limited error detection and reporting.

When control is first passed to head.o, the processor is in real mode (the program counter represents physical address). After the processor's registers and kernel data structures are initialized to enable address translation, the processor's MMU is turned on. Suddenly the address space as seen by the processor is yanked from beneath it and replaced by an arbitrary virtual addressing scheme. This is why debugger cannot single step through this portion of code.

The final action of head.o is to jump to the kernel proper's own main.c, a startup file written in c.

The first line of output from the kernel is the kernel version string. Upon entering start_kernel call (in main.c), printk() displays Linux_banner which contains information such as version, username and machine which the kernel is compiled, build and version number etc.

start_kernel is by far the biggest function in main.c. Most of the kernel initialization takes place in this routine.

One of the first few things start_kernel called is setup_arch(&kernel_command_line). This module calls setup_processor() which verifies the CPU ID and revision, calls CPU-specific initialization functions, and display CPU information in the console. Finally, setup_arch calls the machine-dependent initialization routine.

Next to architecture set up, main.c performs generic early kernel initialization and then displays the kernel command line which is a list of parameters.

After start_kernel() and calling some early initialization functions explicitly by name (e.g. init_timers, console_init), the first kernel thread, called init (PID=1), is spawned. At this point of boot sequence, 2 distinct threads are running: that represented by start_kernel(), and now init(). The former becomes the idle process after completing its work. The latter becomes the init process.

start_kernel() calls rest_init() and this allows start_kernel() to return most of its memory to the address space. The kernel's init process is then spawned in rest_init() by the call to kernel_thread(). init continues with the rest of the initialization while start_kernel() executes the function cpu_idle() that loops forever.

After further initialization, the final step of boot up is to try to run a series of run_init_process call.

run_init_process("/sbin/init");
run_init_process("/etc/init");
run_init_process("/bin/init");
run_init_process("/bin/sh");

One way of another, these run_init_rpocess() commands must proceed without error, or the system will fall down to a panic() call to halt the system.

The run_init_process() function does not return on successful invocation because it call execve() system call which replaces the process executable with a new one.

Friday, August 5, 2011

Linux File System

Linux file system is stored in partition, which is a logical disk. There are 3 type of partitions in Linux - Linux, FAT32 and Linux Swap.

ext2 (Second Extended File System) is a conventional FS using inode etc. ext3 built on ext2 is a journal file system. Change to the file system (e.g. date, size, blocks used etc) are stored in a special file so recovery is possible from known journal point. It eliminate the weakness of long check time for ext2 after abnormal shutdown. ext4 is also a journal file system with larger fs and file size support.

ReiserFS is also a journal fs like ext3. Reiser4 introduced a API for system programmer to guarantee the atomicity of file system transaction. A jfs guarantee the metadata changes have been stored in the journal file so that the kernal can at least establish a consistent state of the file system. That is, if file A was reported to have 16KB before the system failure, it will be reported as having 16KB afterward in the inode. This does not mean that the file data was properly written to the file.

Linux Device Node

A device node is a special file type represents a device. It is kept under /dev. mknod create a device node. For example,

mknod /dev/xyz1 c 234 0

This create a file xyz1 under /dev. It is a character device (crw-r--r--) with a mojor number of 234 and minor number of 0.

By itself, the device node is just another file in /dev. But we can use it to bind to an installed device driver. When an open() call is excuted with this device node as the path parameter, kernel search for device driver that registered with a major number of 234. This is how kernel associate the driver with our device node.

Minor number is a mechanism for handling multiple devices or subdevices with a single device driver. It is not used by the OS but simply passed over to the driver. What to do with the minor number is entirely up to the driver. For example, the minor number can denote one of the ports on the card managed by the driver.

Device node is usually created by udev rather than manually using the command.

Record