Wednesday, December 31, 2008

Single Instruction Multiple Data (SIMD)

The first SIMD extension in Intel CPUs appeared as MMX (Multimedia Extension). MMX works on 64-bit registers. The 64 bits are treated as packed integers: eight 8-bit, four 16-bit, or two 32-bit values. Instead of creating new SIMD registers, Intel decided to reuse the floating point registers for MMX calculation. The floating point registers R0 to R7 are aliased as MM0 to MM7 for SIMD. The new data types created for MMX are all integer-based.
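To make "packed" concrete, here is a minimal sketch in plain Python (not real MMX instructions). The helper name packed_add is hypothetical; it models a 64-bit register as an integer and adds two such registers lane by lane, so that an overflow in one lane wraps around without spilling into its neighbour.

```python
def packed_add(a: int, b: int, lane_bits: int = 32, total_bits: int = 64) -> int:
    """Add two packed registers lane by lane, wrapping each lane on overflow."""
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        # Masking the sum discards the carry, so it never leaks into the next lane.
        result |= (lane_sum & mask) << shift
    return result


if __name__ == "__main__":
    # Two 32-bit lanes: the low lane overflows and wraps, the high lane is unaffected.
    print(hex(packed_add(0x00000001FFFFFFFF, 0x0000000100000001)))
```

Passing lane_bits=16 treats the same 64 bits as four 16-bit lanes instead.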

AMD followed with 3DNow! technology, which used the floating point registers for SIMD calculation just as Intel's MMX did. The difference is that AMD treated the data packed in the MM0 to MM7 registers as floats. It was followed by Enhanced 3DNow! in the Athlon CPU, in which both integer and floating point data types are supported.

In its Pentium III CPU, Intel provided a set of 8 new 128-bit registers. Each is large enough to store 4 32-bit floating point values, which is useful in 3D calculation, where 4x4 matrices appear frequently. These new registers are called XMM0 to XMM7. Intel called the technology Streaming SIMD Extensions (SSE). Intel upgraded to SSE2 when introducing the Pentium 4. With SSE2, integer values are supported as well. AMD followed suit with 3DNow! Professional, which is fully compatible with SSE.

Saturday, December 13, 2008

LUN Masking

LUN masking performs a function similar to zoning. A LUN is a logical representation of a physical unit of storage, such as a physical disk, a physical tape drive, a robotic control device, or a logical disk (formed from multiple physical disks with striping or RAID). LUN masking hides LUNs so that each server sees only the LUNs it is allowed to access. LUN masking is performed at a level just above zoning. A zone grants or restricts access only to a given port on a storage array, but LUN masking can take that port and grant one server access to only some of its LUNs. LUN masking can be performed at 3 levels: the storage array, intelligent bridges and routers, and the HBA driver.
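The idea can be sketched as a simple lookup. This is only an illustration: the table, the WWN strings and the function name visible_luns are all made up, and a real array or HBA driver would hold this configuration in its own format.

```python
# Hypothetical masking table: server HBA WWN -> LUNs that server is allowed to see.
MASKING_TABLE = {
    "10:00:00:00:c9:aa:aa:01": {0, 1},
    "10:00:00:00:c9:bb:bb:02": {2},
}


def visible_luns(initiator_wwn, luns_behind_port, masking_table=MASKING_TABLE):
    """Return the LUNs behind one storage port that this initiator may see."""
    allowed = masking_table.get(initiator_wwn, set())
    # Zoning already decided that the server can reach the port;
    # masking narrows that down to individual LUNs.
    return sorted(lun for lun in luns_behind_port if lun in allowed)


if __name__ == "__main__":
    print(visible_luns("10:00:00:00:c9:aa:aa:01", [0, 1, 2, 3]))
```

A server whose WWN is not in the table sees no LUNs at all, even though zoning lets it reach the port.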

SAN BAU

First, create virtual disk and tape devices for sharing. Second, create zones or use LUN masking to create data paths between the resources and the servers that access them. Third, use persistent binding to create a permanent relationship between each virtual resource and a SCSI ID on the server. Finally, use multipathing software to manage the multiple paths between each server and its resources.

Persistent Binding

The way that some operating systems assign SCSI addresses to SAN devices may cause a device's SCSI address to change after a reboot, especially if the storage resource configuration has changed. The solution to this problem is persistent binding between a SCSI ID and a WWN. This relationship is part of the HBA driver configuration.

Zoning

A zone is the SAN equivalent of a VLAN. Zoning limits the number of hosts that can see a given storage resource. Hard zoning is defined using a list of physical ports. Soft zoning is defined using a list of WWNs.

Multiple Paths to a Single Device

In parallel SCSI, each device appears on only one SCSI bus. In FC this is not true: with multiple HBAs in the same host, there can be more than 1 path to the same device. In this respect a SAN switch is closer to an Ethernet router than to an Ethernet switch. An Ethernet switch uses spanning tree to detect loops and disable redundant paths. An Ethernet router uses a routing protocol such as OSPF to handle multiple paths. FC uses FSPF (Fabric Shortest Path First) to load balance SAN traffic and to fail over.

Because each device on the SAN appears as a SCSI ID on each HBA connected to the same SAN, a system with multiple HBAs will actually think that the device seen on each path is a separate device. A logical layer needs to be inserted to mask these paths and present the OS with the appearance of a single SCSI connection. This is achieved by multipathing software running on each server.
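A rough sketch of what that logical layer does: it recognises that several (HBA, SCSI ID) pairs lead to the same device WWN, groups them, and picks a working path. The function names and the tuple layout are assumptions for illustration and are not taken from any real multipathing product.

```python
from collections import defaultdict


def group_paths(discovered):
    """Group (hba, scsi_id, device_wwn) sightings by the device they lead to."""
    devices = defaultdict(list)
    for hba, scsi_id, wwn in discovered:
        devices[wwn].append((hba, scsi_id))
    return dict(devices)


def pick_path(paths, failed_hbas=()):
    """Return the first path whose HBA has not failed."""
    for hba, scsi_id in paths:
        if hba not in failed_hbas:
            return (hba, scsi_id)
    raise RuntimeError("no usable path to the device")


if __name__ == "__main__":
    seen = [("hba0", 3, "wwn-a"), ("hba1", 5, "wwn-a"), ("hba0", 4, "wwn-b")]
    devices = group_paths(seen)
    print(len(devices), "devices;", pick_path(devices["wwn-a"], failed_hbas={"hba0"}))
```

The OS is then shown one entry per WWN rather than one entry per path.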

Friday, December 12, 2008

Fiber Types

Multimode fiber can carry multiple light rays (known as modes) at the same time by transmitting each mode at a slightly different reflection angle within the fiber core. Because of dispersion over longer lengths, multimode fiber is used for shorter distances (< 500m).

Multimode fiber has 2 core diameter choices. The larger (62.5-micron) core can be used up to a length of 175m. The newer 50-micron core can extend to 500m. Both diameters are supported so that older fiber optic cable, installed for an FDDI network for example, can be reused.

Single-mode fiber has a much thinner core for transmission of 1 mode up to 10 km.

Open Fiber Control (OFC)

There are 2 types of lasers used in HBAs: OFC and non-OFC. For safety, OFC devices use a handshaking method to ensure they do not transmit a laser pulse when nothing is connected to the HBA. A non-OFC device will transmit even if nothing is connected to it.

FCAL Address Selection

When a loop initializes, when a node powers up, or when a node joins the loop, the node sends a loop initialization primitive (LIP) frame. This causes every other node in the loop to send a LIP frame as well. The loop is unusable until initialization completes.

Each node then transmits loop initialization select master (LISM) frames to select a master. If the loop is also connected to a fabric (via an FL port), the fabric port becomes the loop master. If not, the port with the lowest port name is chosen as loop master.

Next, every node must select an AL_PA. The loop master sends a loop initialization soft assigned (LISA) frame around the loop. Each L port on the loop selects a free AL_PA in the frame and marks that AL_PA as used. It then forwards the frame to the next L port in the loop. In case of reinitialization, a node will attempt to select its previous AL_PA whenever possible.

Once the LISA frame returns to the master, the master sends the CLS (close) frame to notify all nodes that the initialization process has completed.
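The address selection pass can be modelled as a frame that carries a list of free addresses past each port in loop order. This is a simplified sketch: the function name assign_al_pas is hypothetical, the port names are invented, and small integers stand in for real AL_PA values.

```python
def assign_al_pas(ports_in_loop_order, free_al_pas, previous=None):
    """Simulate one pass of the address frame around the loop.

    previous maps a port to the AL_PA it held before reinitialization.
    """
    previous = previous or {}
    free = sorted(free_al_pas)
    assigned = {}
    for port in ports_in_loop_order:
        wanted = previous.get(port)
        # Reuse the old address when it is still free, otherwise take the lowest free one.
        choice = wanted if wanted in free else free[0]
        free.remove(choice)          # mark the AL_PA as used before forwarding the frame
        assigned[port] = choice
    return assigned


if __name__ == "__main__":
    print(assign_al_pas(["disk1", "disk2", "host"], [1, 2, 3], previous={"disk2": 3}))
```

The sketch raises IndexError when more ports join than there are free addresses; what a real loop does in that case is outside its scope.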

FCAL Arbitration Process

A node that wishes to transmit starts by sending an ARB (arbitrate) frame. If no other ports are arbitrating, the ARB frame travels around the loop and is received by the port that transmitted it; the node wins arbitration and begins to send its data.

When more than 1 node is arbitrating simultaneously, the ARB frame with the lowest AL_PA wins the arbitration.

A node that has won arbitration can transmit until it finishes. This differs from token ring, in which transmission is limited to a fixed time period. To allow nodes with lower priority to transmit, a port that wins arbitration sets its access variable to 0, which inhibits further arbitration requests from it. This is called the fairness algorithm.

Once all talkers have had a chance to transmit, the last one places an IDLE frame on the loop. This causes every node to set its access variable back to 1, and the process starts over again.
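The fairness rule can be sketched as a small simulation. It assumes every requesting node keeps arbitrating until it has transmitted once; the function name fairness_window and the AL_PA values are illustrative only.

```python
def fairness_window(requesting_al_pas):
    """Return the order in which nodes transmit during one fairness window."""
    access = {al_pa: 1 for al_pa in requesting_al_pas}   # 1 = allowed to arbitrate
    order = []
    while any(access.values()):
        # Among nodes still allowed to arbitrate, the lowest AL_PA wins.
        winner = min(al_pa for al_pa, flag in access.items() if flag == 1)
        order.append(winner)
        access[winner] = 0        # the winner sits out for the rest of the window
    # At this point the last talker would send IDLE and every access variable returns to 1.
    return order


if __name__ == "__main__":
    print([hex(a) for a in fairness_window([0x23, 0x01, 0xE8])])
```

Without the access variable, the node with the lowest AL_PA could win every round and starve the others.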

Difference between FCAL and Fabric

The first difference is that FC-AL is a shared medium. Nodes that wish to transmit must arbitrate for the right to do so. The second difference is that the nodes on a loop select their own addresses rather than having them assigned by a switch, as in fabric mode.

Arbitrated Loop

FC-AL was actually an add-on to the original FC specification, which consisted of the point-to-point and fabric topologies. It was introduced because the cost of a fabric is quite high and point-to-point limits the number of devices that can be connected. FC-AL can be constructed by using NL Ports (host or storage devices) connected by fiber optic cables with SC connectors that can be split. It is an inexpensive way to connect several disks to a host. One disadvantage of this configuration is that it is only practical over short distances because of logistical problems. Another is that one bad HBA or cable can take out the entire loop.

A better configuration is a star layout built around a hub. The connectors do not need to be split. The hub makes sure the transmit and receive ends are matched up. A managed hub can also prune a node from the loop in the event of a failure.

Fabric Topology in FC

Each N port plugs into one F port on the switch. Each node is assigned its S_ID by the switch when it logs on to the fabric. The 24-bit address allows approximately 16 million unique addresses within a single fabric.

Point-to-point FC

It is the simplest and least expensive topology. It is simply 2 N ports communicating via a point-to-point connection. Although an N port connected to an F port is also a point-to-point link, it is considered part of a larger network.

FC Addressing

A World Wide Name (WWN) is a fixed, unique, 64-bit address assigned to each port by its manufacturer. There are 2 dynamic addresses that may be assigned when the port connects to an FC network. If it connects to an arbitrated loop, it is assigned a dynamic 8-bit address, referred to as its arbitrated loop physical address (AL_PA). If it connects to a fabric, it is assigned a dynamic 24-bit address, referred to as its native address identifier (S_ID). When a port is connected to both an arbitrated loop and a fabric, it is assigned a 24-bit address, with the lower 8 bits as the AL_PA.
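The address sizes above can be illustrated with a few lines of bit arithmetic. The helper names and the sample values are made up; the colon-separated hex form is simply the common way of writing a WWN.

```python
def format_wwn(wwn: int) -> str:
    """Render a 64-bit WWN as eight colon-separated hex bytes."""
    return ":".join(f"{byte:02x}" for byte in wwn.to_bytes(8, "big"))


def al_pa_of(address_24: int) -> int:
    """Lower 8 bits of a 24-bit address: the AL_PA when the port sits on a loop."""
    return address_24 & 0xFF


def fabric_part_of(address_24: int) -> int:
    """Upper 16 bits of a 24-bit address, supplied by the fabric."""
    return (address_24 >> 8) & 0xFFFF


if __name__ == "__main__":
    print(format_wwn(0x10000000C9AABB01))
    print(hex(fabric_part_of(0x0104E8)), hex(al_pa_of(0x0104E8)))
    print(2 ** 24)   # size of the 24-bit address space, roughly 16 million
```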

Other FC Port Types

L Port is a generic term for a port that can participate in an arbitrated loop. Exclusive L Ports do not exist.

An NL Port is a node port with arbitrated loop capabilities. An NL Port can connect to another node, to a switch or to an arbitrated loop.

An FL Port is a fabric port with arbitrated loop capabilities. This switch port can connect to either a node or an arbitrated loop.

A G Port is a generic port on a switch which can act as an E Port, an F Port or an FL Port depending on what connects to it.

Fibre Channel Basic Ports

N Port - a node port, which corresponds to a port on a disk or computer. An N Port can communicate with another N Port or with an F Port on a switch.

F Port - a fabric port, which is found only on a switch. An F Port can only communicate with an N Port.

E Port - an expansion port on a switch that connects to another switch.

Media for FC

FC can run over both fiber and copper. Copper cables are less expensive and suitable for shorter networks. The most common copper cable is twisted pair with a DB-9 connector, with cable lengths up to 30m. Optical cables can run up to 175m, 500m, or 10km or longer, depending on the fiber type.

Five Layers of FC

FC-4 (Mapping) defines how an FC network communicates with upper level protocols (such as SCSI and IP). Each upper-level protocol (ULP) that is transportable over FC has a mapping in FC-4.

FC-3 (Common Services) is used by applications requiring more than one port, such as striping.

FC-2 (Framing) is similar to the MAC layer in OSI. It defines how data from upper level applications is split into frames for transport over the lower layers.

FC-1 and FC-0 are similar to physical layer in OSI. FC-1 (Ordered Set) defines how frames are encoded and decoded for transport across those media types. FC-0 (Physical) defines the various media types that can carry FC data.

Fiber or Fibre?

Originally, Fibre Channel was intended to run on fiber optic cables. Confusion arose when copper wire was supported as well. The standards committee decided to use the word fibre, the British spelling of fiber, in the name Fibre Channel.

Sunday, November 9, 2008

Tomcat

Tomcat comprises 2 components:
- Catalina is the servlet container
- Jasper parses JSP pages and translates them into Java servlets, which are then compiled into Java classes to be managed by Catalina

The deployment descriptor tells the container details about the servlet. It also provides functions such as hiding the internal directory structure by specifying an alternate mapping. The mapping also allows the internal directory layout to change without affecting the external access path.

HTTP Post

Many people erroneously view POST as a more sophisticated form of GET. The HTTP standard views POST as a way to request the creation of an entity on the server. When POST is used, the server can respond in 2 ways:

HTTP 200 or 204 = respond with an acknowledgement and provide no other data
HTTP 201 = indicate that the entity has been created and provide more information about the creation

The latter case makes POST appear like a GET.

Static or dynamic GET responses can be cached with the use of HTTP cache control headers such as If-Modified-Since. HTTP 1.1 uses either a date or an entity tag (ETag) to validate content from the cache. POST is considered a mutating operation on the server. The HTTP methods PUT, DELETE and POST must cause a cache to invalidate its entry. Thus POST is less efficient than GET.
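The caching rules above can be sketched with a toy in-memory cache. This is not how any particular proxy or browser is implemented; the class TinyCache and the function validate are hypothetical names, and only the two behaviours described in the text are modelled: validation by entity tag and invalidation on a mutating method.

```python
class TinyCache:
    """Toy cache keyed by URL, holding (etag, body) pairs."""

    def __init__(self):
        self.entries = {}

    def store(self, url, etag, body):
        self.entries[url] = (etag, body)

    def on_request(self, method, url):
        """Return the cached (etag, body) for a GET, or None when nothing can be served."""
        if method in ("POST", "PUT", "DELETE"):
            self.entries.pop(url, None)   # a mutating method invalidates the entry
            return None
        if method == "GET":
            return self.entries.get(url)
        return None


def validate(cached_etag, current_etag):
    """Status a server would send when the client presents its cached entity tag."""
    return 304 if cached_etag == current_etag else 200


if __name__ == "__main__":
    cache = TinyCache()
    cache.store("/report", "v1", b"data")
    print(cache.on_request("GET", "/report"), validate("v1", "v1"))
    cache.on_request("POST", "/report")
    print(cache.on_request("GET", "/report"))
```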

J2EE

The J2EE solution for serving application logic is Enterprise JavaBeans (EJBs). EJBs can be contacted directly by servlets, by applet containers or by JMS. In contrast, servlets are primarily meant for HTML-based, thin-client session management and for delivering requests and responses to web users. By offloading session and interaction handling to servlets, EJBs can focus on business logic.

EJB has built-in support for many low-level technologies that enhance the scalability of server-side logic. Compared with client-side logic (fat client), scalability is a challenge for server-side logic. The technologies include object persistence, transaction management and location transparency.

Under the J2EE model, EJBs are distributed objects managed by containers. The container provides surrogates (EJBObjects) that interact with individual bean instances on behalf of the client. The container manages the lifecycle of its bean instances (creation and destruction). A client communicates with an EJBObject, which acts as a middleman between the client and the bean. Its assignment to a bean instance is coordinated by the container. Client interaction with EJBs consists of the following steps:

1. A handle to an EJBObject is acquired by the client.
2. Business methods of that object are called by the client as needed.
3. After use, the client relinquishes the handle to the EJBObject.

The home interface is used for steps 1 and 3. A local or remote interface is used for step 2.

The home interface provides factory-like services to create, destroy and find the requested EJB. The remote or local interface provides a clean API to the application logic (business methods) encapsulated by the bean. The local interface is meant to be accessed by clients located on the same host as the bean (in practice, in the same JVM).

A bean provides callback methods so that the container can manage its lifecycle:

Creation: When a client makes a request but no instance is available, the container must instantiate a new one.

Destruction: Bean instances can be periodically garbage-collected.

Activation: A lightweight form of creation. Bean instances are pooled for performance reasons. The container draws an instance from the pool to fulfil a request.

Passivation: A lightweight form of destruction. A bean instance is returned to the pool.

There are 3 basic types of EJB:

Session beans are associated with a specific business transaction, particularly one requested during an interactive session.

Message-driven beans are also associated with a specific business action, particularly one that is necessary for application integration or batch processing.

Entity beans are associated with an application object that requires persistent storage.

Session beans are either stateful or stateless. Stateful session beans maintain state during communication with a client. They retain the values of their instance variables between client requests, so it is important for the client to keep interacting with the same bean. Theoretically, there should be as many stateful session beans as there are concurrent sessions. According to the J2EE spec, stateful session beans may be periodically written to persistent storage.

Stateless session beans do not maintain state between requests and therefore can be used to process requests from any client. They are more scalable. However, a session often requires state management. The state must then be stored somewhere else: in cookies, through URL rewriting, in server-side memory or in a database.

Entity beans correspond to application objects that are meant to be persistent, that is, objects that hold potentially valuable information across sessions. It may be useful to think of entity beans as tables or relations in a relational database. They contain attributes, plus foreign and primary keys to enforce entity integrity. Entity beans can be shared by multiple clients and can relate to each other.

An entity bean's persistence can be either container-managed (CMP) or bean-managed (BMP). BMP exists because a J2EE implementation cannot know everything. The main advantages of CMP are simplicity and portability. You do not need to code SQL. You just need to provide a few methods that meet the requirements of your bean contract and specify some key information in the deployment descriptor. J2EE does the rest. As a result, your beans consist of much less Java code.

CMP methods and mechanisms for persistence vary between vendors. This means that CMP could be implemented by writing serializable Java objects to the filesystem or by tight integration with a high-performance database.

Since entity beans have relationships with other entity beans, there is a considerable likelihood that they will be chatty in nature. Just as navigating a relational data model can result in many queries to the database, navigating through entity EJB objects can result in substantial cross-object communication, with its marshalling and network overhead.

EJB 2.0 addresses this by offering a local model. For entity beans that are only called by other entity beans and never directly by a client, maintaining a remote interface is a waste. EJB 2.0 lets you develop the bean with a local interface, making it a local object.

The advantages are (1) more efficient argument passing: arguments are passed by reference rather than marshalled by value, and (2) more flexibility in terms of security enforcement.

Another CMP feature in EJB 2.0 is container-managed relationships, the equivalent of referential integrity (RI) in a database. Finally, CMP includes dependent objects, which can be thought of as extensions to an entity bean. Dependent objects allow complex CMP fields to be represented in a separate class. This is a way of breaking up a more complex object into distinct parts.

BMP results in more time-consuming and complex development. It also implies maintenance overhead when the code or the data model changes. For performance, BMP may be more desirable, but it must be done right.

Message-driven beans are unique in their method of invocation. Clients do not interact with them by calling a remote Java method. Instead, clients send messages via JMS, which results in the execution of the bean's onMessage() method.

Interactive web page experience

The advantage of a browser-based client is ease of deployment. However, the user is always held back by the synchronous request-response underpinnings of the Web: the latency of a complete page refresh.

Microsoft introduced the concept of remote scripting (MSRS) to overcome this limitation. It allows the developer to interact with the server asynchronously. For example, when the user opens a drop-down list, a script can run on the server to supply the values for the list without a complete page refresh. MSRS works with Microsoft technology only and requires Java.

Brent Ashley developed JSRS (JavaScript Remote Scripting), which uses a client-side JavaScript library and DHTML to make asynchronous calls to the server. Other people use an IFRAME to reload only part of the page or to make hidden calls to the server.

AJAX is another solution. It is not new: it is the “newest” name for techniques built on the XMLHttpRequest object (XHR), which has been around since IE5 (1999) as an ActiveX control. XHR has since been implemented in other browsers such as Mozilla and Safari. Similar functionality is even covered by a W3C standard: the DOM Level 3 Load and Save Specification. AJAX is a client-side approach and can interact with J2EE, .Net, PHP, Ruby and CGI scripts. In other words, it is server-agnostic.

Load and Save is the culmination of an effort that began in 1997 as a way to solve the incompatibilities between browsers. DOM Level 1 was finished in 1998, covering HTML 4.0 and XML 1.0. DOM Level 2 was completed in 2000, adding CSS support. Load and Save gives Web developers a common, platform-independent API to access and modify the DOM.

The DOM is based on a concept from the OMG. It defines the data and structure of a page and gives you a standard way to interact with your documents. By modifying the structure of a web page, you dynamically change the display, which gives the user a rich, interactive client environment. For example, when the user clicks the search button, your page makes an asynchronous call via XHR to do the search. After the server returns the search result, you use DOM calls to modify the web page to display the result (for example, in the form of a table). The browser adjusts the display and thus gives the user an interactive experience.

Saturday, November 1, 2008

What is Marshalling

In RPC, an IDL file is similar to a header file in C. The IDL compiler generates a client stub and a server skeleton, pieces of C code that are compiled and link-edited into the host programs. The stub converts parameters into a string of bits and sends the message over the network. The skeleton does the reverse. The process of converting parameters into a message is called marshalling. The advantage of marshalling is that it handles differing data formats between the client and the server (e.g. 32-bit on the client and 64-bit on the server). Serialization takes an object and converts it into a message to be stored on disk or sent over the network.
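As a minimal sketch of what a stub and skeleton do, the following uses Python's struct module rather than generated C code. The wire format (a 4-byte integer followed by an 8-byte double, in network byte order) and the parameter names are assumptions chosen for the example.

```python
import struct

# "!" selects network (big-endian) byte order with fixed sizes, so the message
# looks the same regardless of the word size or endianness of either machine.
WIRE_FORMAT = "!id"   # 4-byte signed int, 8-byte double


def marshal(account_id: int, amount: float) -> bytes:
    """Client stub side: turn the call parameters into a byte string."""
    return struct.pack(WIRE_FORMAT, account_id, amount)


def unmarshal(message: bytes):
    """Server skeleton side: recover the parameters from the byte string."""
    return struct.unpack(WIRE_FORMAT, message)


if __name__ == "__main__":
    wire = marshal(7, 12.5)
    print(len(wire), unmarshal(wire))
```

Fixing the sizes and byte order in the wire format is what lets a 32-bit client and a 64-bit server agree on the message.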

Saturday, October 25, 2008

What is asynchronous logic?

In an asynchronous logic circuit, the output value depends only on the input values and the combinational logic that links them. There is no clock, and no delay in the process except the propagation delay through the logic gates.

However, to build a state machine (e.g. a computer), the change of state must be synchronized with some master signal (a clock).

Color

Hue is the name of the color. While there are many colors, there are far fewer hues. Pink and crimson are colors, but the base hue of both is red.

Saturation (chroma) is the strength or purity of the color. It is determined by the amount of white in the color: the more white, the lower the saturation. Brightness (value) is the lightness or darkness of the color. It is determined by the amount of black in the color.

Combinations of hue, saturation and brightness create the infinite number of colors that we see in the world.
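The three quantities map directly onto the HSV color model. As a small sketch, Python's standard colorsys module can convert them to RGB; the helper name shade and the sample values are chosen for illustration only.

```python
import colorsys


def shade(hue_degrees: float, saturation: float, brightness: float):
    """Convert hue (0-360), saturation (0-1) and brightness (0-1) to 8-bit RGB."""
    r, g, b = colorsys.hsv_to_rgb(hue_degrees / 360.0, saturation, brightness)
    return tuple(round(channel * 255) for channel in (r, g, b))


if __name__ == "__main__":
    print(shade(0, 1.0, 1.0))   # pure red hue, fully saturated, full brightness
    print(shade(0, 0.2, 1.0))   # same hue with a lot of white mixed in
    print(shade(0, 1.0, 0.6))   # same hue with black mixed in
```

All three calls share the red hue; only the amount of white and black changes.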

Sunday, October 5, 2008

What is a pipe?

A pipe is a communication buffer accessed through 2 file descriptors. One descriptor is used to write to the pipe and the other to read from it. A pipe has no external name, so the only way to access it is via the file descriptors. A pipe is also temporary: it is deallocated when no process has it open.

A pipe is typically used by a parent process to communicate with its child. This is achieved by creating the pipe before the fork call. The file descriptors of the pipe are then inherited by the child process. Reads and writes on the pipe are not guaranteed to be atomic. In other words, a read may return partial data if it races with the write call.
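A minimal sketch of the parent and child arrangement, using Python's os module, which wraps the same pipe, fork, read and write calls on a UNIX system. The function name ask_child is hypothetical. Because a single read may return only part of the data, the parent keeps reading until end of file.

```python
import os


def ask_child(message: bytes) -> bytes:
    """Fork a child that writes an upper-cased copy of message into a pipe."""
    read_fd, write_fd = os.pipe()      # create the pipe before fork so both sides inherit it
    pid = os.fork()
    if pid == 0:                       # child: the writer
        os.close(read_fd)
        os.write(write_fd, message.upper())
        os.close(write_fd)
        os._exit(0)
    os.close(write_fd)                 # parent: the reader; close the unused write end
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:                  # empty read: every write end has been closed
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)                 # reap the child
    return b"".join(chunks)


if __name__ == "__main__":
    print(ask_child(b"hello pipe"))
```

Closing the unused write end in the parent matters: otherwise the read loop would never see end of file.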

The pipe "|" used in the shell (e.g. ls -al | sort -n -k 5) can be implemented using a combination of pipe and I/O redirection. The pipe-write file descriptor replaces stdout for the first process (ls), and the pipe-read file descriptor replaces stdin for the second process (sort).

Saturday, October 4, 2008

How is I/O redirection realized?

The command "cat > test.txt" will copy terminal input into test.txt instead of echoing it to the terminal.

I/O redirection is implemented using the dup2 function. dup2 copies one file descriptor table entry to another. To realize the redirection in the cat command above, first open test.txt, which creates a new file descriptor in the table. Then use dup2 to copy this new entry to file descriptor 1 (stdout). After the call, stdout points to test.txt. When cat writes to stdout, the data is written to test.txt.
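The same sequence can be sketched with Python's os module, which exposes open, dup and dup2 directly. The function name write_redirected is hypothetical; unlike a shell, the sketch also saves and restores the original stdout so that the calling program keeps working afterwards.

```python
import os


def write_redirected(path: str, text: str) -> None:
    """Write text to descriptor 1 while descriptor 1 is redirected to path."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    saved_stdout = os.dup(1)           # keep a copy so stdout can be restored
    try:
        os.dup2(fd, 1)                 # descriptor 1 (stdout) now refers to the file
        os.write(1, text.encode())     # a write to "stdout" lands in the file
    finally:
        os.dup2(saved_stdout, 1)       # put the original stdout back
        os.close(saved_stdout)
        os.close(fd)


if __name__ == "__main__":
    write_redirected("test.txt", "typed at the terminal\n")
    with open("test.txt") as f:
        print(f.read(), end="")
```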

What is a file pointer?

The ISO C standard I/O functions use a file pointer instead of a file descriptor.

A file pointer is the address of a FILE structure, which in turn contains the file descriptor number. The FILE structure also contains the buffer used in buffered I/O. Data is written out from the buffer to the destination device when the buffer fills up (for disk I/O) or when a newline character is encountered (for terminal I/O). The I/O subsystem performs the write using the file descriptor.

The fflush call forces the buffer to be written out immediately.
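Python's file objects are not C FILE structures, but they buffer in user space in a comparable way, so they can serve as a rough analogy. In this sketch (the function name is made up) the file size on disk is checked before and after an explicit flush.

```python
import os


def size_before_and_after_flush(path: str):
    """Return the on-disk size of path before and after flushing a small write."""
    f = open(path, "w")                # buffered by default, comparable to a C FILE *
    try:
        f.write("hello")               # small write: stays in the user-space buffer
        before = os.path.getsize(path)
        f.flush()                      # analogous to fflush: push the buffer to the descriptor
        after = os.path.getsize(path)
    finally:
        f.close()
    return before, after


if __name__ == "__main__":
    print(size_before_and_after_flush("buffer_demo.txt"))
    os.remove("buffer_demo.txt")
```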

What is a file descriptor?

A file descriptor is an integer that serves as an index into the file descriptor table. The file descriptor table is part of the per-process user area. Each process has its own file descriptor table. A program cannot access the table directly; it can only go through functions that take a file descriptor.

Each file descriptor table entry points to an entry in the system file table. The system file table exists in kernel space and is shared by all processes. Each entry contains the current offset, the access mode and the number of file descriptors pointing to it. More than one file descriptor may point to the same system file table entry. Each system file table entry points to an entry in the in-memory inode table, which represents the physical file. Similarly, more than one system file table entry may point to the same in-memory inode table entry.
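The sharing of the current offset can be observed from user code. In this sketch (the function name is hypothetical) a descriptor created with dup shares the system file table entry of the original, while a second open of the same file gets its own entry and therefore its own offset.

```python
import os


def offsets_after_read(path: str):
    """Read 3 bytes through one descriptor, then report the offsets of two others."""
    fd_a = os.open(path, os.O_RDONLY)
    fd_dup = os.dup(fd_a)                  # same system file table entry as fd_a
    fd_b = os.open(path, os.O_RDONLY)      # separate entry, same inode
    try:
        os.read(fd_a, 3)                   # advances the offset stored in fd_a's entry
        return (os.lseek(fd_dup, 0, os.SEEK_CUR),
                os.lseek(fd_b, 0, os.SEEK_CUR))
    finally:
        for fd in (fd_a, fd_dup, fd_b):
            os.close(fd)


if __name__ == "__main__":
    with open("offset_demo.txt", "w") as f:
        f.write("abcdef")
    print(offsets_after_read("offset_demo.txt"))
    os.remove("offset_demo.txt")
```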

Friday, October 3, 2008

What is the benefit of representing device as file?

Devices are represented as special files in UNIX under /dev. Representing devices as files simplifies the programming model, so that all devices can be controlled consistently through 5 system calls: open, close, read, write and ioctl.

Block devices are those that have characteristics similar to a disk. They typically allow random access. Data is accessed in units of blocks.

On the other hand, character devices have characteristics similar to a terminal. Access is sequential and data is represented as a stream of bytes.

Difference between stdout and stderr

stdout and stderr both default to the console. This means that both error messages and normal output are shown on the same device. Why do we differentiate stdout and stderr in the first place?

Output to stdout is buffered. In other words, output may take some time before it appears on the destination device (typically the console). In contrast, stderr is unbuffered. This means output appears on the destination device immediately after the write call. That is why error messages should be written to stderr instead of stdout.

Why are there zombie processes?

The fork call creates a child process which runs independently of its parent. UNIX allows the parent process to synchronize with the child process via the wait call. As the system cannot predict whether the parent will call wait for the child, UNIX does not release all of a process's resources when it terminates, in case its parent calls wait later on. The termination status (normal or abnormal) of the child process and its resource usage statistics are kept for the wait call.

A terminated process which has not been waited for becomes a zombie process. If the parent terminates without calling wait for its child, the zombie child becomes an orphan. Orphan processes are adopted by the init process (PID=1). init calls wait periodically, and this is the mechanism UNIX uses to clean up zombie processes.
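A minimal sketch of the parent's side, using Python's os module on a UNIX system. The function name reap_child is hypothetical. Between the child's exit and the waitpid call the child is a zombie; waitpid collects the saved termination status and lets the system release the remaining resources.

```python
import os


def reap_child(exit_code: int) -> int:
    """Fork a child that exits at once, then collect its termination status."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)            # child: terminate without running parent cleanup
    # The terminated child stays a zombie until the parent waits for it.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)      # meaningful here because the child exited normally


if __name__ == "__main__":
    print(reap_child(7))
```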