Sunday, October 24, 2010

Remote Database Access

There are two programmatic interfaces. One approach is to use dynamic SQL, in which the SQL text is passed from the client to the server. The other approach is to hide the remote database access underneath the normal database interface: the database schema indicates that certain tables reside on a remote machine, and the programmer uses the database in the normal way, just as if the tables were local (except for performance and the additional error conditions to handle).

Remote database access generates large network overhead because of the sheer amount of data passed back from the server to the client, which is not good for transaction processing. Most databases support stored procedures. These turn remote database access into a form of RPC, with two differences:

(1) it is a run-time rather than a compile-time interface, with no IDL equivalent
(2) the procedure is typically written in a proprietary language or, commonly, in Java.

The stored procedure approach is generally faster than remote database access. On the other hand, for ad hoc queries, remote database access is more suitable, because the returned data is too dynamic (in number, type, etc.) for an RPC-like call to handle efficiently.

There are many remote database access technologies. Microsoft has ODBC (Open Database Connectivity), OLE DB (Object Linking and Embedding Database), ADO (ActiveX Data Objects) and ADO.NET. In the Java environment, there are JDBC and JDO (Java Data Objects). Oracle has Oracle Generic Connectivity and Oracle Transparent Gateway. IBM has DRDA. Each vendor wants customers to use its technology and integration engine. Most databases, however, support ODBC and JDBC.
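
As a concrete illustration, here is a minimal sketch of dynamic SQL over ODBC in C. The DSN "sales", the credentials and the table name are hypothetical, and error handling is omitted; the point is that the SQL text is simply passed through to the server and the rows come back over the network. A stored procedure can be invoked through the same interface using the {call ...} escape.

    /* Minimal sketch of remote database access via ODBC: the SQL text is
     * passed as-is from client to server. The DSN "sales", credentials and
     * table name are hypothetical; error handling is omitted for brevity.
     * A stored procedure could be invoked the same way, e.g.
     * SQLExecDirect(stmt, (SQLCHAR *)"{call monthly_totals(?)}", SQL_NTS). */
    #include <stdio.h>
    #include <sql.h>
    #include <sqlext.h>

    int main(void)
    {
        SQLHENV env;
        SQLHDBC dbc;
        SQLHSTMT stmt;
        SQLCHAR name[64];
        SQLLEN ind;

        SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
        SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0);
        SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc);
        SQLConnect(dbc, (SQLCHAR *)"sales", SQL_NTS,
                   (SQLCHAR *)"user", SQL_NTS, (SQLCHAR *)"password", SQL_NTS);

        /* Dynamic SQL: the statement text goes to the server unmodified. */
        SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt);
        SQLExecDirect(stmt, (SQLCHAR *)"SELECT name FROM customers", SQL_NTS);

        /* Each fetch pulls result data back across the network. */
        while (SQLFetch(stmt) == SQL_SUCCESS) {
            SQLGetData(stmt, 1, SQL_C_CHAR, name, sizeof name, &ind);
            printf("%s\n", (char *)name);
        }

        SQLFreeHandle(SQL_HANDLE_STMT, stmt);
        SQLDisconnect(dbc);
        SQLFreeHandle(SQL_HANDLE_DBC, dbc);
        SQLFreeHandle(SQL_HANDLE_ENV, env);
        return 0;
    }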

Remote Procedure Call

Procedure calls are a major feature of most programming languages, so it is logical to extend them to access remote services. The idea is that both the client and the server program can remain the same, just as if they were running on the same machine.

The best-known RPC mechanisms are Open Network Computing (ONC) from Sun Microsystems (now Oracle) and the Distributed Computing Environment (DCE) from the Open Software Foundation (OSF), a group formed in the late 1980s by IBM, HP and DEC. OSF was meant to be an alternative to AT&T, which owned the UNIX brand name and had formed a rival group, called UNIX International, that included UNISYS.

For a C program, one includes a header file that contains the module's callable procedure declarations (procedure names and parameters) minus the logic. RPC uses an Interface Definition Language (IDL) file instead of a header file. An IDL file is syntactically similar to a header file, but it is also used to generate client stubs and server skeletons, which are small chunks of C code that are compiled and linked into the client and server programs.
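
For comparison, a conventional C header declares only the callable interface; the names below are hypothetical. With RPC, an IDL file plays the same role, and the generated client stub provides a local implementation of the same signature that forwards the call over the network.

    /* inventory.h - declaration only, no logic. A generated RPC client
     * stub would implement this same signature and forward the call. */
    #ifndef INVENTORY_H
    #define INVENTORY_H

    /* Returns the quantity on hand for the given part number. */
    int query_stock(const char *part_number);

    #endif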

The purpose of the stub is to convert the parameters into a string of bits and send the message over the network. The skeleton does the reverse and calls the server procedure. The act of converting parameters into a message is called marshalling. The advantage of marshalling is that it handles the differing data formats of the various platforms. The newer term for marshalling is serialization.
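
The sketch below shows roughly what a stub's marshalling step does: pack the procedure number and a string argument into a byte buffer in network byte order. The layout is illustrative only; real stub code (for example ONC RPC's XDR routines) also handles alignment, versioning and unmarshalling on the other side.

    /* Minimal marshalling sketch: pack an int and a string into a byte
     * buffer in network byte order, as a stub would before sending the
     * request message. Assumes buf is large enough for the arguments. */
    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>

    size_t marshal_request(uint8_t *buf, uint32_t proc_id, const char *arg)
    {
        size_t off = 0;
        uint32_t net_proc = htonl(proc_id);               /* procedure number */
        uint32_t net_len  = htonl((uint32_t)strlen(arg)); /* string length */

        memcpy(buf + off, &net_proc, 4); off += 4;
        memcpy(buf + off, &net_len, 4);  off += 4;
        memcpy(buf + off, arg, strlen(arg)); off += strlen(arg);

        return off;   /* number of bytes to send over the network */
    }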

The problem with RPC is multithreading. A client program blocks while it is calling a remote procedure, just as it would when calling a local procedure. The wait is subject to message loss in transit, network congestion, server response problems and other unpredictable conditions that can leave the client waiting forever. If a program must read input from the user while requesting data from the back end via RPC, two threads are required.
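
A minimal sketch of that two-thread structure follows, with a hypothetical fetch_report() standing in for the blocking remote call: one thread blocks on the call while the main thread keeps reading user input.

    /* Two-thread client sketch: one thread blocks on the (hypothetical)
     * remote call fetch_report(); the main thread stays responsive to the
     * user. A mutex protects the shared flag. Compile with -pthread. */
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static int report_ready = 0;

    /* Stand-in for the blocking RPC; a real stub could wait indefinitely. */
    static void fetch_report(void)
    {
        sleep(2);                      /* pretend the server is slow */
        pthread_mutex_lock(&lock);
        report_ready = 1;
        pthread_mutex_unlock(&lock);
    }

    static void *rpc_thread(void *arg)
    {
        (void)arg;
        fetch_report();                /* this thread blocks, not the UI */
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        char line[128];

        pthread_create(&t, NULL, rpc_thread, NULL);

        /* Main thread keeps handling user input while the call is pending. */
        while (fgets(line, sizeof line, stdin) != NULL) {
            pthread_mutex_lock(&lock);
            int done = report_ready;
            pthread_mutex_unlock(&lock);
            printf(done ? "report is ready\n" : "still waiting...\n");
            if (strncmp(line, "quit", 4) == 0)
                break;
        }

        pthread_join(t, NULL);
        return 0;
    }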

On the server side, RPC requires a separate server thread for every client connection. If threads need to share resources, the programmer must use locks, semaphores or events to synchronize access.

Multithreaded programs are hard to write and extremely difficult to test because of race conditions.

Saturday, September 18, 2010

NetBIOS Scope

A scope is defined as the set of NetBIOS nodes that participate in a virtual LAN. Each scope has a name called the Scope Identifier (Scope ID). The most common Scope ID is the empty string (""), which is the default in Windows and Samba. Scope IDs are used to identify different sets of nodes on the same IP LAN. In a broadcast or multicast operation, nodes with a different Scope ID will ignore the packet. A WINS server handles requests from any node regardless of Scope ID, so a single WINS server can support multiple scopes.

A node may belong to more than one scope. NetBIOS itself has no concept of scope; it is a feature of NBT (NetBIOS over TCP/IP). A program calling the NetBIOS API has no way to tell NBT which scope to use. Extending NetBIOS to include scope would break existing programs and create incompatibility problems with other transports (e.g. NetWare).

Scope IDs are used by the Name Service and the Datagram Service, but not the Session Service. The Scope ID identifies the virtual NetBIOS LAN and operates at a level below the NetBIOS API.

NetBIOS Session Service

This is quite straightforward when all NBT nodes are of the same type. Strange things happen when node types are mixed, especially in a routed environment.

P & B - B nodes will only see other B nodes, and P nodes will only see other P nodes using the same WINS server. In other words, separate virtual LANs are created.

P & M - M nodes perform all the functions of a P node, including registering their names with WINS. All P nodes can see all M nodes, though M nodes on the same wire can bypass WINS when resolving names.

B & M - In a single, non-routed environment, all M nodes behave like B nodes and perform registration and name resolution via the broadcast mechanism, making the use of WINS pointless. In a routed environment, B nodes on one subnet will not be able to see any nodes on other subnets; M nodes will see all other M nodes but only the B nodes on their local subnet. A potential problem is name collision: a B node may register a name that already exists in the WINS database, and an M node may register a name that is already in use by one or more nodes on remote subnets.

P, B & M - The P nodes can see all of the M nodes, which can see some of the B nodes, which cannot see any P nodes at all. The situation only gets worse.

NetBIOS Datagram Service

The Datagram Distribution Service is the NBT service that handles NetBIOS datagram transport. It runs on UDP 138 and handles unicast, multicast and broadcast NetBIOS datagrams.

For unicast delivery, the IP address is obtained from the Name Service and the NetBIOS datagram is encapsulated in a UDP packet sent to the target IP.

For multicast and broadcast delivery, a B node encapsulates the NetBIOS datagram and sends it to the IP broadcast address. The packet is picked up by all nodes on the IP segment listening on 138/UDP. In the case of multicast, nodes that are not members of the group discard the packet. For P, M or H nodes, the WINS database must be consulted to determine the distribution.
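
The sketch below shows only the socket mechanics a B node relies on for broadcast delivery: enable broadcasting and send the datagram to the broadcast address on port 138. A real NBT datagram also carries a Datagram Service header and the encoded source and destination names, which are omitted here.

    /* Sketch of handing a datagram to UDP for broadcast delivery on the
     * NBT Datagram Service port. Only the transport mechanics are shown;
     * the NBT header and encoded names are omitted. */
    #include <arpa/inet.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int send_nbt_broadcast(const void *payload, size_t len)
    {
        int sock = socket(AF_INET, SOCK_DGRAM, 0);
        int on = 1;
        struct sockaddr_in dst;
        ssize_t sent;

        if (sock < 0)
            return -1;
        setsockopt(sock, SOL_SOCKET, SO_BROADCAST, &on, sizeof on);

        memset(&dst, 0, sizeof dst);
        dst.sin_family = AF_INET;
        dst.sin_port = htons(138);                      /* Datagram Service */
        dst.sin_addr.s_addr = htonl(INADDR_BROADCAST);  /* 255.255.255.255 */

        sent = sendto(sock, payload, len, 0,
                      (struct sockaddr *)&dst, sizeof dst);
        close(sock);
        return sent < 0 ? -1 : 0;
    }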

NetBIOS Name Service

The NetBIOS LAN architecture is a simple, non-routable network: just a bunch of nodes connected to a virtual segment. There are no hardware addresses, network addresses or port numbers. Instead, each node is identified by a 16-byte string known as a NetBIOS name.

Applications can add and remove NetBIOS names dynamically. Each node also has a default name (called the machine name or workstation service name), which is added when NetBIOS starts. The process of adding a name is called registration.
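
As a small sketch of how the 16-byte name is conventionally formed: the readable name is upper-cased and space-padded to 15 bytes, and the 16th byte is used as a suffix identifying the service (0x00 is the usual workstation service suffix). The helper below is illustrative, not part of any standard API.

    /* Build a 16-byte NetBIOS name: upper-cased, space-padded to 15 bytes,
     * with the final byte used as the service suffix. */
    #include <ctype.h>
    #include <string.h>

    void make_netbios_name(char out[16], const char *name, unsigned char suffix)
    {
        size_t i;

        memset(out, ' ', 16);
        for (i = 0; i < 15 && name[i] != '\0'; i++)
            out[i] = (char)toupper((unsigned char)name[i]);
        out[15] = (char)suffix;
    }

    /* Example: make_netbios_name(buf, "fileserver1", 0x00); */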

Two kinds of names can be registered - unique and group. A group name can be shared by multiple nodes and is used for multicast operations. The Name Service keeps track of all NetBIOS names used in the virtual LAN so that traffic can be directed to the correct underlying IP address.

If all nodes are on the same IP segment, each node keeps a list of the names it has registered (i.e. owns). When sending a message, the first step is to send an IP broadcast query to locate the target node, which responds with its IP address. This is known as B (Broadcast) mode name resolution, and the participants are referred to as B nodes. In B mode, each node keeps track of only its own names, so the name service database is a distributed database.

If nodes are present on different IP segments, a machine is chosen to be the NetBIOS Name Server (NBNS), or WINS (Windows Internet Name Service) server. To use WINS, all nodes participating in the virtual NetBIOS LAN must be given the WINS server's IP address. NBT client nodes send name registrations and queries directly to WINS. This is known as P (Point-to-Point) mode name resolution, and the participants are called P nodes.

M (Mixed) mode combines the characteristics of P and B modes. H (Hybrid) mode, which was introduced later, is similar to M mode except for the order in which the B and P mode behaviours are applied.

The Name Service is implemented on UDP port 137; TCP port 137 is not used in practice.

NetBIOS

SNA (Systems Network Architecture) was too complex for the PC, so IBM hired a company called Sytek and together they created a product called PC Network. PC Network was a LAN system designed to support about 80 nodes at best, with no provision for routing. Microsoft used the NetBIOS API to transport SMB file service messages. A redirector program was created that simply looked up disk drive (e.g. S:) or port references (e.g. LPT3:) in a table. If the device was not found in the table, the call was passed down to DOS; otherwise, it was redirected.

Using SUBST, an alias can be created for a long path name (e.g. SUBST S: C:\Program). Using the NET command, a drive letter can be mapped to a remote file service (e.g. NET USE N: \\SERVER\fileshare1).

NetBIOS is a session layer (layer 5) API. The API makes a number of assumptions about the underlying network. Three basic services must be implemented - the Name Service, the Datagram Service and the Session Service.

Saturday, September 4, 2010

iSCSI

FCP recovers from frame loss by retransmitting the frame; iSCSI relies on TCP to recover lost packets. Because iSCSI runs over IP, there is no distance limit. An iSCSI fabric has no built-in intelligence like FCP. iSNS combines features of the FC SNS with DNS. Devices register with iSNS at a well-known IP address. iSNS can be implemented by an attached device or embedded in an iSCSI switch. Once the initiator and target have established a session, iSNS is no longer required in the communication.

Friday, September 3, 2010

Fibre Channel Fabric

In a conventional network, hosts are assumed to be intelligent, and the network's job is to facilitate communication between nodes and ensure data delivery. In a SAN, an end device such as a JBOD is relatively dumb. Targets are passive: they listen for requests and do not advertise themselves. The SAN must therefore provide a discovery service, which is a function of the simple name server (SNS) implemented in switches and directors.

A SAN is a self-configuring network. Switches automatically negotiate with the rest of the fabric for address allocation. A device connecting to the fabric performs a fabric logon to obtain a unique 3-byte address. The device then registers with the SNS to advertise its capabilities. Hosts can query the SNS for accessible targets.

Support services such as LUN masking, zoning and state change notification provide control over the fabric. Zoning places authorized storage devices and servers into a common communication domain; devices outside the zone cannot access those inside it. Zoning manipulates the fabric's responses to SNS queries from servers so that only authorized devices are returned.

Registered State Change Notification (RSCN) is a means of alerting hosts to the arrival or disappearance of storage assets on the fabric. Upon such an alert, hosts can query the SNS for additional devices.

SCSI Interconnect

The original SCSI interconnect was implemented with 8 data lines and a few control lines. The parallel bus architecture evolved to higher bandwidth via wider data paths and higher clock speeds. Signal skew limited the distance of the parallel SCSI interconnect. In addition, each daisy-chained SCSI string is limited to 16 SCSI IDs, which caps its capacity.

For direct-attached SCSI configurations, SAS addresses the limitations of parallel SCSI.

SCSI Architecture

SAM-2 (SCSI Architecture Model-2) defines the relationship between initiator and target. Serial SCSI implementations such as Fibre Channel, serial-attached SCSI (SAS) and iSCSI fall under the SAM-2 definition for SCSI-3 commands.

The client-server requests and responses are exchanged across some form of physical transport governed by a SCSI-3 service delivery protocol such as FC or iSCSI.

Reads and writes of data are performed with a series of SCSI commands, delivery requests, delivery actions and responses. SCSI commands and their parameters are specified in a Command Descriptor Block (CDB). The CDB is encapsulated in an FCP IU (information unit).
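
As a sketch of what a CDB looks like, the code below fills in a 10-byte WRITE(10) CDB: opcode 0x2A, a big-endian logical block address and a transfer length in blocks. The CDB would then be handed to the transport (for example, wrapped in an FCP command IU for Fibre Channel).

    /* Build a 10-byte WRITE(10) CDB. Flags, group number and control
     * bytes are left zero in this sketch. */
    #include <stdint.h>
    #include <string.h>

    void build_write10_cdb(uint8_t cdb[10], uint32_t lba, uint16_t num_blocks)
    {
        memset(cdb, 0, 10);
        cdb[0] = 0x2A;                        /* WRITE(10) opcode */
        cdb[2] = (uint8_t)(lba >> 24);        /* LBA, most significant byte first */
        cdb[3] = (uint8_t)(lba >> 16);
        cdb[4] = (uint8_t)(lba >> 8);
        cdb[5] = (uint8_t)(lba);
        cdb[7] = (uint8_t)(num_blocks >> 8);  /* transfer length in blocks */
        cdb[8] = (uint8_t)(num_blocks);
        /* byte 1 (flags), byte 6 (group) and byte 9 (control) stay zero */
    }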

The operating system views a WRITE as a single operation, but underneath there are multiple SCSI exchanges:

(1) The WRITE triggers the creation of a client in the initiator.
(2) The initiator in turn issues a SCSI command request to the target to prepare a buffer.
(3) The target's device server issues a delivery action request when the buffer is ready.
(4) The initiator sends a data block.
(5) After the data is received, the target sends another delivery action request to ask for the next data block.
(6) When all data blocks have been received, the target sends a response to mark the end of the WRITE operation.
For a READ operation, the request and response directions are reversed, with the host preparing the buffer for data coming from the disk.

SCSI LUN

SCSI (Small Computer System Interface) is implemented in a client/server model. The computer acts as the client (initiator) and the storage device acts as the server (target). The SCSI command processing entity within the storage target represents a logical unit and is assigned a logical unit number (LUN). SCSI devices are addressed by a 3-part bus/target/LUN descriptor. The bus refers to one of the several SCSI interfaces installed on the host (e.g. an HBA or an iSCSI network card). A bus supports multiple daisy-chained disks (targets). The LUN identifies a SCSI server within the target.

Operating systems expect to boot from LUN 0. For multiple servers to boot from the same disk array, the array needs to present multiple LUN 0s. This is done by mapping actual LUNs to virtual LUNs. LUN mapping also includes LUN masking, which is used to isolate sets of LUNs from other hosts or users.
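
The idea can be sketched as a per-host mapping table; the table below is purely hypothetical. Each host sees its own virtual LUN 0 mapped to a different physical LUN, and anything not listed for that host is masked (invisible).

    /* Hypothetical LUN mapping/masking table: each host gets its own
     * virtual LUN 0, and LUNs not listed for a host are masked. */
    #include <string.h>

    struct lun_map_entry {
        const char *host;        /* initiator identity (illustrative strings) */
        unsigned    virtual_lun; /* LUN presented to the host */
        unsigned    physical_lun;/* LUN inside the array */
    };

    static const struct lun_map_entry lun_map[] = {
        { "host-A", 0, 12 },     /* host A boots from its "LUN 0" = physical 12 */
        { "host-B", 0, 13 },     /* host B boots from its "LUN 0" = physical 13 */
        { "host-B", 1, 27 },
    };

    /* Returns the physical LUN, or -1 if the LUN is masked for this host. */
    int resolve_lun(const char *host, unsigned virtual_lun)
    {
        size_t i;
        for (i = 0; i < sizeof lun_map / sizeof lun_map[0]; i++)
            if (strcmp(lun_map[i].host, host) == 0 &&
                lun_map[i].virtual_lun == virtual_lun)
                return (int)lun_map[i].physical_lun;
        return -1;  /* masked: not visible to this host */
    }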