Saturday, November 23, 2013

C Pointers

A pointer is a value that holds the address of a location in memory.

int* pA; // declares pA as a pointer to an integer

The position of "*" is flexible.  You can also declare a pointer like this:

int * pA or int *pA

The form int* is more natural because it makes the type of pA, int*, easy to read.  Note, however, that the "*" binds to the name, so int* pA, pB declares only pA as a pointer.

To initialize a pointer to NULL,

int *pA = 0 or int *pA = NULL

Note that *pA, when used outside of a declaration, means the thing that pointer pA points to.  This lets you access the item at the far end of the pointer; applying "*" in this way is called dereferencing the pointer.

An object is implemented using a pointer.  However, you never dereference an object when you use it.  You just use the name of the object.

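A pointer of type void* is a generic pointer.  It can point to any object.  Effectively, a pointer to void bypasses type checking.  Arrays and strings in C are closely tied to pointers: an array name decays to a pointer to its first element in most expressions, and a string is an array of char terminated by '\0'.

To obtain a pointer to a variable, use the address-of (&) operator.  For example,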
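As a rough illustration of the two points above, here is a minimal sketch.  The names copy_object and sum_ints are made up for this example.  The first function shows that void* accepts a pointer to anything, and the second shows an array being passed as a pointer to its first element.

```c
#include <stddef.h>
#include <string.h>

/* Generic helper (hypothetical name): void* accepts any object pointer,
   so the compiler cannot check that both buffers hold the same type. */
void copy_object(void *dst, const void *src, size_t size)
{
    memcpy(dst, src, size);
}

/* An array argument decays to a pointer to its first element,
   so the element count has to be passed separately. */
int sum_ints(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];   /* same as *(values + i) */
    }
    return total;
}
```

Calling sum_ints(nums, 4) with int nums[] = {1, 2, 3, 4} passes the address of nums[0]; no copy of the array is made.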

int result = 0;
pA = &result;

int** ptr; // declare a pointer to a pointer
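The declarations above can be put together in a short sketch.  The helper names set_value and read_through are assumptions made for this example only.

```c
#include <stddef.h>

/* Writes a new value through a pointer; the caller's variable changes. */
void set_value(int *p, int value)
{
    if (p != NULL) {      /* never dereference a NULL pointer */
        *p = value;
    }
}

/* Follows two levels of indirection to read the int at the far end. */
int read_through(int **pp)
{
    return **pp;
}
```

With int result = 0; int *pA = &result; a call to set_value(pA, 42) changes result itself, and read_through(&pA) then reads the same value back through a pointer to a pointer.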

To create a pointer to a function, just use the function name without the parameter list.  For example,

int square(int a, int b);
&square (or simply square) is a pointer to the function square
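A small sketch of how such a pointer might be declared and called.  The functions add, multiply and apply are hypothetical and exist only for this example.

```c
/* Two ordinary functions with the same signature. */
int add(int a, int b)      { return a + b; }
int multiply(int a, int b) { return a * b; }

/* op is a pointer to any function taking two ints and returning an int. */
int apply(int (*op)(int, int), int a, int b)
{
    return op(a, b);   /* calling through the pointer; (*op)(a, b) is equivalent */
}
```

Both apply(add, 3, 4) and apply(&multiply, 3, 4) are valid, since a function name converts to a pointer to that function.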

Address Conversion

(1) Printable to Numeric: int inet_pton(int addressFamily, const char *src, void *dst)

(2) Numeric to Printable: const char* inet_ntop(int addressFamily, const void *src, char *dst, socklen_t dstBytes)

dst is a pointer to a block of memory allocated by the caller to hold the result.  The required size of the block is determined by the address family.

For example,

struct sockaddr_in servAddr;
int result = inet_pton(AF_INET, servIP, &servAddr.sin_addr.s_addr);

struct sockaddr_in clientAddr;
char clientName[INET_ADDRSTRLEN];  // INET6_ADDRSTRLEN for IPV6
const char *paddr = inet_ntop(AF_INET, &clientAddr.sin_addr.s_addr, clientName, sizeof(clientName));
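The two conversions can be combined into a round trip.  The sketch below assumes a POSIX system; roundtrip_ipv4 is a made-up helper name, and the return-value checks follow the documented behaviour of inet_pton and inet_ntop.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Converts dotted-quad text to binary and back again.
   Returns 0 on success, -1 if the text is not a valid IPV4 address
   or the output buffer is too small. */
int roundtrip_ipv4(const char *text, char *out, socklen_t outLen)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));

    /* inet_pton returns 1 on success, 0 for malformed text, -1 on error. */
    if (inet_pton(AF_INET, text, &addr.sin_addr) != 1) {
        return -1;
    }
    /* inet_ntop returns NULL on failure, otherwise a pointer to out. */
    if (inet_ntop(AF_INET, &addr.sin_addr, out, outLen) == NULL) {
        return -1;
    }
    return 0;
}
```

Checking the return values matters because inet_pton reports malformed text with 0 rather than with a negative value.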


sockaddr structure

The socket API specifies a generic data type called sockaddr for use by API calls.

struct sockaddr {
    sa_family_t sa_family;  // address family e.g. AF_INET or AF_INET6
    char sa_data[14];    // address info - a blob of bits to handle different OSes and networks
};

Note that this sockaddr structure is not large enough to hold an IPV6 address, which is 16 bytes long.  The actual data structures used in socket calls are sockaddr_in (for IPV4) and sockaddr_in6 (for IPV6).  They are simply more detailed layouts of sockaddr.

struct in_addr { uint32_t s_addr; }; // 4-byte IPV4 address

struct sockaddr_in {
    sa_family_t sin_family;  //address family AF_INET
    in_port_t sin_port;    //16-bit port
    struct in_addr sin_addr;
    char sin_zero[8];    //padding
};

struct in6_addr { uint8_t s6_addr[16]; };  //128-bit address (16 bytes)

struct sockaddr_in6 {
    sa_family_t sin6_family;  //address family AF_INET6
    in_port_t sin6_port;    //16-bit port
    uint32_t sin6_flowinfo;  //flow info
    struct in6_addr sin6_addr;
    uint32_t sin6_scope_id;    //scope ID
};

The structure is cast to (struct sockaddr *) when used.  For example,

result = bind(servSock, (struct sockaddr*) &servAddr, sizeof(servAddr));
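Before such a call, the address structure has to be filled in.  The following is a minimal sketch assuming a POSIX system; fill_ipv4_address and generic_family are hypothetical helpers, with the second one standing in for an API that only sees the generic struct sockaddr view.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Fills an IPV4 socket address.  Returns 0 on success, -1 for a bad IP string. */
int fill_ipv4_address(struct sockaddr_in *addr, const char *ip, uint16_t port)
{
    memset(addr, 0, sizeof(*addr));      /* also clears the sin_zero padding */
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);        /* port is stored in network byte order */
    return (inet_pton(AF_INET, ip, &addr->sin_addr) == 1) ? 0 : -1;
}

/* A function taking the generic type reads the family field first. */
int generic_family(const struct sockaddr *sa)
{
    return sa->sa_family;
}
```

A call such as generic_family((struct sockaddr *) &servAddr) uses the same cast as the bind example; it works because every sockaddr variant begins with the address family.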

As sockaddr is not big enough to hold an IPV6 address, programs that must handle either family allocate space using a sockaddr_storage structure

struct sockaddr_storage { sa_family_t .... } ;  //the leading family field (ss_family) is used to determine the actual address type.

struct sockaddr_storage sockAddr;
:
:
switch (sockAddr.ss_family) {
    case AF_INET: ...
    case AF_INET6: ...
    default: ...
}
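One way the branches of such a switch might be filled in is sketched below, assuming a POSIX system.  The function name port_of is made up for this example; it extracts the port from whichever address type is stored.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Returns the port in host byte order, or -1 for an unknown address family. */
int port_of(const struct sockaddr_storage *ss)
{
    switch (ss->ss_family) {
    case AF_INET:
        /* The storage actually holds a sockaddr_in. */
        return ntohs(((const struct sockaddr_in *) ss)->sin_port);
    case AF_INET6:
        /* The storage actually holds a sockaddr_in6. */
        return ntohs(((const struct sockaddr_in6 *) ss)->sin6_port);
    default:
        return -1;
    }
}
```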

Sunday, November 17, 2013

Socket

It is a general abstraction through which programs send and receive data.  Different types of socket correspond to different underlying protocol suites and different stacks of protocols within a suite.

The main types of TCP/IP socket are stream sockets and datagram sockets.  A stream socket represents one end of a TCP connection.  It consists of an IP address, a port number and the end-to-end protocol (TCP).

A socket is created by a socket call which returns a handle to the socket:

int socket(int domain, int type, int protocol)

"Domain" refers to the communication domain.  Recall that the socket API is a generic interface for a large number of communication domains (e.g. AF_INET for IPV4 and AF_INET6 for IPV6).

HSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)

"Type" determines the semantics of data transmission with the socket, for example whether the transmission is reliable or whether message boundaries are preserved.  Common values are SOCK_STREAM and SOCK_DGRAM.

"Protocol" refers to the end-to-end protocol to be used.  Common values are IPPROTO_TCP and IPPROTO_UDP.  A value of 0 means use the default protocol for the given "Type".

The close() call closes the socket.
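The create and close calls can be exercised together.  This is a minimal sketch assuming a POSIX system; try_socket is a hypothetical helper that only checks whether a socket of the requested kind can be created.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates an IPV4 socket of the given type and protocol, then closes it.
   Returns 0 on success and -1 if either call fails. */
int try_socket(int type, int protocol)
{
    int sock = socket(AF_INET, type, protocol);   /* handle, or -1 on error */
    if (sock < 0) {
        return -1;
    }
    return (close(sock) == 0) ? 0 : -1;
}
```

For instance, try_socket(SOCK_STREAM, IPPROTO_TCP) asks for a TCP socket explicitly, while try_socket(SOCK_DGRAM, 0) lets the system pick the default protocol for datagram sockets.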

Special Network Addresses

(1) Loopback address 
It is assigned to a loopback interface which is a virtual device that echoes transmitted packets back to the sender.  For IPV4, it is 127.0.0.1 and for IPV6, it is ::1.

(2) Private addresses
This group of addresses is for use by sites that connect to the Internet via NAT.  These addresses cannot be reached from the global Internet.  For IPV4, they start with 10, 192.168 or 172.16 through 172.31.  IPV6 has no exact counterpart, although unique local addresses (FC00::/7) serve a similar private role.

(3) Link Local or Autoconfiguration addresses
These addresses can only be used to communicate with hosts on the same network.  Routers will not forward packets addressed to them.  For IPV4, they start with 169.254.  For IPV6, they start with FE8, FE9, FEA or FEB (the FE80::/10 block).

(4) Multicast addresses
For IPV4, they start with 224 through 239.  For IPV6, they start with FF.
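The IPV4 ranges listed above can be checked with a few comparisons.  The following is a sketch only, assuming a POSIX system; classify_ipv4 and its return strings are made up for this example, and IPV6 is not handled.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/* Returns a short label for the special range an IPV4 address falls in. */
const char *classify_ipv4(const char *text)
{
    struct in_addr addr;
    if (inet_pton(AF_INET, text, &addr) != 1) {
        return "invalid";
    }
    uint32_t ip = ntohl(addr.s_addr);        /* host byte order for comparisons */
    uint32_t first = ip >> 24;               /* first octet */
    uint32_t second = (ip >> 16) & 0xFF;     /* second octet */

    if (first == 127) return "loopback";
    if (first == 10) return "private";
    if (first == 172 && second >= 16 && second <= 31) return "private";
    if (first == 192 && second == 168) return "private";
    if (first == 169 && second == 254) return "link-local";
    if (first >= 224 && first <= 239) return "multicast";
    return "other";
}
```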

JVM

It has a stack-based architecture without general-purpose registers.  This allows the JVM to run the same code regardless of the underlying hardware, because real hardware machines differ in the number and size of their registers and in how those registers relate to memory.  The only register-like structure is the program counter.  The result of a method call is returned on the stack.
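To make the idea of a stack-based machine concrete, here is a toy interpreter in C.  It is not JVM bytecode: the opcodes OP_PUSH, OP_ADD, OP_MUL and OP_RET and the function run_stack_machine are invented for this sketch.  It only shows operands living on a stack with a program counter as the sole register-like state.

```c
#include <stddef.h>

enum { OP_PUSH, OP_ADD, OP_MUL, OP_RET };   /* hypothetical opcodes */

/* Runs a tiny program and returns the value on top of the stack.
   Returns 0 for a malformed program (stack underflow or overflow). */
int run_stack_machine(const int *code, size_t length)
{
    int stack[16];
    size_t sp = 0;    /* number of values currently on the operand stack */
    size_t pc = 0;    /* program counter */

    while (pc < length) {
        switch (code[pc++]) {
        case OP_PUSH:                         /* operand follows the opcode */
            if (pc >= length || sp >= 16) return 0;
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:                          /* pop two values, push the sum */
            if (sp < 2) return 0;
            sp--;
            stack[sp - 1] = stack[sp - 1] + stack[sp];
            break;
        case OP_MUL:                          /* pop two values, push the product */
            if (sp < 2) return 0;
            sp--;
            stack[sp - 1] = stack[sp - 1] * stack[sp];
            break;
        case OP_RET:                          /* result is returned on the stack */
            return (sp > 0) ? stack[sp - 1] : 0;
        default:
            return 0;
        }
    }
    return (sp > 0) ? stack[sp - 1] : 0;
}
```

The program { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_RET } computes (2 + 3) * 4 without naming any register.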

Mutex

Mutexes are referred to as mutants in the kernel.  Mutexes are global objects for synchronizing execution.  Mutex names are usually hard-coded because the name must be consistent when the mutex is used by 2 processes or threads.  Only one thread can own a mutex at any one time.  A thread gains access to a mutex using WaitForSingleObject, and the ReleaseMutex call releases the mutex after use.  The CreateMutex function creates a mutex.  Another process or thread uses OpenMutex to obtain a handle to the mutex before using it.

First and Second Chance Exceptions

Debuggers are given 2 chances to handle an exception raised by the program being debugged.  When an exception occurs, execution of the program stops and the debugger is given a first chance to handle the exception.  The debugger can handle it or choose to pass it on to the program.  In the latter case, the program's registered exception handler is given control.

If the program does not handle the exception, the debugger is given a second chance to handle the exception.  If there is no debugger attached, the program will usually crash at this point.  The debugger must resolve the exception to enable the program to continue to run.

Breakpoint

Software breakpoints are implemented by overwriting the instruction at the break location with 0xCC, which is an INT 3 instruction.  This passes control to the debugger when execution reaches that point.  The debugger will show the original instruction from before patching, but if one inspects the memory, the byte has changed to INT 3.

Software breakpoints may not work when the code is self-modifying (e.g. malware).  In this case, the patch may be overwritten and the breakpoint will not be effective.
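The patching described above can be simulated on a plain byte buffer.  This is an illustration only, not how any particular debugger is written; the Breakpoint type and the functions set_breakpoint, clear_breakpoint and displayed_byte are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC

typedef struct {
    size_t offset;       /* where the patch was applied */
    uint8_t original;    /* byte that was overwritten */
} Breakpoint;

/* Saves the original byte and overwrites it with INT 3. */
Breakpoint set_breakpoint(uint8_t *code, size_t offset)
{
    Breakpoint bp = { offset, code[offset] };
    code[offset] = INT3_OPCODE;
    return bp;
}

/* Restores the byte saved when the breakpoint was set. */
void clear_breakpoint(uint8_t *code, Breakpoint bp)
{
    code[bp.offset] = bp.original;
}

/* What a debugger would display: the saved byte, not the 0xCC in memory. */
uint8_t displayed_byte(const uint8_t *code, size_t offset, Breakpoint bp)
{
    return (offset == bp.offset) ? bp.original : code[offset];
}
```

If the program rewrites code[offset] itself after set_breakpoint has run, the 0xCC is lost, which is the self-modifying code problem mentioned above.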

Hardware breakpoints are assisted by hardware.  For each instruction being executed, the hardware compares the address with dedicated debug registers to determine whether a breakpoint has been reached.  One major drawback is that x86 has only 4 breakpoint address registers.  DR0 to DR3 store the addresses of the breakpoints.  DR7 is the control register, which indicates whether each of DR0-3 is active and whether its address represents an execute, write or read/write breakpoint.  Write and read/write breakpoints let the debugger break when a memory address is accessed rather than when an instruction is executed.

To protect the debug registers from being modified by malware, set the General Detect flag in DR7.  The processor will then break before any mov instruction that accesses the debug registers.

A conditional breakpoint breaks only when a predefined condition is met, for example when the second parameter of a function has a particular value.  This makes it possible to stop at a frequently executed location only when the condition of interest holds.  Conditional breakpoints are implemented as software breakpoints.
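The DR7 layout can be illustrated by computing the control value in ordinary C.  This sketch only builds the number; actually loading it into DR7 requires kernel mode or a debugging API and is not shown.  The function dr7_enable_local and the constant names are assumptions for this example, while the bit positions (local enable bits at 0, 2, 4 and 6, General Detect at bit 13, and R/W and LEN fields starting at bit 16) follow the Intel documentation.

```c
#include <stdint.h>

enum { BP_EXECUTE = 0, BP_WRITE = 1, BP_READWRITE = 3 };   /* R/W field encodings */
enum { BP_LEN_1 = 0, BP_LEN_2 = 1, BP_LEN_4 = 3 };         /* LEN field encodings */
#define DR7_GENERAL_DETECT (1u << 13)

/* Returns dr7 with breakpoint slot `index` (0-3) enabled locally.
   Execute breakpoints should use BP_LEN_1. */
uint32_t dr7_enable_local(uint32_t dr7, unsigned index, unsigned rw, unsigned len)
{
    if (index > 3) {
        return dr7;                              /* only DR0-DR3 hold addresses */
    }
    unsigned shift = 16 + index * 4;             /* start of this slot's R/W field */
    dr7 &= ~(0xFu << shift);                     /* clear old R/W and LEN bits */
    dr7 |= (uint32_t)(rw & 3u) << shift;         /* breakpoint type */
    dr7 |= (uint32_t)(len & 3u) << (shift + 2);  /* watched length */
    dr7 |= 1u << (index * 2);                    /* L0..L3 local enable bit */
    return dr7;
}
```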

Stack Layout

ESP points to the top of the stack.  EBP usually does not change during a call, so it provides a reference point for accessing local variables and arguments by offset.

(1) Arguments are pushed onto the stack first
(2) Next, the return address is pushed automatically by the CALL instruction
(3) The old EBP is pushed next
(4) Lastly, space for the local variables is allocated

pusha and pushad push the 16-bit and 32-bit general-purpose registers, respectively, onto the stack.  For pushad the order is EAX, ECX, EDX, EBX, ESP (its value before the instruction), EBP, ESI and EDI.

ESP always points to the top element in the stack.

NOP (Intel)

It is actually an XCHG EAX,EAX instruction.  The opcode is 0x90.  NOP is commonly seen in buffer overflow exploits when the exact address of the injected code can only be approximated.  Padding the code with a series of NOPs (a NOP sled) lets a jump that lands anywhere in the padding slide forward into the code.

Windows Thread

Threads share the address space of the process.  Each thread has its own stack and registers.  When the OS switches threads, the CPU context is stored in a structure called the thread context.

The CreateThread function creates a new thread.  The call specifies the start address of the code to be executed.  If the start address is the LoadLibrary function, DllMain will be executed after the DLL is loaded.

Windows Network API

Windows provides Berkeley-compatible sockets that function similarly to those on UNIX.  They are implemented in the Winsock libraries, primarily in ws2_32.dll.  Common socket functions:

socket - create a socket
bind - attach a socket to a local address and port
listen - mark a bound socket as listening for incoming connections
accept - accept an incoming connection from a remote socket
connect - open a connection to a remote socket which is waiting for a connection
recv - receive data
send - send data

Prior to using these functions, the WSAStartup function must be called to load the network library and allocate resources.

WinINet is a higher-level API which implements the HTTP and FTP protocols.  It is implemented in Wininet.dll.

InternetOpen - initialize a connection to the Internet
InternetOpenUrl - open a connection to an HTTP or FTP site
InternetReadFile - read data from a file opened on the site

reg File

A file with the .reg suffix is a readable text file.  When the user double-clicks the file, its content is merged into the registry.  For example, the following adds a program that runs automatically when Windows starts:

Windows Registry Editor Version x.xx

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run]
"abcvalue"="C:\\abc.exe"

Alternate Data Stream (ADS)

It is an NTFS feature that allows additional data to be attached to an existing file, essentially adding 1 file to another.  The extra data does not show up in a DIR command listing, and it is not visible when the file is browsed or edited.  A program can access the stream via the name file.txt:Stream:$DATA

Long Pointer (LP)

Strings are usually given an lp prefix (e.g. lpStr1) because they are really pointers to the memory location where the string starts.  LP is 32-bit.  P (pointer) is the same as LP on 32-bit systems.  The two differ only on 16-bit systems.

Windows Handles

Like pointers, handles refer to an object or memory location.  However, handles cannot be used in arithmetic operations and they do not always represent memory addresses.  They can only be passed to function calls to refer to the same object.
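The difference between a handle and a pointer can be illustrated with a toy handle table.  This is not how Windows implements handles; the Handle type, the table and the functions open_object, object_name and close_object are all invented for this sketch.  The point is that the handle is just an index that only the owning API knows how to resolve.

```c
#include <stddef.h>

#define MAX_OBJECTS 8
#define INVALID_HANDLE 0u

typedef unsigned int Handle;                     /* hypothetical opaque handle type */

static const char *object_table[MAX_OBJECTS];    /* private to the "API" */

/* Registers an object and returns a handle, or INVALID_HANDLE if the table is full. */
Handle open_object(const char *name)
{
    for (unsigned int i = 0; i < MAX_OBJECTS; i++) {
        if (object_table[i] == NULL) {
            object_table[i] = name;
            return i + 1;                        /* an index, not a memory address */
        }
    }
    return INVALID_HANDLE;
}

/* Resolves a handle back to its object, or NULL if the handle is not valid. */
const char *object_name(Handle h)
{
    if (h == INVALID_HANDLE || h > MAX_OBJECTS) {
        return NULL;
    }
    return object_table[h - 1];
}

/* Frees the table slot so the handle no longer refers to anything. */
void close_object(Handle h)
{
    if (h != INVALID_HANDLE && h <= MAX_OBJECTS) {
        object_table[h - 1] = NULL;
    }
}
```

Adding 1 to such a handle does not yield "the next object" in any meaningful sense, which is why handle arithmetic is not useful.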