Sunday, October 5, 2014

Linux files and directories

A file (more precisely, a regular file) is a stream of bytes.  Unlike some other operating systems, Linux imposes no structure on a file's contents.  The maximum length of a file is bounded by the C type used to store the file position (or offset).  The length of a file can be changed by truncating it.  Truncation can cut the file short or make it longer.  In the latter case, the region from the original EOF to the new end point is filled with zero bytes.
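The effect of truncation in both directions can be seen with a small Python sketch.  It assumes a Unix-like system and a throwaway temporary file; the function name truncate_demo is made up for illustration.

```python
import os
import tempfile

def truncate_demo():
    """Shrink, then grow, a scratch file; return the bytes seen after each step."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        f.write(b"hello world")          # file is 11 bytes long
    try:
        os.truncate(path, 5)             # cut the file short
        with open(path, "rb") as f:
            shorter = f.read()
        os.truncate(path, 8)             # extend past the current EOF
        with open(path, "rb") as f:
            longer = f.read()
        return shorter, longer
    finally:
        os.unlink(path)                  # clean up the scratch file

shorter, longer = truncate_demo()
print(shorter)   # b'hello'
print(longer)    # b'hello\x00\x00\x00'
```

The three bytes added by the second call are zero bytes, as described above.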

A file can be opened multiple times, by different processes or even by the same process.  Linux does not regulate concurrent access by different processes; it is up to the processes to coordinate among themselves.
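As a rough illustration, the sketch below opens the same file twice from one process.  It relies on a detail not spelled out above: each open() call yields its own file offset, so the two descriptors read independently.  The function name is hypothetical.

```python
import os
import tempfile

def independent_offsets():
    """Open one file twice and show that each descriptor has its own offset."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"abcdef")
        os.close(fd)
        fd1 = os.open(path, os.O_RDONLY)   # first open of the file
        fd2 = os.open(path, os.O_RDONLY)   # second, independent open
        try:
            first = os.read(fd1, 3)        # advances only fd1's offset
            second = os.read(fd2, 3)       # fd2 still starts at offset 0
            third = os.read(fd1, 3)        # fd1 continues where it left off
        finally:
            os.close(fd1)
            os.close(fd2)
        return first, second, third
    finally:
        os.unlink(path)

print(independent_offsets())   # (b'abc', b'abc', b'def')
```

Nothing in the kernel stops the two descriptors from interfering if they write; coordination is left to the program.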

A file is referenced by its inode (information node).  An inode is identified by a number that is unique only within its file system.  The inode contains metadata about the file, such as timestamps, owner, type, length and the location of its data on disk.  The filename is not stored in the inode.  Inodes are stored physically on disk.
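Most of this metadata is visible from user space through stat().  The sketch below collects a few of the fields using Python's os.stat; the helper name inode_info and the choice of fields are illustrative only.

```python
import os
import stat

def inode_info(path):
    """Return a few inode fields for path, as reported by stat()."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                      # inode number
        "device": st.st_dev,                     # file system the inode belongs to
        "is_regular": stat.S_ISREG(st.st_mode),  # file type is encoded in st_mode
        "is_directory": stat.S_ISDIR(st.st_mode),
        "owner_uid": st.st_uid,
        "size": st.st_size,                      # length in bytes
        "mtime": st.st_mtime,                    # last modification timestamp
    }

info = inode_info(os.getcwd())
print(info["inode"], info["is_directory"])
```

Note that the result contains no filename: the name lives in the directory, not in the inode.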

Directories provide the names that users use to access files.  A directory maps file names to inode numbers.  Each name-inode pair is called a link.  Links are stored physically on disk as a table.  Conceptually, a directory is also a file.
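Because a name is only a link to an inode, one inode can have several names.  The following sketch creates a second (hard) link with os.link and checks that both names resolve to the same inode.  The file names are made up, and it assumes the temporary directory sits on a file system that supports hard links.

```python
import os
import tempfile

def link_demo():
    """Give one inode two names, then remove the first name."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        alias = os.path.join(d, "alias.txt")
        with open(original, "w") as f:
            f.write("data")
        os.link(original, alias)                 # add a second name-inode pair
        a, b = os.stat(original), os.stat(alias)
        same_inode = a.st_ino == b.st_ino        # both names map to one inode
        link_count = a.st_nlink                  # number of links to the inode
        os.unlink(original)                      # remove one name only
        with open(alias) as f:
            survived = f.read()                  # data is still reachable
        return same_inode, link_count, survived

print(link_demo())   # (True, 2, 'data')
```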

When a file is referenced by pathname, the kernel walks the path one component at a time, looking up each directory entry (dentry) to find the inode of the next level.  The kernel caches dentries in the dentry cache to speed up future lookups.
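The walk can be imitated from user space, one component at a time.  This is only a sketch of the idea, not how the kernel implements it: it does not model the dentry cache, the function name walk_path is hypothetical, and it assumes a Unix-like system where os.stat and os.open accept dir_fd.

```python
import os
import stat

def walk_path(path):
    """Resolve an absolute path component by component.

    Returns a list of (name, inode number) pairs, one per component.
    """
    # O_PATH (Linux) needs only search permission; fall back to O_RDONLY elsewhere.
    flags = getattr(os, "O_PATH", os.O_RDONLY)
    names = [p for p in path.split("/") if p]
    steps = []
    fd = os.open("/", flags)                       # start at the root directory
    try:
        for i, name in enumerate(names):
            st = os.stat(name, dir_fd=fd)          # look the name up in the current directory
            steps.append((name, st.st_ino))
            if i < len(names) - 1:                 # more components follow
                if not stat.S_ISDIR(st.st_mode):
                    raise NotADirectoryError(name)
                next_fd = os.open(name, flags, dir_fd=fd)
                os.close(fd)
                fd = next_fd                       # descend one level
    finally:
        os.close(fd)
    return steps

for name, ino in walk_path(os.getcwd()):
    print(name, ino)
```

Each iteration is one lookup of a name in a directory, which is the step the dentry cache is meant to make cheap.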

Although a directory is conceptually a file, the Linux kernel does not allow it to be manipulated with the usual file operations (e.g. read, write).  Directories are processed by their own set of system calls.
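A short sketch of this behaviour on Linux: a directory can be opened read-only, but read() on the resulting descriptor fails with EISDIR, which Python surfaces as IsADirectoryError.  Listing the entries goes through a dedicated interface instead, here os.scandir.  The names used are illustrative.

```python
import os
import tempfile

def directory_demo():
    """Try read() on a directory, then list it through the directory interface."""
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "a.txt"), "w").close()
        os.mkdir(os.path.join(d, "sub"))

        fd = os.open(d, os.O_RDONLY)          # opening the directory itself works
        try:
            try:
                os.read(fd, 16)               # ordinary read is refused
                read_failed = False
            except IsADirectoryError:
                read_failed = True
        finally:
            os.close(fd)

        with os.scandir(d) as entries:        # directory-specific listing
            names = sorted(e.name for e in entries)
        return read_failed, names

print(directory_demo())   # (True, ['a.txt', 'sub'])
```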
