Saturday, October 11, 2014

Synchronous I/O in Linux

The int fsync(int fd) system call flushes a file's data to the disk synchronously.  The call flushes both the data and the metadata, such as timestamps and other attributes in the inode.

The int fdatasync(int fd) system call flushes the data and only the metadata (e.g. file size) that is required to access the file in the future.
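As a rough illustration of the two calls, here is a minimal sketch in C. The helper name write_durable and its data_only parameter are made up for this example; only open(), write(), fsync(), fdatasync() and close() are real system calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write buf to path and flush it to disk before
 * returning. If data_only is non-zero it uses fdatasync(), otherwise
 * fsync(). Returns 0 on success, -1 on any error. */
int write_durable(const char *path, const char *buf, int data_only)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(buf);
    ssize_t n = write(fd, buf, len);
    if (n < 0 || (size_t)n != len) {   /* treat a short write as an error */
        close(fd);
        return -1;
    }

    /* Block until the kernel reports the data has reached the device. */
    int rc = data_only ? fdatasync(fd) : fsync(fd);
    if (rc != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Calling write_durable("/tmp/example.txt", "hello\n", 1) would use fdatasync(); passing 0 as the last argument would use fsync() instead.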

The sync() call flushes buffered data for all files on the system, not just for one fd.

Alternatively, passing O_SYNC to the OPEN call makes every WRITE on that fd a synchronized IO.  It is like forcing an fsync() call after each WRITE, but Linux implements this more efficiently.
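A minimal sketch of the O_SYNC approach in C follows. The helper name append_sync is hypothetical; the point is only that the flag goes into open() and no explicit flush call is needed afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: append one line to path using a descriptor opened
 * with O_SYNC, so write() itself does not return until the data has been
 * flushed. Returns 0 on success, -1 on any error. */
int append_sync(const char *path, const char *line)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND | O_SYNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(line);
    ssize_t n = write(fd, line, len);   /* no fsync() needed after this */
    if (n < 0 || (size_t)n != len) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```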

Specifying O_SYNC increases the CPU time spent in WRITE and the elapsed time of the process, because the IO wait time is included in every call. Using fsync() or fdatasync() carries comparatively less overhead, as the program can make these calls at specific logical points rather than after every IO.
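To make the trade-off concrete, here is a minimal sketch in C of flushing once at a logical point instead of on every IO. The helper name write_batch is hypothetical, and what counts as a "logical point" (here, the end of a batch of lines) is an assumption that depends on the application.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: append count lines to path with ordinary buffered
 * write() calls, then flush them all with a single fdatasync() at the end.
 * Returns 0 on success, -1 on any error. */
int write_batch(const char *path, const char *const *lines, size_t count)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);   /* no O_SYNC */
    if (fd < 0)
        return -1;

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(lines[i]);
        ssize_t n = write(fd, lines[i], len);   /* returns without waiting for the disk */
        if (n < 0 || (size_t)n != len) {
            close(fd);
            return -1;
        }
    }

    /* One flush for the whole batch instead of one per write(). */
    if (fdatasync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

With O_SYNC the same batch would wait for the disk once per line; here it waits once per batch.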

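POSIX also defines the O_DSYNC and O_RSYNC flags for OPEN. Before kernel 2.6.33, Linux defined both of these flags as O_SYNC; since then O_DSYNC has its own value, while O_RSYNC is still defined as O_SYNC.  By definition in POSIX, O_DSYNC makes each WRITE behave as if it were followed by fdatasync().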

O_RSYNC extends the synchronization to READ as well as WRITE.  READ is already always synchronized in the sense that it will not return until some data is available for the caller.  O_RSYNC additionally stipulates that the metadata (file access time) associated with the READ call must be written to disk before READ returns.  Although this behaviour does not match O_SYNC, Linux defines O_RSYNC as O_SYNC.
