Friday, September 5, 2014

Process Creation


Linux implements fork() using the clone() system call.  The call is passed flags that specify which resources are shared between the parent and the child process.  The fork(), vfork() and __clone() library calls all invoke clone() with the appropriate flags.  clone() invokes do_fork(), which does the bulk of the work.  do_fork() calls copy_process(), which in turn calls

  • dup_task_struct() to create a new kernel stack, thread_info and task_struct for the new process.  At this point the parent and child process descriptors are identical.  Various fields are then cleared or set to initial values.  The task state is set to TASK_UNINTERRUPTIBLE to make sure the child does not run yet.
  • copy_flags() to update the flags member of the task_struct.  It clears PF_SUPERPRIV, which records whether a task has used superuser privileges, and sets PF_FORKNOEXEC to indicate that the process has not called exec() yet.
  • get_pid() to assign the next available PID to the new process.
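
The flag handling in the second step is plain bit manipulation.  Here is a minimal user-space model of it, not kernel code: the struct toy_task type and toy_copy_flags() are made-up names, and the two bit values are illustrative assumptions rather than definitions taken from a particular kernel version.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values (assumptions, not tied to a kernel version). */
#define PF_FORKNOEXEC 0x00000040UL /* forked but has not called exec() */
#define PF_SUPERPRIV  0x00000100UL /* has used superuser privileges */

/* Hypothetical stand-in for the flags member of task_struct. */
struct toy_task {
    unsigned long flags;
};

/* Model of the adjustments made to a freshly duplicated child. */
void toy_copy_flags(struct toy_task *child)
{
    assert(child != NULL);

    /* The child starts with an exact copy of the parent's flags. */
    unsigned long new_flags = child->flags;

    new_flags &= ~PF_SUPERPRIV;  /* the child has not used superuser privileges */
    new_flags |= PF_FORKNOEXEC;  /* the child has not called exec() yet */

    child->flags = new_flags;
}
```

Every other bit the child inherited from the parent is left alone; only these two are touched.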

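Depending on the flags passed to clone(), copy_process() then either duplicates or shares open files, filesystem information, signal handlers, the process address space and the namespace.  The remaining timeslice is split between the parent and the child.  copy_process() returns a pointer to the child's task_struct to do_fork().  The new child is then woken up and run ahead of the parent.  The child usually calls exec() right away, so running it first avoids the copy-on-write overhead that would be incurred if the parent ran first and started writing to the address space.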
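
The effect of the flags can be seen from user space through the glibc clone() wrapper.  The sketch below assumes Linux with glibc; value_seen_by_parent(), child_fn() and the 1 MiB stack size are names and choices made up for this example.  The child writes to a global variable, and the parent sees the write only if CLONE_VM made the two share an address space.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (1024 * 1024) /* arbitrary size for this sketch */

static volatile int shared_value; /* written by the child */

/* Entry point of the child: store the value passed in by the parent. */
static int child_fn(void *arg)
{
    shared_value = *(int *)arg;
    return 0; /* becomes the child's exit status */
}

/*
 * Create a child with clone(), wait for it, and return what the parent
 * then sees in shared_value.  Returns -1 on error.  value must be
 * positive so it can be told apart from the initial 0 and from -1.
 */
int value_seen_by_parent(int extra_flags, int value)
{
    assert(value > 0);

    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    shared_value = 0;

    /* The stack grows down on most architectures, so pass its top.
     * SIGCHLD in the low bits lets the parent reap the child with waitpid(). */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE,
                      extra_flags | SIGCHLD, &value);

    int result = -1;
    if (pid != -1 && waitpid(pid, NULL, 0) == pid)
        result = shared_value;

    free(stack);
    return result;
}
```

Calling value_seen_by_parent(CLONE_VM, 42) should give 42, because the child wrote into the parent's memory.  Calling value_seen_by_parent(0, 42) should give 0, because the child only changed its own copy, which is the behaviour fork() provides.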

vfork() has the same effect as fork(), except that the page table entries of the parent process are not copied.  Instead, the child runs as the sole thread in the parent's address space, and the parent is blocked until the child either calls exec() or exits.  The child is not allowed to write to the address space.  This is implemented with the vfork_done member of task_struct, which is a pointer rather than a plain flag: for a vfork(), do_fork() points it at a specific address, and the parent waits on it instead of returning.  When the child exits or calls exec(), and so gives up the address space, vfork_done is checked and, if it is not NULL, the parent is signaled to resume.
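
From user space, the rules for the child come down to "call exec() or _exit() and nothing else".  A minimal sketch, assuming Linux with glibc; run_with_vfork() is a made-up helper name and the exit code 127 for a failed exec is just the shell's convention, borrowed here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run the program at path, with no extra arguments, in a vfork()ed
 * child.  Returns the child's exit status, or -1 if the child could
 * not be created or did not exit normally.
 */
int run_with_vfork(const char *path)
{
    assert(path != NULL);

    pid_t pid = vfork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: it is borrowing the parent's address space, so it
         * only calls exec() or _exit(). */
        execl(path, path, (char *)NULL);
        /* exec failed.  _exit() rather than exit(), so that the
         * parent's stdio buffers and atexit handlers are not touched. */
        _exit(127);
    }

    /* Parent: execution resumes here only after the child has called
     * exec() or exited. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_with_vfork("/bin/true") should return 0 on a system where that program exists, and a path that cannot be executed should return 127.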

vfork() was introduced as an optimization in 3BSD, at a time when fork() was not implemented with copy-on-write pages.
