The Environment of a UNIX Process

 

-  Memory layout of a C/C++ program

 

·       Text segment

 

·       Initialized data segment

 

·       Uninitialized data segment

     

°          bss  (block started by symbol)

 

·       Stack

     

°          For function calls.

 

·       Heap

 

°          For dynamic memory allocations.

 

·       Command-line arguments

 

argc/argv[]/envp[]

 

·       Environment variables

 

extern char **environ
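
As an illustration of the `environ` declaration above, the following is a minimal sketch of walking the environment list. The helper name `count_environment` is hypothetical and only serves the example.

```c
#include <stdio.h>

extern char **environ;   /* NULL-terminated array of "NAME=value" strings */

/* Print every environment string and return how many were found. */
int count_environment(void)
{
    int count = 0;

    if (environ == NULL)
        return 0;

    for (char **ep = environ; *ep != NULL; ep++) {
        printf("%s\n", *ep);
        count++;
    }
    return count;
}
```

Calling `count_environment()` from `main()` lists the same strings that a third `envp[]` parameter to `main()` would provide on systems that support it.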

 


 

 

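High address
    +---------------------------+
    | Command-line arguments    |   argc/argv[]
    | and environment variables |
    +---------------------------+
    | Stack                     |
    |   (grows downward)        |
    |            |              |
    |            v              |
    |                           |
    |            ^              |
    |            |              |
    |   (grows upward)          |
    | Heap                      |
    +---------------------------+
    | Uninitialized data (bss)  |   Initialized to zero by exec()
    +---------------------------+
    | Initialized data          |   Read from program file by exec()
    +---------------------------+
    | Text                      |
    +---------------------------+
Low address

Typical logical memory layout of a process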

 

The text segment holds the program code and library code; it may be shared with other processes running the same code.
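
To make the segments concrete, here is a small sketch that prints one representative address from each region. The names `initialized_global`, `uninitialized_global` and `print_layout` are made up for the example, and the actual addresses (and even their relative order) vary between systems and runs.

```c
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;   /* initialized data segment */
int uninitialized_global;      /* bss: zero-filled when the program is loaded */

/* Print one address per region; returns 0 on success, -1 if malloc fails. */
int print_layout(void)
{
    int on_stack = 0;                        /* lives in this function's stack frame */
    int *on_heap = malloc(sizeof *on_heap);  /* lives on the heap */

    if (on_heap == NULL)
        return -1;

    /* Converting a function pointer to void * is not strict ISO C,
       but it is supported on POSIX systems. */
    printf("text  (function)     : %p\n", (void *)print_layout);
    printf("data  (initialized)  : %p\n", (void *)&initialized_global);
    printf("bss   (uninitialized): %p\n", (void *)&uninitialized_global);
    printf("heap  (malloc)       : %p\n", (void *)on_heap);
    printf("stack (local)        : %p\n", (void *)&on_stack);

    free(on_heap);
    return 0;
}
```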


UNIX Processes

 

 

-  Process

 

·       A program in execution.

 

·       Each has a unique PID

 

°          A non-negative integer: 0 ~ PID_MAX

 

·       Created by fork()/vfork() system calls

 

·       Some special PIDs

 

0: scheduler

1: init

2: pagedaemon

 

-  The fork() system call

 

·       Only way to create processes

 

Except for 0, 1, 2, …

 


Typical code for fork

 

   pid_t new_pid;

   new_pid = fork();

   switch (new_pid) {

      case -1: /* error, in parent */
         /* ... do error stuff ... */
         break;

      case 0: /* in child */
         /* ... do child stuff ... */
         break;

      default: /* in parent; new_pid is the child's PID */
         /* ... do parent stuff ... */
         break;

   }

 


 

·       Parent/child relationship

     

°       The child is a copy of the parent

 

☞       Inherits the parent's data, heap and stack.

 

°       COW (copy-on-write) in most current implementations

 

☞       Only the page that gets modified is copied, typically in a virtual memory system.

 

°       Often the parent and the child share the text segment

 

☞       If it is read-only.
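
The "child is a copy" rule can be checked with a short sketch: the child changes a variable, and the parent's value stays the same. The function name `parent_value_after_child_write` is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's value of counter after the child has changed
   its own copy, or -1 on error. */
int parent_value_after_child_write(void)
{
    int counter = 10;
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        counter += 5;                    /* changes only the child's copy */
        _exit(counter == 15 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return counter;                      /* unchanged in the parent */
}
```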

 

·       There is no way to know whether the parent or the child will start executing first.

 

·       All file descriptors that are open in the parent are duplicated in the child

 

°          Each duplicated descriptor shares its file offset with the parent's descriptor (files opened after the fork are not shared).
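
The shared file offset can be observed with a sketch like the following, which assumes a writable `/tmp` directory. The child writes first, and the parent's later write lands after the child's data instead of overwriting it. The function name `shared_offset_demo` is hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child writes "child", then parent writes "parent", both through the same
   inherited descriptor. The file contents are copied into out.
   Returns 0 on success, -1 on error. */
int shared_offset_demo(char *out, size_t outsz)
{
    char path[] = "/tmp/fork_offset_XXXXXX";
    int status;
    ssize_t n;
    int fd;

    if (outsz == 0)
        return -1;
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                 /* the file lives on until fd is closed */

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        n = write(fd, "child", 5);
        _exit(n == 5 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    /* The child's write advanced the shared offset, so this write
       continues where the child stopped. */
    if (write(fd, "parent", 6) != 6) {
        close(fd);
        return -1;
    }

    n = pread(fd, out, outsz - 1, 0);   /* read back from the start */
    close(fd);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return 0;
}
```

Here the parent waits for the child before writing, which corresponds to the "parent waits" case for handling descriptors after a fork.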


 

·       Two normal cases for handling the descriptors after a fork()

 

°          Parent waits.


°          Parent and child go their own way.

 

·       fork() may fail if the new process would

°          Exceed the per-user process limit.

°          Exceed the system-wide process limit.

 

·       Two uses (reasons) for fork()

 

°          Parent and child each execute a different section of the same code at the same time.

°          One of the processes (usually the child) executes a different program, typically via exec().

 

-  The vfork() system call

 

·       A BSD variant of fork(), now supported by SVR4.

 

·       Similar to fork(), but intended only for a child that will immediately exec a new program.

·       The child runs in the parent's address space until it calls exec() or exit().

·       The parent's address space is not copied into the child.

·       vfork() guarantees that the child runs first: the parent is suspended until the child calls exec() or exit().

·       Deadlock is possible if the child depends on further actions by the parent before it calls exec() or exit().
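
A minimal sketch of the intended vfork() usage follows: the child does nothing except exec a new program, and falls back to `_exit()` if the exec fails. The function name `run_with_vfork` and the shell command `exit 7` are illustrative choices, and the sketch assumes `/bin/sh` exists.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs "sh -c 'exit 7'" in a vfork()ed child.
   Returns the child's exit status, or -1 on error. */
int run_with_vfork(void)
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* The child borrows the parent's address space, so it only
           calls exec, or _exit() if the exec fails. */
        execl("/bin/sh", "sh", "-c", "exit 7", (char *)NULL);
        _exit(127);
    }

    /* The parent resumes only after the child has exec'ed or exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```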

 

-  Process termination

 

·       Normal termination

 

°          Return from main().

 

°          Calling exit().

 

°          Calling _exit().

 

·       Abnormal termination

     

°          Calling abort().

 

°          Terminated by a signal.


-  The exit() function (C library)

 

·       Performs a standard I/O cleanup

 

°          Executes all registered exit handlers.

 

°          Flushes all C/C++ output buffers.

 

°          Closes all open streams.

 

·       Terminates the calling process.

 

-  The _exit() system call

 

·       Terminates the calling process immediately, without the cleanup that exit() performs (no exit handlers are run and standard I/O buffers are not flushed).
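
The difference between exit() and _exit() can be sketched as follows. A child registers an exit handler that writes one byte to a pipe, and the parent checks whether that byte arrived. The names `exit_handler_ran`, `report_fd` and `on_exit_handler` are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;

/* Exit handler: tells the parent that it was executed. */
static void on_exit_handler(void)
{
    ssize_t n = write(report_fd, "x", 1);
    (void)n;
}

/* Returns 1 if the child's exit handler ran, 0 if it did not, -1 on error. */
int exit_handler_ran(int use_underscore_exit)
{
    int fds[2];
    char c;
    ssize_t n;

    if (pipe(fds) < 0)
        return -1;
    fflush(NULL);                 /* avoid duplicating buffered output in the child */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(on_exit_handler);
        if (use_underscore_exit)
            _exit(0);             /* handlers are skipped */
        exit(0);                  /* handlers run, stdio is flushed */
    }

    close(fds[1]);
    n = read(fds[0], &c, 1);      /* 1 byte if the handler ran, 0 on EOF */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1;
}
```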

 

-  Various wait() system calls

 

·       wait() blocks until any one child terminates (whichever terminates first).

 

·       waitpid()  is used to wait for a specific child to terminate, plus some options.

 

·       wait3()/wait4() will further collect resource usage information.

 

Function   pid   options   rusage   POSIX.1   SVR4   4.4BSD
--------   ---   -------   ------   -------   ----   ------
wait        -       -        -        yes      yes    yes
waitpid    yes     yes       -        yes      yes    yes
wait3       -      yes      yes        -       yes    yes
wait4      yes     yes      yes        -        -     yes

Arguments supported by various wait functions on different systems.

 

·       When a process terminates, the following is reported/returned to its parent:

 

°          Exit status.

°          Some timing statistics (CPU time consumed).

°          Etc.
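
The exit status reported to the parent is decoded with the `WIFEXITED`/`WEXITSTATUS` and `WIFSIGNALED`/`WTERMSIG` macros. The sketch below uses a hypothetical helper, `child_outcome`, whose return convention (negated signal number, -1000 on error) is invented for the example.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits with exit_code, or kills itself with SIGKILL
   when exit_code is negative. Returns the exit status, the negated
   signal number, or -1000 on error. */
int child_outcome(int exit_code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1000;

    if (pid == 0) {
        if (exit_code < 0)
            raise(SIGKILL);       /* abnormal termination by signal */
        _exit(exit_code);         /* normal termination */
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1000;

    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);
    return -1000;
}
```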

 

 

-  Zombie process

 

·       A process that no longer exists, but still ties up a slot in the system process table.

 

·       A process that has terminated, but whose parent exists and has not waited/acknowledged the child's termination.


-  Orphaned process (orphan)

 

·       A process whose parent has exited.

 

·       An orphaned process does not remain a zombie

°          When an orphan terminates, init (its new parent) waits for it, so its slot in the process table is released promptly.

 

·       Orphaned processes are inherited by init.
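
The re-parenting of an orphan can be sketched as follows: an intermediate process forks a grandchild and exits at once, and the grandchild reports its new parent's PID through a pipe. The function name `orphan_adopter` is hypothetical. On a traditional system the reported PID is 1 (init); some modern systems hand orphans to a different "reaper" process, so the sketch does not assume the value.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Returns the PID of the process that adopted the orphan, or -1 on error. */
pid_t orphan_adopter(void)
{
    int fds[2];
    pid_t adopter = -1;
    ssize_t n;

    if (pipe(fds) < 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {                          /* intermediate process */
        pid_t me = getpid();
        close(fds[0]);
        if (fork() == 0) {                     /* grandchild */
            struct timespec nap = { 0, 1000000 };   /* 1 ms */
            /* Wait (at most about 5 s) until the original parent is gone. */
            for (int i = 0; i < 5000 && getppid() == me; i++)
                nanosleep(&nap, NULL);
            adopter = getppid();
            n = write(fds[1], &adopter, sizeof adopter);
            _exit(n == (ssize_t)sizeof adopter ? 0 : 1);
        }
        _exit(0);                              /* orphans the grandchild */
    }

    close(fds[1]);
    waitpid(child, NULL, 0);
    n = read(fds[0], &adopter, sizeof adopter);
    close(fds[0]);
    return n == (ssize_t)sizeof adopter ? adopter : -1;
}
```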

 

-  Race conditions

 

·       Occur when multiple processes are competing for the same system resource(s)

 

°          The final outcome depends on the order in which the processes run.

 

·       Problems due to race conditions are hard to debug

 

°          Programs tend to work “most of the time.”

 

·       Avoiding them requires process synchronization.
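
One simple way to synchronize a parent and child is a pipe used as a "go" signal. In the sketch below the child blocks on the pipe until the parent has finished its step, so the recorded order is always parent first. The function name `ordered_run` and the one-letter log entries are invented for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Parent logs 'P', child logs 'C'; the sync pipe forces the parent to go
   first. The log is copied into order. Returns 0 on success, -1 on error. */
int ordered_run(char *order, size_t ordersz)
{
    int sync_fd[2], log_fd[2];
    ssize_t n;

    if (ordersz < 3 || pipe(sync_fd) < 0 || pipe(log_fd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char token;
        close(sync_fd[1]);
        /* Block here until the parent says "go". */
        if (read(sync_fd[0], &token, 1) != 1)
            _exit(1);
        if (write(log_fd[1], "C", 1) != 1)
            _exit(1);
        _exit(0);
    }

    close(sync_fd[0]);
    if (write(log_fd[1], "P", 1) != 1)      /* parent's step */
        return -1;
    if (write(sync_fd[1], "g", 1) != 1)     /* release the child */
        return -1;
    close(sync_fd[1]);
    waitpid(pid, NULL, 0);

    close(log_fd[1]);
    n = read(log_fd[0], order, ordersz - 1);
    close(log_fd[0]);
    if (n < 0)
        return -1;
    order[n] = '\0';
    return 0;
}
```

Without the read on `sync_fd`, either process could log first, which is exactly the kind of outcome that depends on scheduling order.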


-  Process attributes

 

·       A process has the following IDs

 

°          Process ID.

 

°          Parent Process ID.

 

°          Process group ID.

 

°          Session ID.

 

°          User ID of the process.

 

°          Group ID of the process.

 

°          Effective user ID.

 

°          Effective group ID.

 

·       Some other properties

 

°          Controlling terminal.

 

°          Current working directory.

 

°          Root directory.

 

°          Open file descriptors.

 

°          File mode creation mask.

 

°          Resource limits.

 

°          Process times.
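
Several of these attributes can be queried directly, as the following sketch shows. The function name `print_process_attributes` is hypothetical; the calls themselves are standard POSIX functions.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Prints the IDs, umask and working directory of the calling process.
   Returns 0 on success, -1 if the working directory cannot be read. */
int print_process_attributes(void)
{
    char cwd[4096];
    mode_t mask = umask(0);   /* umask() can only be read by setting it ... */
    umask(mask);              /* ... so restore the old value at once */

    printf("pid=%ld ppid=%ld pgid=%ld sid=%ld\n",
           (long)getpid(), (long)getppid(),
           (long)getpgrp(), (long)getsid(0));
    printf("uid=%ld gid=%ld euid=%ld egid=%ld\n",
           (long)getuid(), (long)getgid(),
           (long)geteuid(), (long)getegid());
    printf("umask=%03o\n", (unsigned)mask);

    if (getcwd(cwd, sizeof cwd) == NULL)
        return -1;
    printf("cwd=%s\n", cwd);
    return 0;
}
```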


-  Two kernel data structures pertinent to a process

 

·       The process table entry and user (u) area

 

°          Containing administrative information for a process.

 

°          One each per process.

 

·       Process table entry

 

°          Keeping information always needed.

 

·       User area

 

°          Keeping information needed when running.

 

-  The context of a process

 

·       User address space.

 

·       Relevant kernel data structures

 

°          Process table entry + u area.

 

·       Contents in hardware registers.

 


Process Status command, ps

 

       Lists status information from process table

 

       Many options

 

 

system()

 

Execute a command with /bin/sh from a program

 

  The calling program waits until the command finishes

 

  system() returns the command's termination status (in the format reported by wait())

 

  Examples:  system1.c, system2.c

 

  Simple but not elegant or efficient
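
The examples system1.c and system2.c are not reproduced here; as a stand-in, this is a minimal sketch of calling system() and decoding its result. The wrapper name `run_shell` is hypothetical.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Runs command through /bin/sh and returns its exit status,
   or -1 if the command could not be run or did not exit normally. */
int run_shell(const char *command)
{
    int status = system(command);   /* blocks until the command finishes */

    if (status == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

For instance, `run_shell("exit 3")` yields 3, because the value returned by system() is a wait-style status that must be decoded with `WEXITSTATUS`.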

 

Example: Process timing with times()
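
A minimal sketch of process timing with times() is shown below. It measures the user plus system CPU time consumed by a busy loop; the function name `cpu_seconds_for_loop` and the loop itself are invented for the example.

```c
#include <sys/times.h>
#include <unistd.h>

/* Returns the CPU seconds (user + system) used by a simple loop,
   or -1.0 on error. */
double cpu_seconds_for_loop(unsigned long iterations)
{
    struct tms before, after;
    volatile unsigned long sink = 0;   /* volatile keeps the loop from being optimized away */
    long ticks_per_sec = sysconf(_SC_CLK_TCK);
    clock_t used;

    if (ticks_per_sec <= 0 || times(&before) == (clock_t)-1)
        return -1.0;

    for (unsigned long i = 0; i < iterations; i++)
        sink += i;

    if (times(&after) == (clock_t)-1)
        return -1.0;

    /* times() reports clock ticks; convert to seconds. */
    used = (after.tms_utime - before.tms_utime) +
           (after.tms_stime - before.tms_stime);
    return (double)used / (double)ticks_per_sec;
}
```
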
-  The exec() system call

 

·       Only way to execute a program

 

In the UNIX system, fork() creates processes and exec() runs programs in them, so the two calls are closely related. Without exec(), no new program can be run; without fork(), no new process can be created. Together they underlie most UNIX system operations.

 

·       Will replace the calling process with a new program and start execution.

 

·       Brand new text, data, heap and stack segments.

 

·       Inherits most of the process attributes of the calling process, such as

 

°          PID and PPID.

 

°          The real UID and GID; the effective UID and GID are also kept unless the new program file has its set-user-ID (SUID) or set-group-ID (SGID) bit set.

 

°          Open files, except those with the close-on-exec flag set, are passed to the new program.

 

°          The file mode creation mask (umask) is passed to the new program.

 

°          Controlling terminal.

 

°          Current working directory

 

°          Root directory.

 

°          File locks.

 

°          Signal mask.

 

°          Pending signals.

 

°          Resource limits

 

°          CPU times.
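
Two of the inherited attributes, the PID and the open file descriptors, can be checked with a sketch like the following. The child redirects its standard output into a pipe and execs a shell that prints its own PID (`$$`); the parent compares that value with the PID returned by fork(). The function name `pid_survives_exec` is hypothetical, and the sketch assumes `/bin/sh` exists.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the exec'ed program reports the PID that fork() returned,
   0 if it reports a different one, -1 on error. */
int pid_survives_exec(void)
{
    int fds[2];
    char buf[32];
    ssize_t n;

    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);   /* stays open across the exec */
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", "echo $$", (char *)NULL);
        _exit(127);                    /* reached only if exec fails */
    }

    close(fds[1]);
    n = read(fds[0], buf, sizeof buf - 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (n <= 0)
        return -1;
    buf[n] = '\0';

    return strtol(buf, NULL, 10) == (long)pid;
}
```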


 

·       exec is the family name for six similar functions that do essentially the same thing and differ only slightly in how they are called:

 

°          execl(), execv(), execle(), execlp(), execvp(), and execve().

 

☞       Only execve() is a system call.

 

°          Meaning of different letters:

 

l:  needs a list of arguments.

 

v: needs an argv[] vector   (l and v are mutually exclusive).

 

e: needs an envp[] array.

 

p: uses the PATH variable to find the executable file.

 

 

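       execlp                  execl                   execle
         |                       |                       |
         | build argv            | build argv            | build argv
         v                       v                       v
       execvp  ------------->  execv  -------------->  execve
               try each PATH           use environ     (system call)
               prefix

                    Relationship of the exec() functions.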
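
The letter conventions can be compared side by side in a sketch that runs the same command through three members of the family. The enum `exec_style`, the function `run_exit5`, the command `exit 5` and the `PATH` value passed to execle() are all illustrative assumptions.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum exec_style { USE_EXECL, USE_EXECVP, USE_EXECLE };

/* Runs "sh -c 'exit 5'" with the chosen exec variant.
   Returns the child's exit status, or -1 on error. */
int run_exit5(enum exec_style style)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        char *argv[] = { "sh", "-c", "exit 5", NULL };
        char *envp[] = { "PATH=/bin:/usr/bin", NULL };

        switch (style) {
        case USE_EXECL:    /* l: arguments as a list, full path required */
            execl("/bin/sh", "sh", "-c", "exit 5", (char *)NULL);
            break;
        case USE_EXECVP:   /* v: argv[] vector, p: search PATH for "sh" */
            execvp("sh", argv);
            break;
        case USE_EXECLE:   /* l: list, e: explicit environment array */
            execle("/bin/sh", "sh", "-c", "exit 5", (char *)NULL, envp);
            break;
        }
        _exit(127);        /* reached only if the exec call failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```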


Examples of different exec calls

 

Example code

 

pexec.c    -  Replaces a process image using the execlp command.

 

fork1.c     -  Duplicates a process image using the fork command.

 

wait.c      -  Waits for a child process to finish.

 

fork2.c    -  fork1.c changed so that a zombie process is created.


Redirection in exec’ed programs

 

Use freopen()

 

Example using filter upper.c: useupper.c

 

With fork(), call freopen() in the child after the fork so that stdin is preserved in the parent.
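
The files upper.c and useupper.c are not shown in these notes, so the sketch below substitutes the standard `tr` utility for the upper-casing filter. It redirects stdin with freopen() in the child only, leaving the parent's stdin untouched. The function name `run_upper_filter` and the exit code 126 for a failed redirection are assumptions made for the example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs "tr a-z A-Z" with its standard input taken from input_path.
   Returns the filter's exit status, or -1 on error. */
int run_upper_filter(const char *input_path)
{
    int status;

    fflush(NULL);                  /* avoid duplicating buffered output */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Redirect stdin in the child only. */
        if (freopen(input_path, "r", stdin) == NULL)
            _exit(126);
        execlp("tr", "tr", "a-z", "A-Z", (char *)NULL);
        _exit(127);                /* reached only if exec fails */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The filter's output appears on the caller's standard output, since stdout is inherited unchanged across both the fork and the exec.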