Process and Process Management
October 29, 2016
Categorised in: Operating System Design
What is a Process?
A program in execution
An instance of a program running on a computer
The entity that can be assigned to and executed on a processor
A unit of activity characterized by the execution of a sequence of instructions, a current state, and an associated set of system resources
The system consists of a set of processes: user processes (executing user code) and OS processes (executing system code).
A program is a passive entity stored on disk; a process is active
- A program becomes a process when its executable file is loaded into memory
A process is comprised of:
- A set of data
- A number of attributes describing the state of the process
While the process is running it has a number of elements, including:
- I/O status information
Process Control Blocks
Contains the process elements
Created and managed by the operating system
Allows support for multiple processes
Process may be in one of two states: (1) Running, (2) Non-Running
Identifier: A unique identifier associated with this process, to distinguish it from all other processes.
State: If the process is currently executing, it is in the running state.
Priority: Priority level relative to other processes.
Program counter: The address of the next instruction in the program to be executed.
Memory pointers: Includes pointers to the program code and data associated with this process, plus any memory blocks shared with other processes.
Context data: These are data that are present in registers in the processor while the process is executing.
I/O status information: Includes outstanding I/O requests, I/O devices (e.g., tape drives) assigned to this process, a list of files in use by the process, and so on.
Accounting information: May include the amount of processor time and clock time used, time limits, and so on.
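The elements listed above can be pictured as fields of a single record. The following is a minimal sketch only: the field names, the register count, the size of the file list, and the pcb_init helper are all assumptions made for illustration, and no real kernel lays out its process control block this way.

```c
#include <stdint.h>
#include <string.h>

#define NUM_REGS  16   /* assumed register count, not tied to any real CPU */
#define MAX_FILES 8    /* assumed size of the per-process file list */

/* Two-state model from the text: running or non-running. */
enum proc_state { PROC_NOT_RUNNING = 0, PROC_RUNNING = 1 };

/* Hypothetical, simplified process control block. */
struct pcb {
    int             pid;                    /* identifier */
    enum proc_state state;                  /* state */
    int             priority;               /* priority relative to others */
    uintptr_t       program_counter;        /* address of next instruction */
    void           *code_base;              /* memory pointer: program code */
    void           *data_base;              /* memory pointer: data */
    uintptr_t       regs[NUM_REGS];         /* context data: saved registers */
    int             open_files[MAX_FILES];  /* I/O status: -1 means unused */
    unsigned long   cpu_ticks;              /* accounting: processor time */
};

/* Initialise a PCB for a newly created process that is not yet running. */
void pcb_init(struct pcb *p, int pid, int priority)
{
    memset(p, 0, sizeof *p);
    p->pid = pid;
    p->priority = priority;
    p->state = PROC_NOT_RUNNING;
    p->code_base = NULL;
    p->data_base = NULL;
    for (int i = 0; i < MAX_FILES; i++)
        p->open_files[i] = -1;
}
```

Calling pcb_init(&p, 42, 3) on a struct pcb p yields a record with identifier 42, priority 3, no files in use, and zero accumulated processor time.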
As a process executes, it changes state
new: The process is being created
running: Instructions are being executed
waiting: The process is waiting for some event to occur
ready: The process is waiting to be assigned to a processor
terminated: The process has finished execution
A Five-State Model
Running: The process that is currently being executed.
Ready: A process that is prepared to execute when given the opportunity.
Blocked/Waiting: A process that cannot execute until some event occurs, such as the completion of an I/O operation.
New: A process that has just been created but has not yet been admitted to the pool of executable processes by the OS. Typically, a new process has not yet been loaded into main memory, although its process control block has been created.
Exit: A process that has been released from the pool of executable processes by the OS, either because it halted or because it aborted for some reason.
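The five states can be sketched as an enumeration together with a check for legal moves between them. The text above does not list the transitions, so the set used here (admit, dispatch, timeout, event wait, event occurs, release) is the conventional textbook set and is an assumption; the names ST_* and can_transition are hypothetical.

```c
#include <stdbool.h>

enum state { ST_NEW, ST_READY, ST_RUNNING, ST_BLOCKED, ST_EXIT };

/* Returns true if the move from one state to another is one of the
 * assumed transitions of the five-state model. */
bool can_transition(enum state from, enum state to)
{
    switch (from) {
    case ST_NEW:
        return to == ST_READY;               /* admit */
    case ST_READY:
        return to == ST_RUNNING;             /* dispatch */
    case ST_RUNNING:
        return to == ST_READY                /* timeout or preemption */
            || to == ST_BLOCKED              /* wait for an event */
            || to == ST_EXIT;                /* release */
    case ST_BLOCKED:
        return to == ST_READY;               /* awaited event occurs */
    case ST_EXIT:
        return false;                        /* final state */
    }
    return false;
}
```

Under these assumptions a blocked process can never go straight to running; it must first become ready and then be dispatched.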
Process States and Transitions
The process is executing in user mode
The process is executing in kernel mode
The process is not executing but is ready to run as soon as kernel schedules it
Process is sleeping and resides in main memory
Process is ready to run but swapper must swap it in main memory before kernel schedules it for execution
Process is sleeping and the swapper has swapped the process to secondary storage to make room for other processes in main memory
Process is returning from kernel mode to user mode, but the kernel preempts (stops or pauses) it and schedules another process
Process is newly created and is in a transition state. It exists, but it is neither ready to run nor sleeping
Process executed the exit system call and no longer exists; this is the final state of the process
Data Structures describing state of process
Process table entry: contains fields accessible to the kernel
U area: contains fields accessible to the running process
The kernel allocates space for the u area only when creating a process
Layout of System Memory
Kernel Process table
- Describes state of every active process
- Contains general fields of processes that must always be accessible to the kernel
- The u area is associated with the executing process; through it the process accesses all of its process-related information and sets up an environment in which it can execute easily
- The u area contains additional information that controls the operation of a process
Process Table Entry
The state field identifies the process state.
The process table entry contains fields that allow the kernel to locate the process and its u area in main memory or in secondary storage (Pointer to u area).
Process size: tells the kernel how much space to allocate for the process
The u area contains the following fields that further characterize the process states.
- Timer fields record the time the process (and its descendants) spent executing in user mode and in kernel mode.
- An error field records errors encountered during a system call.
- A return value field contains the result of system calls.
- I/O parameters describe the amount of data to transfer, the address of the source (or target) data array in user space, and so on.
- The current directory and current root describe the file system environment of the process.
Kernel Region Table
A process has three logical sections: text, data, and stack
The text section contains the set of instructions the machine executes for the process; addresses in the text section include text addresses (for branch instructions or subroutine calls), data addresses (for access to global data variables), or stack addresses (for access to data structures local to a subroutine).
A region is a contiguous area of the virtual address space of a process that can be treated as a distinct object to be shared or protected
Each kernel region table entry contains a pointer to the page table, which holds the physical memory addresses.
Text, data, and stack usually form separate regions of a process.
Several processes can share a region. For instance, several processes may execute the same program, and it is natural that they share one copy of the text region.
Similarly, several processes may cooperate to share a common shared-memory region.
Per Process Region Table (Pregion)
Each process contains a private per process region table, called a pregion for short.
Pregion entries may exist in the process table, the u area, or in a separately allocated area of memory, dependent on the implementation
Each pregion entry points to a region table entry and contains the starting virtual address of the region in the process.
Shared regions may have different virtual addresses in each process.
The pregion entry also contains a permission field that indicates the type of access allowed the process: read-only, read-write, or read-execute.
The figure shows two processes, A and B, with their regions, pregions, and the virtual addresses where the regions are connected.
The processes share text region ‘a’ at virtual addresses 8K and 4K, respectively.
If process A reads memory location 8K and process B reads memory location 4K, they read the identical memory location in region ‘a’.
The data regions and stack regions of the two processes are private.
The concept of the region is independent of the memory management policies implemented by the operating system.
Memory management policy refers to the actions the kernel takes to ensure that processes share main memory fairly.
Pages and Page Tables
In a memory management architecture based on pages, the memory management hardware divides physical memory into a set of equal-sized blocks called pages.
Typical page sizes range from 512 bytes to 4K bytes and are defined by the hardware.
Every addressable location in memory is contained in a page and, consequently, every memory location can be addressed by a (page number, byte offset in page) pair.
When the kernel assigns physical pages of memory to a region, it need not assign the pages contiguously or in a particular order.
The purpose of paged memory is to allow greater flexibility in assigning physical memory, analogous to the assignment of disk blocks to files in a file system.
Just as the kernel assigns blocks to a file to increase flexibility and to reduce the amount of unused space caused by block fragmentation, so it assigns pages of memory to a region.
The kernel correlates the virtual addresses of a region to their physical machine addresses by mapping the logical page numbers in the region to physical page numbers on the machine, as shown in the figure.
Since a region is a contiguous range of virtual addresses in a program, the logical page number is the index into an array of physical page numbers.
The region table entry contains a pointer to a table of physical page numbers called a page table.
Page table entries may also contain machine-dependent information such as permission bits to allow reading or writing of the page.
The kernel stores page tables in memory and accesses them like all other kernel data structures.
The figure shows a sample mapping of a process into physical memory.
Assume that the size of a page is 1K bytes, and suppose the process wants to access virtual memory address 68,432.
The pregion entries show that the virtual address is in the stack region starting at virtual address 64K (65,536 in decimal), assuming the direction of stack growth is towards higher addresses.
Subtracting, address 68,432 is at byte offset 2896 in the region.
Since each page consists of 1K bytes, the address is contained in page 2 (counting from 0) of the region, located at physical address 986K.
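The arithmetic in this example can be sketched as a small translation routine. This is an illustration under stated assumptions: the function name translate and the flat array standing in for the page table are hypothetical, and a real kernel would let the memory management hardware perform the lookup.

```c
#include <stddef.h>

#define PAGE_SIZE 1024UL   /* 1K-byte pages, as in the example above */

/* Translate a virtual address inside a region to a physical address.
 * page_table[i] holds the physical base address of logical page i.
 * Returns 0 on success, -1 if the address lies outside the region. */
int translate(unsigned long vaddr, unsigned long region_start,
              const unsigned long *page_table, size_t npages,
              unsigned long *paddr)
{
    if (vaddr < region_start)
        return -1;

    unsigned long offset = vaddr - region_start;   /* byte offset in region */
    unsigned long page   = offset / PAGE_SIZE;     /* logical page number */

    if (page >= npages)
        return -1;

    *paddr = page_table[page] + offset % PAGE_SIZE;
    return 0;
}
```

With a region starting at 65,536 and virtual address 68,432, the offset is 2,896, the logical page is 2, and the byte offset inside that page is 848. If page 2 is located at 986K, the physical address is 986 × 1024 + 848 = 1,010,512.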
A parent process creates child processes, which in turn create other processes, forming a tree of processes
Generally, a process is identified and managed via a process identifier (pid)
A new process is created in the UNIX operating system by invoking the fork system call
The process that invokes fork is called the parent process, and the newly created process is called the child process.
The syntax for the fork system call is
pid = fork();
Resource sharing options
- Parent and children share all resources
- Children share subset of parent’s resources
- Parent and child share no resources
Execution options
- Parent and children execute concurrently
- Parent waits until children terminate
The kernel does the following sequence of operations for fork.
1. It allocates a slot in the process table for the new process.
2. It assigns a unique ID number to the child process.
3. It makes a logical copy of the context of the parent process. Since certain portions of a process, such as the text region, may be shared between processes, the kernel can sometimes increment a region reference count instead of physically copying the region.
4. It increments file and inode table counters for files associated with the process.
5. It returns the ID number of the child to the parent process, and a 0 value to the child process.
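From the caller's side, the last step is what makes the two processes distinguishable. The sketch below shows one way to branch on the return value; the function name fork_demo is hypothetical, and waitpid is used here only so that the parent can confirm the child ran.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if fork() behaved as described (the child saw 0, the parent
 * saw the child's PID), 0 if not, and -1 if fork() or waitpid() failed. */
int fork_demo(void)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* no child was created */

    if (pid == 0) {
        /* Child: fork() returned 0 here. Leave at once with status 0. */
        _exit(0);
    }

    /* Parent: pid holds the process ID of the child. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return pid > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The child uses _exit rather than exit so that it does not flush any buffered output it inherited from the parent.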
There must be some way that a process can indicate completion.
This indication may be:
- A HALT instruction generating an interrupt alert to the OS.
- A user action (e.g. log off, quitting an application)
- A fault or error
- Parent process terminating
A process executes its last statement and then asks the operating system to delete it using the exit() system call.
- Returns status data from child to parent (via wait())
- The process's resources are deallocated by the operating system
Awaiting Process Termination
A process can synchronize its execution with the termination of a child process by executing the wait system call.
The syntax for the system call is
pid = wait(stat_addr);
where pid is the process ID of the terminated child, and stat_addr is the address in user space of an integer that will contain the exit status code of the child.
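As a rough illustration of this call, the sketch below creates a child that exits with a chosen code and has the parent collect that code through wait. The function name child_exit_status is hypothetical; the WIFEXITED and WEXITSTATUS macros are the standard POSIX way to decode the status integer.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`, wait for it, and return the exit
 * status the parent observed (or -1 on failure). */
int child_exit_status(int code)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);               /* child terminates immediately */

    int status;
    pid_t done;

    /* wait() returns the PID of whichever child terminated; keep waiting
     * until it is the child created above. */
    while ((done = wait(&status)) != pid) {
        if (done < 0)
            return -1;
    }

    if (!WIFEXITED(status))
        return -1;                 /* child did not exit normally */
    return WEXITSTATUS(status);
}
```

Only the low eight bits of the exit code reach the parent, so codes in the range 0 to 255 round-trip unchanged.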
Context of a Process
The context of a process consists of the contents of its (user) address space and the contents of hardware registers and kernel data structures that relate to the process.
Formally, the context of a process is the union of its user-level context, register context, and system-level context.
The user-level context consists of the process text, data, user stack, and shared memory that occupy the virtual address space of the process.
The register context consists of the following components:
- The program counter specifies the address of the next instruction the CPU will execute
- The processor status register (PS) specifies the hardware status of the machine as it relates to the process
- The stack pointer contains the current address of the next entry in the kernel or user stack
- The general-purpose registers contain data generated by the process during its execution
The system-level context of a process has a “static part” and a “dynamic part”
A process has one static part of the system-level context throughout its lifetime, but it can have a variable number of dynamic parts
The dynamic part of the system-level context of a process consists of a set of layers, visualized as a last-in-first-out stack
The figure describes the components that form the context of a process.
The left side of the figure shows the static portion of the context.
It consists of the user level context, containing the process text (instructions), data, stack, and shared memory (if the process has any), and the static part of the system-level context, containing the process table entry, the u area, and the pregion entries (the virtual address mapping information for the user-level context).
The right side of the figure shows the dynamic portion of the context
It consists of several stack frames, where each frame contains the saved register context of the previous layer, and the kernel stack as the kernel executes in that layer.
System context layer 0 is a dummy layer that represents the user-level context
Saving the Context of a Process
The kernel saves the context of a process whenever it pushes a new system context layer.
In particular, this happens when the system receives an interrupt, when a process executes a system call, or when the kernel does a context switch.
Interrupts and Exceptions
The system is responsible for handling interrupts, whether they result from hardware (such as from the clock or from peripheral devices), from a programmed interrupt (execution of instructions designed to cause “software interrupts”), or from exceptions (such as page faults).
The kernel handles the interrupt with the following sequence of operations:
- It saves the current register context of the executing process and creates a new context layer.
- It determines the “source” or cause of the interrupt, identifying the type of interrupt (such as clock or disk) and the unit number of the interrupt, if applicable (such as which disk drive caused the interrupt).
- When the system receives an interrupt, it gets a number from the machine, commonly called an interrupt vector.
- The contents of interrupt vectors vary from machine to machine, but they usually contain the address of the interrupt handler for the corresponding interrupt source and a way of finding a parameter for the interrupt handler.
- The interrupt handler completes its work and returns.
System Call Interface
- The library functions typically invoke an instruction that changes the process execution mode to kernel mode and causes the kernel to start executing code for system calls
- Popular system calls are open, read, write, close…
Categories of system call
- Process Control: load, execute…
- File management: create file, delete file, open, close…
- Device Management: request device…
- Information Maintenance: get/set time or date, get/set system data…
- Communication: create, delete communication connection, send, receive messages…
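The file management calls named above (open, read, write, close) can be combined into a short round trip. This is a sketch only: the function name write_then_read and the 128-byte buffer are assumptions, and unlink is used at the end simply to remove the scratch file.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write msg to the file at path, read it back, and return the number of
 * bytes read (or -1 on any error). The file is removed afterwards. */
long write_then_read(const char *path, const char *msg)
{
    char buf[128];
    size_t len = strlen(msg);

    if (len > sizeof buf)
        return -1;

    /* Each call below traps into the kernel through its library wrapper. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    ssize_t written = write(fd, msg, len);
    close(fd);
    if (written != (ssize_t)len) {
        unlink(path);
        return -1;
    }

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        unlink(path);
        return -1;
    }
    ssize_t got = read(fd, buf, sizeof buf);
    close(fd);
    unlink(path);

    if (got != (ssize_t)len || memcmp(buf, msg, len) != 0)
        return -1;
    return (long)got;
}
```

For example, write_then_read("/tmp/syscall_demo.txt", "hello") returns 5 when the path is writable.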
- A context switch is the process of storing and restoring the state (more specifically, the context) of a process so that execution can be resumed from the same point at a later time.
- The kernel permits a context switch under four circumstances: when a process puts itself to sleep, when it exits, when it returns from a system call to user mode but is not the most eligible process to run, or when it returns to user mode after the kernel completes handling an interrupt.
Steps for Context Switch
- Decide whether to do a context switch, and whether a context switch is permissible now
- Save the context of the old process
- Find the best process to schedule for execution using the process scheduling algorithm
- Restore its context
Manipulation of the Process Address Space
A process object can be static or dynamic. If the object is static, its requirements do not change during its execution
If the object is dynamic, the process's requirements change during its execution
The range of memory allocated to the process is its virtual address space
Manipulation of the address space can be done using different system calls provided by the system
Manipulation of process address space is possible by different operations on regions
The operations on regions are as follows:
- Locking and unlocking a region: the kernel can lock and allocate a region and later unlock it. Similarly, if it wants to manipulate an allocated region, it can lock the region to prevent access by other processes and later unlock it
- Allocating a region: the kernel allocates a new region during fork
- Attaching a region to a process: the kernel attaches a region during the fork system call to connect it to the address space of a process
- Changing the size of a region: a process may expand or contract its virtual address space with the sbrk system call (a memory management system call)
- Loading a region: the kernel loads the contents of a file into the region
- Freeing a region: when a region is no longer attached to any process, the kernel can free the region and return it to the list of free regions
- Detaching a region from a process: the kernel detaches regions in the exit and shmdt (detach shared memory) system calls
- Duplicating a region: the fork system call requires that the kernel duplicate the regions of a process. If a region is shared (shared text or shared memory), however, the kernel need not physically copy the region; instead, it increments the region reference count, allowing the parent and child processes to share the region. If the region is not shared, the kernel must physically copy the region
Two kernel algorithms move a process into and out of the sleep state: sleep, which changes the process state from “kernel running” to “asleep in memory,” and wakeup, which changes the process state from “asleep” to “ready to run” (in memory or swapped).
When a process goes to sleep, it typically does so during execution of a system call: the process enters the kernel (context layer 1) when it executes an operating system trap and goes to sleep awaiting a resource.
When the process goes to sleep, it does a context switch, pushing its current context layer and executing in kernel context layer 2.
Sleep System Call
Sleep is a system call in UNIX that suspends program execution for a specified period of time
The corresponding sleep command can be used from the shell as follows:
sleep 5s   # sleeps for 5 seconds
sleep 3m   # sleeps for 3 minutes
Processes are said to sleep on an event, meaning that they are in the sleep state until the event occurs, at which time they wake up and enter a “ready-to-run” state (in memory or swapped out).
The abstraction of the event does not distinguish how many processes are awaiting the event, nor does the implementation.
Example events: “waiting for a buffer to become free”, “awaiting I/O completion”, “waiting for an inode”, “waiting for terminal input”
Invoking Other Programs
Invoking other programs means starting other programs or processes from an already running process
A process may call other programs in order to complete its work, and it may invoke them using different methods: open, exec
Invoking the Program with open
Open invokes the program with its I/O connected to a file descriptor; it is the same call that is used to open a file
Open runs the program with its standard input or output connected to a file descriptor
Invoking the Program with exec
Used to run an executable file from an already running process
When the new program starts, it replaces the program running in the existing process; the fork system call is used to create a new process
The action of replacing the running program with the new program is called an overlay
Exec is available in many programming languages, both as a command and as a function
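A common way to combine the two calls is to fork first and let the child overlay itself with the new program, so that the original process survives. The sketch below assumes this pattern; the function name run_program is hypothetical, execvp is one member of the exec family, and the exit code 127 for a failed exec follows shell convention rather than anything stated in the text.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program named by argv[0] in a child process and return its exit
 * status, or -1 if it could not be started or did not exit normally. */
int run_program(char *const argv[])
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: overlay this process image with the new program. */
        execvp(argv[0], argv);
        _exit(127);                /* reached only if the exec failed */
    }

    /* Parent: still running the original program; wait for the child. */
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For instance, passing the argument vector { "sh", "-c", "exit 3", NULL } makes the child become a shell that exits with status 3, which run_program then returns to the caller.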
Signals are one of the important concepts in operating systems
They are required for the smooth functioning of the operating system
A signal is a software-generated interrupt sent to a process
Signals inform processes of the occurrence of asynchronous events
Asynchronous events are typically due to external events at the interaction layer between the hardware and the operating system
The signal, itself, is the way for the operating system to communicate these events to the processes
Signals are a kind of inter-process communication
Processes may send each other signals with the kill system call, or the kernel may send signals internally
There are 19 signals in the System V (Release 2) UNIX system. They include:
- Signals having to do with the termination of a process, sent when a process exits
- Signals having to do with process-induced exceptions, such as when a process accesses an address outside its virtual address space or attempts to write memory that is read-only
- Signals having to do with unrecoverable conditions during a system call, such as running out of system resources
- Signals caused by an unexpected error condition during a system call
A signal is an event notification that can be sent to a process at any time:
- An event generates a signal
- The OS stops the process immediately
- The signal handler executes and completes
- The process resumes where it left off
Signal names start with the SIG prefix
E.g. SIGSTOP, whose default action is to stop the process
The kernel handles signals in the context of the process that receives them
There are three cases for handling signals:
- the process exits on receipt of the signal,
- it ignores the signal, or
- it executes a particular (user) function on receipt of the signal.
The default action is to call exit in kernel mode, but a process can specify special action to take on receipt of certain signals with the signal system call.
User ID of a Process
The kernel associates two user IDs with a process, independent of the process ID: the real user ID and the effective user ID, or setuid (set user ID).
The real user ID identifies the user who is responsible for the running process.
The effective user ID is used to assign ownership of newly created files, to check file access permissions, and to check permission to send signals to processes via the kill system call.
The kernel allows a process to change its effective user ID when it executes a setuid program.
A setuid program is an executable file that has the setuid bit set in its permission mode field.
The syntax for the setuid system call is
setuid(uid);
where uid is the new user ID, and its result depends on the current value of the effective user ID
Changing the Size of a Process
A process may increase or decrease the size of its data region by using the brk system call
The syntax for the brk system call is
brk(endds);
where endds becomes the value of the highest virtual address of the data region of the process (called its break value).
If the data space of the process increases as a result of the call, this newly allocated data space is virtually contiguous to the old data space; that is, the virtual address space of the process extends continuously into the newly allocated data space.
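The contiguity described above can be observed by reading the break value before and after a change. This sketch uses sbrk, the relative form of the same facility, instead of brk; the function name grow_data_region is hypothetical. It assumes a platform where sbrk is still available (it is marked legacy or deprecated on some systems) and that nothing else moves the break between the calls.

```c
#include <stdint.h>
#include <unistd.h>

/* Grow the data region by nbytes, measure how far the break moved, then
 * shrink it back. Returns the measured growth, or -1 on failure. */
long grow_data_region(long nbytes)
{
    char *old_brk = sbrk(0);                 /* current break value */

    if (old_brk == (char *)-1)
        return -1;
    if (sbrk((intptr_t)nbytes) == (void *)-1)
        return -1;                           /* request was refused */

    char *new_brk = sbrk(0);                 /* break after growing */
    long grown = (long)(new_brk - old_brk);

    sbrk(-(intptr_t)nbytes);                 /* give the space back */
    return grown;
}
```

Because the new space begins exactly at the old break value, the difference between the two break values equals the number of bytes requested.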
Pratik Kataria is a budding programmer, web designer and developer.