The Linux Programming Interface

These are my personal notes on the book The Linux Programming Interface by Michael Kerrisk (highly recommended).

Chapter 1: History and Standards

What is UNIX? It is a family of operating systems. The term Linux refers to UNIX-like operating systems which use Linus Torvalds' kernel. BSD and System V are other variants of UNIX (or just see this diagram to get the whole picture).

Some important milestones in the development of UNIX (or, what made UNIX unique and popular?):

  • It was one of the first operating systems to be written in a high-level language, and that language was none other than C.
  • It was widely distributed to universities with documentation and source code.
  • Richard Stallman set to work on a free UNIX implementation. The GNU project has yet to produce a complete UNIX kernel, but it did produce GCC, the GNU Compiler Collection.
  • Linus Torvalds started his own project to create a UNIX kernel, and soon other programmers joined.

Despite the common ancestry, it became increasingly difficult to move programs from one UNIX implementation to another. Standardization efforts started in the late eighties. Acronyms to keep in mind: POSIX, SUS, LSB. More details:

  • C programming language: ANSI C was approved in 1989, revised in 1999 by ISO (C99).
  • POSIX (Portable Operating System Interface) is a group of standards developed under the auspices of the IEEE. It is based on UNIX but not UNIX specific.
  • FIPS is a US government standard that builds on POSIX.1. Most computer vendors conformed to FIPS since the government is a major purchaser of computers.
  • X/Open Company, a group of computer vendors, created SUS, the Single UNIX Specification, in the mid-1990s.
  • 2001: POSIX 1003.1-2001 (aka SUSv3) consolidates and replaces previous standards.
  • Two levels of conformance: POSIX conformance and XSI conformance (POSIX + more interfaces and behaviors).
  • 2008: POSIX.1-2008 (aka SUSv4).
  • No Linux distribution is branded UNIX by The Open Group: conformance testing would be too expensive. There is de facto conformance.
  • The Linux Standard Base (LSB) is an effort to ensure binary compatibility among Linux distributions. (POSIX promotes source code compatibility) (hm, so that's where the lsb_release command comes from…)

Chapter 2: Fundamental Concepts

The Kernel

The kernel is a central piece of software that allocates and manages computer resources. (The Linux kernel is typically located at /boot/vmlinuz).

What does a kernel do?

  • It schedules processes. This way, a single CPU can be shared among many processes.
  • It manages memory. RAM must also be shared among processes.
  • It provides a file system.
  • It creates and terminates processes.
  • It manages access to devices.
  • It provides networking.
  • It provides a system API.
  • It may provide a multi-user environment, in which each user gets access to a share of disk storage and CPU.

It would be possible to execute programs without a kernel, but there would be no concept of multiple processes residing in a computer and devices wouldn't be abstracted away.

Modern processor architectures allow the CPU to operate in user mode and kernel mode. In user mode, the kernel memory area is protected and some critical operations are unavailable.

The Shell

The shell is a program that interprets commands. In UNIX, the shell is a user process, not a part of the kernel. Some important shell implementations: Bourne shell (sh), C shell (csh), Korn shell (ksh), Bourne again shell (bash).

Users and groups

A UNIX user has:

  • a unique login name.
  • a user ID (UID).

The file /etc/passwd also stores:

  • the group ID of the first group to which the user belongs.
  • the home directory in which the user is placed after logging in.
  • the login shell, which executes the user commands (this can be changed with chsh).

Users are organized into groups mainly to control access to files. Users may belong to many groups. Groups are defined in /etc/group, which stores:

  • a unique group name.
  • the group ID (GID).
  • a comma-separated list of the names of the users that belong to the group.

The superuser is a special user with UID 0 and username root (usually). The superuser bypasses all permission checks in the system. (That's why it is so dangerous to keep executing commands as root! root access via ssh should be disabled since it is a user with a known username and all privileges).

UNIX keeps a single directory hierarchy with / at the root (compare with Windows, in which each drive has a hierarchy).

Some file types are: regular or plain files (just data), devices, pipes, sockets, directories and symbolic links.

Directories are just special files that map filenames to references to files. A filename+reference association is called a link. (Consequence: moving a file within a file system is fast even for huge files; it's just a matter of creating a link in the new directory and removing the old one, the data doesn't have to physically move.)

Each directory keeps a reference to itself (.) and to the parent directory (..).

Symbolic links are specially marked files that contain the name of yet another file. System calls dereference symbolic links automatically. Sometimes, symbolic links are called soft links, while regular links defined in directories are called hard links.
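
A minimal sketch of both kinds of link, assuming a regular file named data.txt already exists in the current directory (the filenames are just illustrative):

    #include <stdio.h>
    #include <unistd.h>
    #include <limits.h>

    int main(void)
    {
        /* Hard link: a second directory entry referring to the same file data. */
        if (link("data.txt", "data-hard.txt") == -1)
            perror("link");

        /* Symbolic link: a small file that contains the name "data.txt". */
        if (symlink("data.txt", "data-soft.txt") == -1)
            perror("symlink");

        /* readlink() returns the contents of the symbolic link itself,
           i.e. the target name, not the target's data. */
        char buf[PATH_MAX];
        ssize_t n = readlink("data-soft.txt", buf, sizeof(buf) - 1);
        if (n != -1) {
            buf[n] = '\0';
            printf("data-soft.txt -> %s\n", buf);
        }
        return 0;
    }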

Each process has a current working directory.

Each file has a UID and GID that define the owner of the file and the group to which it belongs. This is used to define access permissions. For this purpose, UNIX divides users into three categories: user, group and others. Each category is given three permission bits: read, write and execute. The meaning of these bits for files and directories is as follows:

              Files                       Directories
  Read        Contents may be read.       Filenames may be listed.
  Write       Contents may be modified.   Filenames may be added, removed and changed.
  Execute     File may be executed.       Files may be accessed, subject to the permissions of the files themselves.
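
A small sketch that inspects the nine permission bits of /etc/passwd (an arbitrary example file) via stat() and prints them in the familiar rwxrwxrwx form:

    #include <stdio.h>
    #include <sys/stat.h>

    int main(void)
    {
        struct stat sb;
        if (stat("/etc/passwd", &sb) == -1) {
            perror("stat");
            return 1;
        }

        /* Test each of the nine permission bits: user, group, others. */
        printf("%c%c%c%c%c%c%c%c%c\n",
               (sb.st_mode & S_IRUSR) ? 'r' : '-',
               (sb.st_mode & S_IWUSR) ? 'w' : '-',
               (sb.st_mode & S_IXUSR) ? 'x' : '-',
               (sb.st_mode & S_IRGRP) ? 'r' : '-',
               (sb.st_mode & S_IWGRP) ? 'w' : '-',
               (sb.st_mode & S_IXGRP) ? 'x' : '-',
               (sb.st_mode & S_IROTH) ? 'r' : '-',
               (sb.st_mode & S_IWOTH) ? 'w' : '-',
               (sb.st_mode & S_IXOTH) ? 'x' : '-');
        return 0;
    }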

File I/O Model

In UNIX, everything is a file and the same system calls (open, read, write and close) are used to perform I/O in all files.

Each open file has a non-negative integer called a file descriptor. Processes usually inherit three file descriptors from their parents: standard input (0), standard output (1), standard error (2).

Programmers usually don't call open, read, write and close directly, but instead use library calls such as fopen, fclose, scanf, printf, etc.
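
As an illustration, a minimal cat-like sketch that uses the four universal I/O calls directly instead of the stdio library:

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s file\n", argv[0]);
            exit(EXIT_FAILURE);
        }

        int fd = open(argv[1], O_RDONLY);       /* obtain a file descriptor */
        if (fd == -1) {
            perror("open");
            exit(EXIT_FAILURE);
        }

        char buf[4096];
        ssize_t n;
        while ((n = read(fd, buf, sizeof(buf))) > 0)
            if (write(STDOUT_FILENO, buf, n) != n) {   /* fd 1: standard output */
                perror("write");
                exit(EXIT_FAILURE);
            }
        if (n == -1)
            perror("read");

        close(fd);
        return 0;
    }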

Programs

Programs exist as source code or compiled binaries.

Processes

A process is an instance of an executing program. For a program to run, the kernel must load its code into virtual memory, allocate space for variables and set up bookkeeping structures.

Process memory is divided into four segments: text (instructions of the program), data (static variables), heap (dynamic memory) and stack (function calls).

A process can be created using the fork() system call. The parent and child processes share the text segment, while the data, heap and stack segments are copied. The child process can either go on executing its own functions, or call the execve() system call to load an entirely new program, which destroys the existing text, data, heap and stack segments.
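
A minimal sketch of this lifecycle, assuming /bin/ls exists: the child replaces itself with ls via execve(), while the parent waits and inspects the termination status (discussed just below):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();          /* one process becomes two */

        if (pid == -1) {
            perror("fork");
            exit(EXIT_FAILURE);
        }

        if (pid == 0) {
            /* Child: replace text, data, heap and stack with a new program. */
            char *argv[] = { "ls", "-l", NULL };
            char *envp[] = { NULL };
            execve("/bin/ls", argv, envp);
            perror("execve");        /* only reached if execve() fails */
            _exit(127);
        }

        /* Parent: wait for the child and inspect its termination status. */
        int status;
        if (waitpid(pid, &status, 0) == -1) {
            perror("waitpid");
            exit(EXIT_FAILURE);
        }
        if (WIFEXITED(status))
            printf("child %ld exited with status %d\n",
                   (long) pid, WEXITSTATUS(status));
        return 0;
    }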

Processes have a process identifier (PID) and a parent process identifier (PPID).

Processes terminate either by using the _exit() system call (or the exit() library function) or by being killed by the delivery of a signal. A terminating process yields a termination status, a small nonnegative integer that can be inspected by the parent. By convention, a termination status of 0 indicates success.

Processes have a real user ID and real group ID, indicating the owner of the process. These are inherited from the parent process. In addition, each process has an effective user ID and effective group ID, which are used to determine permissions (sidenote: sudo works by changing these effective IDs). Finally, supplementary group IDs indicate additional groups to which a process belongs.
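
A tiny sketch that prints the real and effective user and group IDs of the calling process; run normally they match, but under a set-user-ID program they would differ:

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Real IDs identify who owns the process; effective IDs are the
           ones checked when the process tries to access files. */
        printf("real uid: %ld  effective uid: %ld\n",
               (long) getuid(), (long) geteuid());
        printf("real gid: %ld  effective gid: %ld\n",
               (long) getgid(), (long) getegid());
        return 0;
    }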

Privileged processes have effective user ID equal to 0. Since kernel 2.2, superuser privileges are divided into different capabilities. A process can be granted a subset of capabilities.

The special init process is created on boot. It has PID 1 and it is the parent of all processes on the system.

Daemons are long-lived processes that run in the background.

Each process has an environment list (a set of name-value pairs), which is inherited on fork(). It can also be replaced via exec(). The environment can be read from the external variable char** environ (depending on the platform, it can also be received as the third parameter of main).
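
A sketch that walks the environ list and looks up a single variable with getenv() (HOME is just a commonly present example):

    #include <stdio.h>
    #include <stdlib.h>

    extern char **environ;   /* NULL-terminated array of "NAME=value" strings */

    int main(void)
    {
        for (char **ep = environ; *ep != NULL; ep++)
            printf("%s\n", *ep);

        const char *home = getenv("HOME");   /* look up a single variable */
        if (home != NULL)
            printf("HOME is %s\n", home);
        return 0;
    }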

Processes are also given limits on the resources they can use (open files, memory, CPU). These limits are set by calling setrlimit() and are inherited across fork(). In the shell, resource limits are set using ulimit.
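
A sketch that reads the limit on open file descriptors with getrlimit() and then lowers the soft limit with setrlimit() (the value 64 is arbitrary):

    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_NOFILE, &rl) == -1) {
            perror("getrlimit");
            return 1;
        }
        printf("open files: soft=%ld hard=%ld\n",
               (long) rl.rlim_cur, (long) rl.rlim_max);

        /* An unprivileged process may lower its soft limit (and raise it
           again up to the hard limit), but not exceed the hard limit. */
        rl.rlim_cur = 64;
        if (setrlimit(RLIMIT_NOFILE, &rl) == -1)
            perror("setrlimit");
        return 0;
    }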

Memory Mappings

A file can be mapped to a process' virtual memory.

Mappings can be shared with mappings in other processes if they map the same file or if one process inherits the mapping from its parent.

Mappings can be private (changes are not visible to other processes and are not carried through to the underlying file) or shared (they are).

Mappings are used to initialize a process's text segment, for memory-mapped file I/O and for interprocess communication.
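
A minimal sketch of memory-mapped file I/O, using /etc/hostname as an arbitrary small example file: its bytes are written to standard output through the mapping rather than through read():

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/etc/hostname", O_RDONLY);
        if (fd == -1) {
            perror("open");
            exit(EXIT_FAILURE);
        }

        struct stat sb;
        if (fstat(fd, &sb) == -1) {     /* need the size to map the whole file */
            perror("fstat");
            exit(EXIT_FAILURE);
        }

        /* MAP_PRIVATE: changes (none are made here) would not reach the file. */
        char *addr = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (addr == MAP_FAILED) {
            perror("mmap");
            exit(EXIT_FAILURE);
        }

        write(STDOUT_FILENO, addr, sb.st_size);   /* the file, via memory */

        munmap(addr, sb.st_size);
        close(fd);
        return 0;
    }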

Static and Shared Libraries

Static libraries are copied into each binary at link time, so every program carries (and loads) its own copy. With shared libraries, a single copy is shared by all programs that use it.

Interprocess Communication and Synchronization

Many mechanisms: signals, pipes, sockets, file locking (file regions can be locked to prevent other processes from reading or writing), message queues, semaphores, shared memory.

Signals

Signals inform a process that an event has occurred.

Signals are sent by the kernel, by another process or the process itself.

The kill command and kill() functions are used to send signals.

A process can ignore a signal, be killed by the signal or be suspended until it receives a special signal. It can also establish a signal handler.

A signal in the process's signal mask gets blocked: its delivery is deferred until it is unblocked.
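
A sketch that establishes a handler for SIGINT with sigaction(); pressing Ctrl-C then runs the handler instead of killing the process:

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static volatile sig_atomic_t got_sigint = 0;

    /* Signal handlers should only do async-signal-safe work; here we
       just set a flag that the main loop checks. */
    static void handler(int sig)
    {
        (void) sig;
        got_sigint = 1;
    }

    int main(void)
    {
        struct sigaction sa;
        sa.sa_handler = handler;
        sigemptyset(&sa.sa_mask);    /* no extra signals blocked in the handler */
        sa.sa_flags = 0;
        if (sigaction(SIGINT, &sa, NULL) == -1) {
            perror("sigaction");
            return 1;
        }

        printf("waiting for Ctrl-C (SIGINT)...\n");
        while (!got_sigint)
            pause();                 /* sleep until a signal is delivered */

        printf("caught SIGINT, exiting cleanly\n");
        return 0;
    }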

Threads

A process can have multiple threads of execution, all of which share the same data segment and heap, but each thread has its own stack.

Threads can communicate via global variables and synchronize using condition variables and mutexes.
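
A sketch of two threads incrementing a shared counter, serialized with a mutex (compile with -pthread):

    #include <pthread.h>
    #include <stdio.h>

    static int counter = 0;                      /* shared: lives in the data segment */
    static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        (void) arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&mtx);            /* protect the shared counter */
            counter++;
            pthread_mutex_unlock(&mtx);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %d\n", counter);       /* 200000 thanks to the mutex */
        return 0;
    }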

Process Groups and Shell Job Control

Sessions, Controlling Terminals, Controlling Processes

Pseudoterminals

Date and Time

Client-Server Architecture

Realtime

The /proc File System

Chapter 3: System Programming Concepts

Chapter 4: File I/O: the Universal I/O Model

Everything in UNIX is a file.

It is possible to create file holes by seeking past the end of a file and then writing bytes. The gap is called a file hole; conceptually it is filled with null bytes, even though it doesn't take up any disk space.
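
A sketch that creates a file hole (the filename and offset are arbitrary): after seeking 1 GiB past the start of an empty file and writing one byte, ls -l reports a large size while du shows that almost no blocks are allocated:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("holey.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd == -1) {
            perror("open");
            return 1;
        }

        /* Seek 1 GiB past the start of the (empty) file... */
        if (lseek(fd, 1024 * 1024 * 1024, SEEK_SET) == -1) {
            perror("lseek");
            return 1;
        }

        /* ...and write a single byte. The gap in between is a hole:
           it reads back as zeros but occupies no disk blocks. */
        if (write(fd, "x", 1) != 1)
            perror("write");

        close(fd);
        return 0;
    }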