Chapter 2. System Startup (202)

This topic has a total weight of 8 points and contains the following 2 objectives:

Objective 202.1 Customizing system startup and boot processes (4 points)

This topic includes being able to edit the appropriate system startup scripts to customize standard system run levels and boot processes, interacting with run levels. Some knowledge about the LSB (the Linux Standard Base Specification) is required. For historical reasons a section about creating custom initrd images is included.

Objective 202.2 System recovery (4 points)

This topic includes being able to properly manipulate a Linux system during both the boot process and during recovery mode. This objective includes using both the init utility and init-related kernel options.

Customizing system startup and boot processes (202.1)

This topic includes being able to edit the appropriate system startup scripts to customize standard system run levels and boot processes, interacting with run levels. Some knowledge about the LSB (the Linux Standard Base Specification) is required. For historical reasons a section about creating custom initrd images is included.

Key files, terms and utilities include:

chkconfig
update-rc.d
/etc/init.d/
/etc/inittab
/etc/rc.d
mkinitrd Described in the section called “The initial ram disk (initrd)”

The Linux Boot process

The Linux boot process can be logically divided into seven parts. They are as follows:

  1. Kernel loader loading, setup and execution

  2. Register setup

  3. Kernel decompression

  4. Kernel and memory initialization

  5. Kernel setup

  6. Enabling of remaining CPUs

  7. Init process creation

The boot process is described in detail in Gustavo Duarte's article "The Kernel Boot Process".

What's important to understand is that the kernel's init_post() function is the final step in the boot process. It tries to execute the first user-mode process in the following order:

  1. /sbin/init

  2. /etc/init

  3. /bin/init

  4. /bin/sh

If none of these succeed, the kernel will panic.
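
The fallback order above can be mimicked from user space. The following is only a sketch: the function name first_init_candidate and its optional root-prefix argument are assumptions made for illustration (the prefix makes it possible to inspect a root filesystem mounted elsewhere, for example from a rescue system).

```shell
#!/bin/sh
# Print the first init candidate that exists and is executable,
# checking the same paths, in the same order, as the kernel does.
# $1 (optional, hypothetical): prefix of a mounted root filesystem.
first_init_candidate() {
    root="${1:-}"
    for candidate in /sbin/init /etc/init /bin/init /bin/sh; do
        if [ -x "${root}${candidate}" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no init candidate found: the kernel would panic" >&2
    return 1
}
```

Calling first_init_candidate without arguments checks the running system; calling it with a directory such as a mount point checks that tree instead.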

What happens next? What does /sbin/init do?

init is the parent of all processes. It reads the file /etc/inittab and creates processes based on its contents. One of the things it usually does is spawn gettys so that users can log in. It also defines so-called runlevels.

A runlevel is a software configuration of the system which allows only a selected group of processes to exist.

init can be in one of eight runlevels: 0-6 and S (or s). Runlevels 7-9 and the on-demand runlevels A, B and C are also recognized:

runlevel 0 (reserved)

Runlevel 0 is used to halt the system.

runlevel 1 (reserved)

Runlevel 1 is used to bring the system into single-user mode.

runlevel 2-5

Runlevels 2,3,4 and 5 are multi-user runlevels.

runlevel 6

Runlevel 6 is used to reboot the system.

runlevel 7-9

Runlevels 7, 8 and 9 are also valid. Most of the Unix variants don't use these runlevels. On a particular Debian Linux System for instance, the /etc/rc<runlevel>.d directories, which we'll discuss later, are not implemented for these runlevels, but they could be.

runlevel s or S

Runlevels s and S are internally the same runlevel S, which brings the system into single-user mode. The scripts in the /etc/rcS.d directory are executed when booting the system. Although runlevel S is not meant to be activated by the user, it can be.

runlevels A, B and C

Runlevels A, B and C are so-called on-demand runlevels. If the current runlevel is 2 for instance, and an init A command is executed, the things to do for runlevel A are done but the actual runlevel remains 2.
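
As a quick reference, the runlevel overview can be condensed into a small lookup function. This is a sketch only: describe_runlevel is a hypothetical helper, and the descriptions are shortened from the text above. On a SysV init system, the runlevel command prints the previous and current runlevel, and its second field could be passed to such a helper.

```shell
#!/bin/sh
# Map a runlevel name to a short description, following the list above.
describe_runlevel() {
    case "$1" in
        0)            echo "halt" ;;
        1)            echo "single-user mode" ;;
        [2-5])        echo "multi-user" ;;
        6)            echo "reboot" ;;
        [7-9])        echo "valid but normally unused" ;;
        s|S)          echo "single-user mode (runlevel S)" ;;
        a|b|c|A|B|C)  echo "on-demand" ;;
        *)            echo "unknown"; return 1 ;;
    esac
}
```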

Configuring /etc/inittab

As mentioned earlier, init reads the file /etc/inittab to determine what it should do. An entry in this file has the following format:

id:runlevels:action:process

Included below is an example /etc/inittab file.

# The default runlevel.
id:2:initdefault:

# Boot-time system configuration/initialization script.
# This is run first except when booting in emergency (-b) mode.
si::sysinit:/etc/init.d/rcS

# What to do in single-user mode.
~~:S:wait:/sbin/sulogin

# /etc/init.d executes the S and K scripts upon change
# of runlevel.
#
# Runlevel 0 is halt.
# Runlevel 1 is single-user.
# Runlevels 2-5 are multi-user.
# Runlevel 6 is reboot.

l0:0:wait:/etc/init.d/rc 0
l1:1:wait:/etc/init.d/rc 1
l2:2:wait:/etc/init.d/rc 2
l3:3:wait:/etc/init.d/rc 3
l4:4:wait:/etc/init.d/rc 4
l5:5:wait:/etc/init.d/rc 5
l6:6:wait:/etc/init.d/rc 6
# Normally not reached, but fall through in case of emergency.
z6:6:respawn:/sbin/sulogin

# /sbin/getty invocations for the runlevels.
#
# The "id" field MUST be the same as the last
# characters of the device (after "tty").
#
# Format:
#  <id>:<runlevels>:<action>:<process>
1:2345:respawn:/sbin/getty 38400 tty1
2:23:respawn:/sbin/getty 38400 tty2
          

Description of an entry in /etc/inittab:

id

The id-field uniquely identifies an entry in the file /etc/inittab and can be 1-4 characters in length. For gettys and other login processes however, the id field should contain the suffix of the corresponding tty, otherwise the login accounting might not work.

runlevels

This field contains the runlevels for which the specified action should be taken.

action

The action field can have one of the following values:

respawn

The process will be restarted whenever it terminates (e.g. getty).

wait

The process will be started once when the specified runlevel is entered and init will wait for its termination.

once

The process will be executed once when the specified runlevel is entered.

boot

The process will be executed during system boot. The runlevels field is ignored.

bootwait

The process will be executed during system boot, while init waits for its termination (e.g. /etc/rc). The runlevels field is ignored.

off

This does absolutely nothing.

ondemand

A process marked with an on-demand runlevel will be executed whenever the specified on-demand runlevel is called. However, no runlevel change will occur (the on-demand runlevels are a, b and c).

initdefault

An initdefault entry specifies the runlevel which should be entered after system boot. If none exists, init will ask for a runlevel on the console. The process field is ignored. In the example above, the system will go to runlevel 2 after boot.

sysinit

The process will be executed during system boot. It will be executed before any boot or bootwait entries. The runlevels field is ignored.

powerwait

The process will be executed when the power goes down. init is usually informed about this by a process talking to a UPS connected to the computer. init will wait for the process to finish before continuing.

powerfail

As for powerwait, except that init does not wait for the process's completion.

powerokwait

This process will be executed as soon as init is informed that the power has been restored.

powerfailnow

This process will be executed when init is told that the battery of the external UPS is almost empty and the power is failing (provided that the external UPS and the monitoring process are able to detect this condition).

ctrlaltdel

The process will be executed when init receives the SIGINT signal. This means that someone on the system console has pressed the CTRL-ALT-DEL key combination. Typically one wants to execute some sort of shutdown either to get into single-user level or to reboot the machine.

kbdrequest

The process will be executed when init receives a signal from the keyboard handler that a special key combination was pressed on the console keyboard. Basically you want to map some keyboard combination to the KeyboardSignal action. For example, to map Alt-Uparrow for this purpose use the following in your keymaps file: alt keycode 103 = KeyboardSignal.

process

This field specifies the process that should be executed. If the process field starts with a +, init will not do utmp and wtmp accounting. Some gettys insist on doing their own housekeeping. This is also a historic bug.
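
To make the four-field format concrete, the sketch below splits a single /etc/inittab line into its fields using the shell's own field splitting. The function name parse_inittab_line is an assumption for illustration; init itself does not work this way internally.

```shell
#!/bin/sh
# Split one inittab entry of the form id:runlevels:action:process.
# Empty lines and comment lines are skipped (return status 1).
parse_inittab_line() {
    line="$1"
    case "$line" in
        ''|'#'*) return 1 ;;
    esac
    # With IFS set to ":", the last variable receives the remainder
    # of the line, so a process field may itself contain colons.
    IFS=: read -r id runlevels action process <<EOF
$line
EOF
    printf 'id=%s runlevels=%s action=%s process=%s\n' \
        "$id" "$runlevels" "$action" "$process"
}
```

For the getty line from the example file, parse_inittab_line '1:2345:respawn:/sbin/getty 38400 tty1' reports the id 1, the runlevels 2345, the action respawn and the full getty command as the process.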

The /etc/init.d/rc script

For each of the runlevels 0-6 there is an entry in /etc/inittab that executes /etc/init.d/rc ?, where ? is 0-6, as you can see in the following line from the earlier example:

l2:2:wait:/etc/init.d/rc 2
          

So, what actually happens is that /etc/init.d/rc is called with the runlevel as a parameter.

The directory /etc contains several runlevel-specific directories, which in turn contain runlevel-specific symbolic links to scripts in /etc/init.d/. Those directories are:

$ ls -d /etc/rc*
/etc/rc.boot  /etc/rc1.d  /etc/rc3.d  /etc/rc5.d  /etc/rcS.d
/etc/rc0.d    /etc/rc2.d  /etc/rc4.d  /etc/rc6.d
          

As you can see, there also is a /etc/rc.boot directory. This directory is obsolete and has been replaced by the directory /etc/rcS.d. At boot time, the directory /etc/rcS.d is scanned first and then, for backwards compatibility, the /etc/rc.boot.

The name of the symbolic link either starts with an S or with a K. Let's examine the /etc/rc2.d directory:

$ ls /etc/rc2.d
K20gpm       S11pcmcia   S20logoutd  S20ssh      S89cron
S10ipchains  S12kerneld  S20lpd      S20xfs      S91apache
S10sysklogd  S14ppp      S20makedev  S22ntpdate  S99gdm
S11klogd     S20inetd    S20mysql    S89atd      S99rmnologin
          

If the name of the symbolic link starts with a K, the script is called with stop as a parameter to stop the process. This is the case for K20gpm, so the command becomes K20gpm stop. Let's find out what program or script is called:

$ ls -l /etc/rc2.d/K20gpm
lrwxrwxrwx 1 root root 13 Mar 23 2001 /etc/rc2.d/K20gpm -> ../init.d/gpm
          

So, K20gpm stop results in /etc/init.d/gpm stop. Let's see what happens with the stop parameter by examining part of the script:

#!/bin/sh
#
# Start Mouse event server
...
case "$1" in
  start)
     gpm_start
     ;;
  stop)
     gpm_stop
     ;;
  force-reload|restart)
     gpm_stop
     sleep 3
     gpm_start
     ;;
  *)
     echo "Usage: /etc/init.d/gpm {start|stop|restart|force-reload}"
     exit 1
esac
          

In the case..esac the first parameter, $1, is examined and in case its value is stop, gpm_stop is executed.

On the other hand, if the name of the symbolic link starts with an S, the script is called with start as a parameter to start the process.

The scripts are executed in a lexical sort order of the filenames.
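
The behaviour described so far can be summarized in a greatly simplified stand-in for /etc/init.d/rc. This is a sketch, not the real script: the name run_rc_dir is hypothetical, and real rc implementations do more (for instance, comparing against the previous runlevel). It relies on the shell returning glob matches in sorted order.

```shell
#!/bin/sh
# Run all K links with "stop", then all S links with "start",
# each group in the sorted order produced by the shell glob.
run_rc_dir() {
    dir="$1"
    for link in "$dir"/K[0-9][0-9]*; do
        if [ -x "$link" ]; then
            "$link" stop
        fi
    done
    for link in "$dir"/S[0-9][0-9]*; do
        if [ -x "$link" ]; then
            "$link" start
        fi
    done
}
```

Pointing run_rc_dir at a scratch directory containing a few test scripts is a safe way to observe the ordering; running it against a real /etc/rcN.d directory would actually stop and start services.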

Let's say we've got a daemon SomeDaemon, an accompanying script /etc/init.d/SDscript and we want SomeDaemon to be running when the system is in runlevel 2 but not when the system is in runlevel 3.

As you've read earlier, this means that we need a symbolic link, starting with an S, for runlevel 2 and a symbolic link, starting with a K, for runlevel 3. We've also determined that the daemon SomeDaemon is to be started after S19someotherdaemon, which implies S20 and K80 since starting and stopping are symmetrical, i.e. what is started first is stopped last. This is accomplished with the following set of commands:

# cd /etc/rc2.d
# ln -s ../init.d/SDscript S20SomeDaemon
# cd /etc/rc3.d
# ln -s ../init.d/SDscript K80SomeDaemon
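
The S20/K80 pairing follows from the symmetry convention: the stop number is 100 minus the start number. A minimal sketch of that arithmetic, with stop_priority as a hypothetical helper name (the convention is not enforced by init):

```shell
#!/bin/sh
# Derive the conventional K (stop) number from an S (start) number.
stop_priority() {
    start="${1#0}"        # strip one leading zero so 08 is not read as octal
    start="${start:-0}"
    printf '%02d\n' "$((100 - start))"
}
```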
          

Should you wish to manually start, restart or stop a process, it is good practice to use the appropriate script in /etc/init.d/ , e.g. /etc/init.d/gpm restart to initiate the restart of the process.

update-rc.d

Debian-derived Linux distributions use the update-rc.d command to install and remove the init script links mentioned in the previous section.

If you have a startup script called 'foobar' in /etc/init.d/ and want to add it to the default runlevels, you can use:

# update-rc.d foobar defaults
 Adding system startup for /etc/init.d/foobar ...
   /etc/rc0.d/K20foobar -> ../init.d/foobar
   /etc/rc1.d/K20foobar -> ../init.d/foobar
   /etc/rc6.d/K20foobar -> ../init.d/foobar
   /etc/rc2.d/S20foobar -> ../init.d/foobar
   /etc/rc3.d/S20foobar -> ../init.d/foobar
   /etc/rc4.d/S20foobar -> ../init.d/foobar
   /etc/rc5.d/S20foobar -> ../init.d/foobar
	

As you can see, update-rc.d will create K (stop) links in rc0.d, rc1.d and rc6.d, and S (start) links in rc2.d, rc3.d, rc4.d and rc5.d.

In some cases, you want to keep a package such as dovecot installed, but don't want it to start at system boot. In this case you can use update-rc.d to disable the service by removing the startup links:

# update-rc.d -f dovecot remove
 Removing any system startup links for /etc/init.d/dovecot ...
   /etc/rc2.d/S24dovecot
   /etc/rc3.d/S24dovecot
   /etc/rc4.d/S24dovecot
   /etc/rc5.d/S24dovecot
	

In this case, update-rc.d will remove the links to the startup scripts. The -f (force) option is required if the rc script still exists. If you install an updated dovecot package, the links will be restored. If you do not want this, you have to create 'stop' links in the startup runlevel directories:

# update-rc.d -f dovecot stop 24 2 3 4 5 .
 Adding system startup for /etc/init.d/dovecot ...
   /etc/rc2.d/K24dovecot -> ../init.d/dovecot
   /etc/rc3.d/K24dovecot -> ../init.d/dovecot
   /etc/rc4.d/K24dovecot -> ../init.d/dovecot
   /etc/rc5.d/K24dovecot -> ../init.d/dovecot
        

Note

Don't forget the trailing . (dot).

The LSB standard

The Linux Standard Base (LSB) defines an interface for application programs that are compiled and packaged for LSB-conforming implementations. Hence, a program which was compiled in an LSB compatible environment will run on any distribution that supports the LSB standard. LSB compatible programs can rely on the availability of certain standard libraries. The standard also includes a list of mandatory utilities and scripts which define an environment suitable for installation of LSB-compatible binaries.

The specification includes processor architecture specific information. This implies that the LSB is a family of specifications, rather than a single one. In other words: if your LSB compatible binary was compiled for an Intel based system, it will not run on, for example, an Alpha based LSB compatible system, but will install and run on any Intel based LSB compatible system. The LSB specifications therefore consist of a common and an architecture-specific part; "LSB-generic" or "generic LSB" and "LSB-arch" or "archLSB".

The LSB standard lists which generic libraries should be available, e.g. libdl, libcrypt, libpthread and so on, and provides a list of processor specific libraries, like libc and libm. The standard also lists searchpaths for these libraries, their names and format (ELF). Another section handles the way dynamic linking should be implemented. For each standard library a list of functions is given, and data definitions and accompanying header files are listed.

The LSB defines a list of 130+ commands that should be available on an LSB compatible system, and their calling conventions and behaviour. Some examples are cp, tar, kill and gzip.

The expected behaviour of an LSB compatible system during system initialization is part of the LSB specification, as are a definition of the cron system and the actions, functions and location of the init scripts. Any LSB compliant init script should be able to handle the following options: start, stop, restart, force-reload and status. The reload and try-restart options are optional. The standard also lists the definitions for runlevels and listings of user and group names and their corresponding UIDs/GIDs.
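
The required and optional actions could be dispatched as in the following skeleton. It is a sketch under stated assumptions: the daemon name food is hypothetical, the actions only print messages instead of controlling a real process, and the return values 2 (invalid argument) and 3 (unimplemented feature) follow the exit status conventions the LSB describes for init scripts.

```shell
#!/bin/sh
# Skeleton of an LSB-style action dispatcher for a hypothetical
# daemon "food". Real start/stop logic is replaced by echo.
food_init() {
    case "$1" in
        start)
            echo "starting food" ;;
        stop)
            echo "stopping food" ;;
        restart|force-reload)
            food_init stop && food_init start ;;
        status)
            echo "food status requested" ;;
        reload|try-restart)
            # Optional actions: report them as unimplemented.
            echo "$1 is not implemented" >&2
            return 3 ;;
        *)
            echo "Usage: food {start|stop|restart|force-reload|status}" >&2
            return 2 ;;
    esac
}
```

In an actual init script the case statement would sit at the top level and operate on "$1" directly, as in the gpm example earlier in this chapter.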

Though it is possible to install an LSB compatible program without the use of a package manager (by applying a script that contains only LSB compliant commands), the LSB specification contains a description for software packages and their naming conventions.

Note

LSB employs the Red Hat Package Manager standard. Debian based LSB compatible distributions may read RPM packages by using the alien command.

The LSB standard frequently refers to other well known standards, for example ISO POSIX 2003. Also, any LSB conforming implementation needs to provide the mandatory portions of the file system hierarchy as specified in the Filesystem Hierarchy Standard (FHS), and a number of LSB specific requirements.

The bootscript environment and commands

Initially, Linux contained only a limited set of services and had a very simple boot environment. As Linux aged and the number of services in a distribution grew, the number of initscripts grew accordingly. After a while a set of standards emerged. Init scripts would routinely include some other script, which contained functions to start, stop and verify a process.

The LSB standard lists a number of functions that should be made available for runlevel scripts. These functions should be defined in the file /lib/lsb/init-functions, which init scripts source, and at least the following functions need to be implemented:

  1. start_daemon [-f] [-n nicelevel] [-p pidfile] pathname [args...]

    runs the specified program as a daemon. The start_daemon function shall check if the program is already running. If so, it shall not start another copy of the daemon unless the -f option is given. The -n option specifies a nice level.

  2. killproc [-p pidfile] pathname [signal]

    shall stop the specified program. If a signal is specified, that signal is sent to the program. Otherwise a SIGTERM is sent, followed by a SIGKILL after an unspecified number of seconds. If a program has been terminated, the pidfile should be removed if the terminated process has not already done so.

  3. pidofproc [-p pidfile] pathname

    returns one or more process identifiers for a particular daemon, as specified by the pathname. Multiple process identifiers are separated by a single space.

In some cases, these functions are provided as stand-alone commands and the scripts simply ensure that the path to these commands is set properly. Often some logging functions and functions to display status lines are also included.
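
To illustrate the pidfile mechanism that these functions rely on, here is a simplified, hypothetical stand-in for the pidfile branch of pidofproc. It is not the LSB implementation: the name pid_from_pidfile is invented, only a single pid is handled, and the return values 1 and 3 are borrowed from the LSB status codes (dead but pidfile exists, not running). It also assumes the caller is allowed to signal the process, which holds for init scripts run as root.

```shell
#!/bin/sh
# Print the pid stored in a pidfile if that process is still alive.
pid_from_pidfile() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 3          # no pidfile: not running
    pid=$(head -n 1 "$pidfile")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;           # empty or not a number
    esac
    # Signal 0 only tests whether the process can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        echo "$pid"
        return 0
    fi
    return 1                               # stale pidfile
}
```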

Changing and configuring runlevels

Changing runlevels on a running machine requires comparison of the services running in the current runlevel with those that need to run in the new runlevel. As a result, some processes will probably need to be stopped and others started.

Recall that the initscripts for a runlevel X are grouped in the directory /etc/rc.d/rcX.d (or, on newer (LSB based) systems, in /etc/init.d/rcX.d). Their names determine how the scripts are called: if the name starts with a K, the script will be run with the stop option; if the name starts with an S, the script will be run with the start option. The normal procedure during a runlevel change is to stop the superfluous processes first and then start the new ones.

The actual init scripts are located in /etc/init.d. The files you find in the rcX.d directories are symbolic links to these scripts. In many cases, the start and stop links point to the same script. This implies that such init scripts should be able to handle at least the start and stop options.

For example, the symbolic link named S06syslog in /etc/init.d/rc3.d might point to the script /etc/init.d/syslog, as may the symbolic link found in /etc/init.d/rc2.d, named K17syslog.

The order in which services are stopped or started can be of great importance. Some services may be started simultaneously, others need to start in a strict order. For example, your network needs to be up before you can start the httpd. The order is determined by the names of the symbolic links. The naming conventions dictate that the names of the links (the ones found in the rcN.d directories) include two digits, just after the initial letter. The links are executed in alphabetical order, so these digits determine the sequence.

In the early days system administrators created these links by hand. Later most Linux distributors decided to provide commands or scripts that allow the administrator to disable or enable certain scripts in certain runlevels and to check which services would be started in which runlevel. These commands typically manage the aforementioned links and name them in such a way that the scripts are run in the proper order.

The chkconfig command

Another tool to manage the proper linking of startup (init) scripts is chkconfig. On some systems (e.g. SuSE/Novell) it serves as a front-end for insserv and uses the LSB standardized comment block to maintain its administration. On older systems it maintains its own special comment section, which has a much simpler and less flexible syntax. This older syntax consists of two lines. One is a description of the service and starts with the keyword description:. The other starts with the keyword chkconfig: and lists the runlevels in which to start the service, followed by the start and stop priorities (which determine in what order the scripts will be run while changing runlevels). For example:

# Init script for foo daemon
#
# description: food, the foo daemon
# chkconfig: 2345 55 25
#
#

This denotes that the foo daemon will start in runlevels 2, 3, 4 and 5, will have priority 55 in the queue of initscripts that are run during startup and priority 25 in the queue of initscripts that are run if the daemon needs to be stopped.
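
The old-style header is simple enough to read with plain shell field splitting. The following sketch extracts the three values from an init script; chkconfig_header is a hypothetical helper written for illustration and is not part of chkconfig itself.

```shell
#!/bin/sh
# Print the runlevels and the start/stop priorities found in the
# first "# chkconfig:" line of the init script given as $1.
chkconfig_header() {
    while read -r hash key levels start stop _rest; do
        if [ "$hash" = "#" ] && [ "$key" = "chkconfig:" ]; then
            echo "levels=$levels start=$start stop=$stop"
            return 0
        fi
    done < "$1"
    return 1    # no chkconfig header found
}
```

Applied to a script containing the header shown above, it reports the runlevels 2345, start priority 55 and stop priority 25.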

The chkconfig utility can be used to list which services will be started in which runlevels, to add or delete a service to or from a runlevel and to add or delete an entire service from the startup scripts.

Note

We are providing some examples here, but be warned: there are various versions of chkconfig around. Please read the manual pages for the chkconfig command on your distribution first.

chkconfig does not automatically disable or enable a service immediately, but simply changes the symbolic links. If the cron daemon is running and you are on a Red Hat based system which is running in runlevel 2, the command

 
# chkconfig --level 2345 crond off

would change the administration but would not stop the cron daemon immediately. Also note that on a Red Hat system it is possible to specify more than one runlevel, as we did in our previous example. On Novell/SuSE systems, you may use:

 
# chkconfig food 2345

and to change this so that it will only run in runlevel 1, simply use

 
# chkconfig food 1

# chkconfig --list 

will list the current status of services and the runlevels in which they are active. For example, the following two lines may be part of the output:

xdm                       0:off   1:off   2:off   3:off   4:off   5:on    6:off   
xfs                       0:off   1:off   2:off   3:off   4:off   5:off   6:off   

They indicate that the xfs service is not started in any runlevel and the xdm service only will be started while switching to runlevel 5.
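
When scripting around this output, the runlevels in which a service is on can be picked out of a single line. This is a sketch that assumes the Red Hat style output format shown above; levels_on is a hypothetical helper name.

```shell
#!/bin/sh
# Given one line of "chkconfig --list" output, print the runlevels
# in which the service is switched on (empty output if none).
levels_on() {
    set -- $1          # split the line on whitespace
    shift              # drop the service name
    on=""
    for field in "$@"; do
        case "$field" in
            [0-9]:on) on="$on${field%%:*}" ;;
        esac
    done
    echo "$on"
}
```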

To add a new service, let's say the foo daemon, we create a new init script and name it after the service; in this case we might use food. This script is then put into the /etc/init.d directory. We need to insert the proper header in that script (either the old chkconfig header or the newer LSB compliant header) and then run

# chkconfig --add food 

To remove the foo service from all runlevels, you may type:

# chkconfig --del food

Note that the food script will remain in the /etc/init.d/ directory.

The startproc command

Note

startproc is no longer an exam objective.

This command has the same functionality as the LSB start_daemon function. It starts a process if it is not already running. One of its arguments is the full path to an executable. startproc will check if there already is a process running from that executable. If not, it will start a new process by executing the executable. There are options to set the user and group name (-u, -g) under which the new process should be run and the nice level (-n), and the -f option forces execution even if there seems to be another process running from the same executable.

By default, the command checks for a pidfile /var/run/<basename>.pid, but that name can be overridden by specifying the -p option. If such a pidfile exists, it is read and the command limits itself to verification of the process with that pid. If no matching process could be found, the process will be started. It is up to that process to create, delete or update the pidfile.

startproc will refuse to start up a process (unless forced by setting the -f flag) if it finds a matching zombie process in the process table.

The checkproc command

Note

checkproc is no longer an exam objective.

checkproc checks if a process is running. One of its arguments is the full path to an executable. checkproc will check if there already is a process running from that executable. If so, it exits with exit code 0. If not, it prints nothing and exits with exit code 1. If the -v flag has been specified, the command will report the pids of any matching running processes it finds.

By default, the command checks for a pidfile /var/run/<basename>.pid, but that name can be overridden by specifying the -p option. If such a pidfile exists, it is read and the command limits itself to verification of the process with that pid.

The killproc command

Note

killproc is no longer an exam objective.

killproc is used to signal running process(es). This is mostly used to stop these processes. It allows the signal to send to be specified but defaults to SIGTERM. If SIGTERM is used and the process does not terminate within five seconds a SIGKILL is sent to it. The time between sending these signals can be tuned using the -t option. If a process has been terminated and a verified pid file was found, this pid file will be removed. killproc will try to prevent killing itself, its parents or grandparents.
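
The SIGTERM-then-SIGKILL escalation can be sketched in a few lines of shell. This is not the killproc implementation: the function name term_then_kill is hypothetical, it takes a pid rather than a path to an executable, it does not handle pid files, and it does not protect the caller's own parents. The grace period defaults to the five seconds mentioned above.

```shell
#!/bin/sh
# Send SIGTERM to a pid; if the process is still present after the
# grace period (in seconds, default 5), send SIGKILL.
term_then_kill() {
    pid="$1"
    grace="${2:-5}"
    # Fails if there is no such process or signalling is not permitted.
    kill -TERM "$pid" 2>/dev/null || return 1
    waited=0
    while [ "$waited" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # process has exited
        sleep 1
        waited=$((waited + 1))
    done
    kill -KILL "$pid" 2>/dev/null
    return 0
}
```

Polling once per second keeps the sketch portable; a process that exits promptly on SIGTERM is noticed at the next check instead of waiting for the full grace period.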

Copyright Snow B.V. The Netherlands