Revision: $Revision: 1.13 $ ($Date: 2004/08/13 08:14:21 $)
This topic has a total weight of 5 points and contains the following 2 objectives:
This topic includes being able to edit the appropriate system startup
scripts to customize standard system run levels and boot processes,
interacting with run levels and creating custom initrd
images as needed.
This topic includes being able to properly configure and navigate the standard Linux filesystem, configuring and mounting various filesystem types, and manipulating filesystems to adjust for disk space requirements or device additions.
This topic includes being able to edit the appropriate system startup
scripts to customize standard system run levels and boot processes,
interacting with run levels and creating custom initrd
images as needed.
Key files, terms and utilities include:
/etc/init.d/
/etc/inittab
/etc/rc.d
mkinitrd (described in the section called “mkinitrd”)
A description of the boot process can also be found at Vanderbilt University.
Kernel loader loading, setup, and execution
(bootsect.s)
In this step the file bootsect.s is loaded
into memory by the BIOS. bootsect.s then sets
up a few parameters and loads the rest of the kernel into memory.
Parameter setup and switch to 32-bit mode
(boot.s)
After the kernel has been loaded, boot.s
takes over. It sets up a temporary IDT and GDT (explained later
on) and handles the switch to 32-bit mode.
Detailed information on the IDT, GDT and LDT can be found on sandpile.org.
Kernel decompression (compressed/head.s)
The kernel is stored in a compressed format. This head.s
(there are two files named head.s; the second is described in the next step)
decompresses the kernel.
Kernel setup (head.s)
After the kernel is decompressed, head.s (the
second one) takes over. The real GDT and IDT are created, as is a
basic memory-paging table.
Kernel and memory initialization (main.c)
This step is the most complex. The kernel now has control, sets up all remaining parameters and initializes everything that is left. Virtual memory is set up completely and the first processes are created.
Init process creation (main.c)
In the final step of booting, the Init process is created.
When the computer is first turned on, BIOS loads the boot sector of
the boot disk into memory at location 0x7C00. This first sector
corresponds to the bootsect.s file. The BIOS
will only copy 512 bytes, so the kernel loader must be small. The
code that is loaded by the BIOS must be able to load the remaining
portions of the operating system and pass control onto the next file.
The first thing that bootsect.s does when it is
loaded is to move itself to segment 0x9000 (physical address 0x90000).
This is to avoid any possible conflicts in memory. The code then jumps to the
new copy. After this, an area in memory is set
aside (0x4000-12) for a new disk parameter table. So that
more than one sector can be read from the disk at a time, the code
tries to find the largest number of sectors that can be read in one operation.
This speeds up reads from the disk when the rest of the kernel
is loaded.
Before this is done, setup.s is loaded into the memory space
directly above bootsect.s, at segment 0x9020. This allows setup.s to
be jumped to after the kernel has been loaded. Now the disk parameter
table is created. Basically, the code tries to read 36 sectors; if
that fails it tries 18, then 15, and if all else fails it uses 9 as the
default.
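The fallback order described above can be sketched in shell. This is only an illustration of the probing logic, not the assembly that bootsect.s contains: the function name probe_sectors and its argument (the largest read a simulated drive accepts) are assumptions made for the example.

```shell
#!/bin/sh
# Sketch of the sectors-per-read probe. The real code issues BIOS reads;
# here a read of N sectors "succeeds" when N <= $1 (a simulated drive limit).
probe_sectors() {
    drive_limit=$1
    for count in 36 18 15; do
        if [ "$count" -le "$drive_limit" ]; then
            echo "$count"
            return 0
        fi
    done
    # Every attempt failed: fall back to the default of 9 sectors.
    echo 9
}

probe_sectors 18   # simulated drive that accepts at most 18 sectors: prints 18
```

The loop tries the largest value first, so the first success is also the fastest usable setting.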
If at any point there is an error, little can be done. In most cases,
bootsect.s will just keep trying to do what it
was doing when the error occurred. Usually this will end in an
endless loop that can only be resolved by rebooting by hand.
At last we are ready to copy the kernel into memory.
bootsect.s goes into a loop that reads the first 508 KB from
the disk and places it into memory starting at 0x10000. After the
kernel is loaded into RAM, bootsect.s jumps to
segment 0x9020, where setup.s is loaded.
setup.s makes sure that all hardware information
has been collected and gathers more information if necessary. It first
verifies that it is loaded at 0x9020. After this is
verified, setup.s does the following:
Gets main memory size
Sets keyboard repeat rate to the maximum
Retrieves video card information
Collects monitor information for the terminal to use
Gets information about the first and possibly second hard drive using BIOS
Looks to see if there is a mouse (a pointing device) attached to the system
All of the information that setup.s collects is
stored for later use by device drivers and other areas of the system.
As with bootsect.s, little can be done if an error
occurs. Most errors are “handled” by an infinite loop that has to be
reset manually.
The next step in the booting process needs to use virtual memory.
On an x86 this is only possible after switching from real mode to
protected mode. After all information has been gathered by
setup.s, it does a few more housekeeping chores to get
ready for the switch to 32-bit mode.
First, all interrupts are disabled. Once the system is in 32-bit mode,
no more BIOS calls can be made. The area of memory at 0x1000 is where
the BIOS handlers were loaded when the system came up. These are no longer
needed, so to get the compressed kernel out of the way,
setup.s moves the kernel from 0x10000 to 0x1000. This
provides room for a temporary IDT (Interrupt Descriptor Table) and GDT
(Global Descriptor Table). The GDT is set up only to describe the system in
memory. Paging is disabled, so the described memory locations
correspond to actual memory addresses. At this point, extended (or
high) memory is enabled.
Setup also resets any coprocessor that is present and reconfigures the 8259
Programmable Interrupt Controller. All that remains now is for the
protected-mode enable bit to be set, after which the processor is in 32-bit mode.
After the switch has been made, setup.s lets
processing continue at /compressed/head.s to
uncompress the kernel.
This first head.s uncompresses the kernel into
memory. The kernel is gzip-compressed to make sure that it can fit into the
508 KB that bootsect.s will load. When the kernel
is compiled, bootsect.s, setup.s,
and /compressed/head.s are not
compressed and are placed in front of the compressed kernel.
They are the only three files that must remain uncompressed.
head.s decompresses the kernel to address
0x100000. This corresponds to the 1 MB boundary in memory.
head.s does a bit of error checking before it decompresses
the kernel to ensure that there is enough memory available in high
memory.
Right before the decompression is done, the flags register is reset
and the area in memory where setup.s was located is
cleared. This puts the system in a known state. After the
decompression, control is passed to the now decompressed
head.s.
The second head.s is responsible for setting up the permanent IDT
and GDT, as well as a basic paging table. Before anything is done,
the flags register is again reset. The first page in the paging system
is set up at 0x5000. This page is filled with the information gathered
by setup.s, which is copied from its location at
segment 0x9000.
Next the processor type is determined. For 586s (Pentium) and higher there is a processor instruction that returns the type of processor. Unfortunately, the 386 and 486 do not have this feature, so some tricks have to be employed. Each processor has only certain flags, so by trying to read and write them you can determine the type of processor. If a coprocessor is present, that is also detected.
After that, the IDT and GDT are set up. In the IDT, each interrupt gets an 8-byte descriptor. Each descriptor is initially set to ignore_int. This means that nothing will happen when the interrupt is called. All that ignore_int does is save the registers, print “unknown interrupts”, and then restore the registers.
Each IDT descriptor is divided into four two-byte sections. The top four bytes are called the WW, while the bottom four are the CW. The WW contains a two-byte offset, a P-flag set to 1, and a Descriptor Privilege Level. The CW has a selector and an offset. In total the IDT can contain up to 256 entries.
At this point the code sets up memory paging. In the x86 architecture,
virtual memory uses three structures to establish an address: a Page
Directory, a Page Table, and a Page Frame. The Page Directory is a
table of all of the pages and what processes they match to. The Page
Directory contains an index into the Page Table. The Page Table maps
the virtual address to the beginning of a physical page in memory. The
Page Frame and an offset use the beginning address of the physical
page to retrieve an actual location in memory. The three
structures are set up by head.s. They are arranged so
that the first 4 MB of memory is in the Page Directory. The kernel's
virtual address is set to 0xC0000000, the start of the last gigabyte of
the 4 GB address space.
Each memory address in an x86 has three parts. The first is the index into the Page Directory. The result of this index is the start of a specific Page Table. The second part of the 32-bit address is an offset into the Page Table. The Page Table has a 32-bit entry that corresponds to that offset. The top 20 bits are used to get an actual physical address. The lower 12 bits are used for administrative purposes. The physical address corresponds to the start of a physical page. The third part of the 32-bit address is an offset within this page, equal to a real memory location.
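The three-part split can be tried out with shell arithmetic. The sketch below assumes the classic 32-bit layout with 4 KB pages, in which the top 10 bits index the Page Directory, the next 10 bits index the Page Table and the low 12 bits are the offset within the page. The 10/10/12 widths and the function name split_address are assumptions of this example and are not stated in the text above.

```shell
#!/bin/sh
# Split a 32-bit virtual address into its three parts (assumed 10/10/12 layout).
split_address() {
    addr=$(($1))
    dir_index=$(( (addr >> 22) & 0x3FF ))    # top 10 bits: Page Directory index
    table_index=$(( (addr >> 12) & 0x3FF ))  # middle 10 bits: Page Table index
    offset=$(( addr & 0xFFF ))               # low 12 bits: offset within the page
    echo "$dir_index $table_index $offset"
}

split_address 0xC0101234   # prints: 768 257 564
```

Index 768 is the first Page Directory slot for addresses at or above 0xC0000000, which is where the kernel's virtual address starts.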
Almost everything is set up at this point. Now control is passed to the main function in the kernel, and main.c gains control.
All remaining setup and initialization functions are called from
main.c. Paging, the IDT and most of the other
systems have been initialized by now. main.c will make sure
everything is in its proper place before it tries to start some
processes and give control to init.c.
A call to the function start_kernel() is made. In essence, all that start_kernel() does is run through a list of init functions that need to be called. Such things as paging, traps, IRQs, process scheduling and more are set up. The important work has already been done for memory and interrupts. Now all that has to be done is to have all of the tables filled in.
After all of the init functions have been called, main.c tries to
start the init process. It tries three
different copies of init in order: if the first
doesn't work, it tries the second, and if that one doesn't work it goes to
the third. Here are the file names for the three init versions:
/etc/init
/bin/init
/sbin/init
If none of these three inits work, then the system goes into single user mode. init is needed to log in multiple users and to manage many other tasks. If it fails, then the single user mode creates a shell and the system goes from there.
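The search order can be imitated in shell. This is a sketch of the logic only, since the kernel does this in C: the function name pick_init, the root-directory argument used for testing and the /bin/sh fallback name are assumptions of the example.

```shell
#!/bin/sh
# pick_init ROOT: print the first init candidate that is an executable file
# below ROOT, or the fallback shell when none of the three is usable.
pick_init() {
    root=$1
    for candidate in /etc/init /bin/init /sbin/init; do
        if [ -f "$root$candidate" ] && [ -x "$root$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    # No init found: the text above describes dropping to a single-user shell.
    echo "/bin/sh"
}

# Demonstration against a throw-away directory tree.
demo=$(mktemp -d)
mkdir -p "$demo/sbin"
touch "$demo/sbin/init"
chmod +x "$demo/sbin/init"
pick_init "$demo"
rm -rf "$demo"
```

Calling pick_init with an empty string as ROOT checks the running system instead of a test tree.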
init is the parent of all processes. It reads the
file /etc/inittab and creates processes based on
its contents. One of the things it usually does is spawn
gettys so that users can log in. It also defines so-called
“runlevels”.
A “runlevel” is a software configuration of the system which allows only a selected group of processes to exist.
init can be in one of the following eight runlevels:
Runlevels 0-6
Runlevel 0 is used to halt the system, runlevel 1 is single-user mode, runlevels 2-5 are multi-user and runlevel 6 is used to reboot the system.
Runlevels 7-9
Runlevels 7, 8 and 9 are also valid.
Most Unix variants don't use these runlevels. On
a particular Debian
Linux system, for instance, the /etc/rc<runlevel>.d
directories, which we'll discuss later, are not
implemented for these runlevels, but they could be.
Runlevels s and S
Runlevels s and S are internally the same runlevel S, which brings
the system into “single-user mode”. The scripts in the
/etc/rcS.d directory are executed when
booting the system. Although runlevel S is not meant to be
activated by the user, it can be.
Runlevels A, B and C
Runlevels A, B and C are so-called “on demand” runlevels. If the current runlevel is “2” for instance, and an init A command is executed, the things to do for runlevel “A” are done but the actual runlevel remains “2”.
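As a quick summary, the runlevel groups can be written down as a shell case statement. The function name describe_runlevel is made up for this sketch. The meanings of runlevels 1 to 6 are taken from the comments in the example /etc/inittab in this section, and the lowercase forms a, b and c are accepted as on-demand runlevels as well.

```shell
#!/bin/sh
# describe_runlevel LEVEL: print the group a runlevel character belongs to.
describe_runlevel() {
    case "$1" in
        0)           echo "halt" ;;
        1)           echo "single-user" ;;
        2|3|4|5)     echo "multi-user" ;;
        6)           echo "reboot" ;;
        7|8|9)       echo "valid but normally unused" ;;
        s|S)         echo "single-user (internal runlevel S)" ;;
        a|b|c|A|B|C) echo "on demand" ;;
        *)           echo "not a runlevel" ;;
    esac
}

describe_runlevel 2   # prints: multi-user
describe_runlevel A   # prints: on demand
```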
As mentioned earlier, init reads the file
/etc/inittab to determine what it should do. An
entry in this file has the following format:
id:runlevels:action:process
Included below is an example /etc/inittab file.
# The default runlevel.
id:2:initdefault:
# Boot-time system configuration/initialization script.
# This is run first except when booting in emergency (-b) mode.
si::sysinit:/etc/init.d/rcS
# What to do in single-user mode.
~~:S:wait:/sbin/sulogin
# /etc/init.d executes the S and K scripts upon change
# of runlevel.
#
# Runlevel 0 is halt.
# Runlevel 1 is single-user.
# Runlevels 2-5 are multi-user.
# Runlevel 6 is reboot.
l0:0:wait:/etc/init.d/rc 0
l1:1:wait:/etc/init.d/rc 1
l2:2:wait:/etc/init.d/rc 2
l3:3:wait:/etc/init.d/rc 3
l4:4:wait:/etc/init.d/rc 4
l5:5:wait:/etc/init.d/rc 5
l6:6:wait:/etc/init.d/rc 6
# Normally not reached, but fall through in case of emergency.
z6:6:respawn:/sbin/sulogin
# /sbin/getty invocations for the runlevels.
#
# The "id" field MUST be the same as the last
# characters of the device (after "tty").
#
# Format:
# <id>:<runlevels>:<action>:<process>
1:2345:respawn:/sbin/getty 38400 tty1
2:23:respawn:/sbin/getty 38400 tty2
Description of an entry in /etc/inittab:
id
The id field uniquely identifies an entry in the file
/etc/inittab and can be 1-4 characters in length. For
gettys and other login processes, however, the id field should
contain the suffix of the corresponding tty, otherwise the login
accounting might not work.
runlevels
This field contains the runlevels for which the specified action should be taken.
action
The “action” field can have one of the following values:
respawn
The process will be restarted whenever it terminates (e.g. getty).
wait
The process will be started once when the specified runlevel is entered and init will wait for its termination.
once
The process will be executed once when the specified runlevel is entered.
boot
The process will be executed during system boot. The runlevels field is ignored.
bootwait
The process will be executed during system boot, while
init waits for its termination (e.g.
/etc/rc). The runlevels field is
ignored.
ondemand
A process marked with an on demand runlevel will be executed whenever the specified on demand runlevel is called. However, no runlevel change will occur (on demand runlevels are “a”, “b”, and “c”).
initdefault
An initdefault entry specifies the runlevel which should be entered after system boot. If none exists, init will ask for a runlevel on the console. The process field is ignored. In the example above, the system will go to runlevel 2 after boot.
sysinit
The process will be executed during system boot. It will be executed before any boot or bootwait entries. The runlevels field is ignored.
powerwait
The process will be executed when the power goes down. init is usually informed about this by a process talking to a UPS connected to the computer. init will wait for the process to finish before continuing.
powerfail
As for powerwait, except that init does not wait for the process's completion.
powerokwait
This process will be executed as soon as init is informed that the power has been restored.
powerfailnow
This process will be executed when init is told that the battery of the external UPS is almost empty and the power is failing (provided that the external UPS and the monitoring process are able to detect this condition).
ctrlaltdel
The process will be executed when init receives the SIGINT signal. This means that someone on the system console has pressed the CTRL-ALT-DEL key combination. Typically one wants to execute some sort of shutdown either to get into single-user level or to reboot the machine.
kbrequest
The process will be executed when init receives a signal
from the keyboard handler that a special key combination
was pressed on the console keyboard. Basically you want to
map some keyboard combination to the “KeyboardSignal”
action. For example, to map Alt-Uparrow for this purpose
use the following in your keymaps file:
alt keycode 103 = KeyboardSignal.
process
This field specifies the process that should be executed. If the
process field starts with a “+”, init will
not do utmp and wtmp
accounting. Some gettys insist on doing their own housekeeping.
This is also a historic bug.
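To see how one entry breaks down into these four fields, the line can be split on colons in the shell. The helper name inittab_field is hypothetical. The sketch only separates the fields and does not validate them the way init does.

```shell
#!/bin/sh
# inittab_field FIELD ENTRY: print one field (id, runlevels, action or process)
# of a single /etc/inittab entry.
inittab_field() {
    wanted=$1
    # Split on ":"; the last variable keeps everything left over, so a
    # process containing colons stays in one piece.
    IFS=: read -r id runlevels action process <<EOF
$2
EOF
    case "$wanted" in
        id)        echo "$id" ;;
        runlevels) echo "$runlevels" ;;
        action)    echo "$action" ;;
        process)   echo "$process" ;;
        *)         return 1 ;;
    esac
}

entry='1:2345:respawn:/sbin/getty 38400 tty1'
inittab_field runlevels "$entry"   # prints: 2345
inittab_field process "$entry"     # prints: /sbin/getty 38400 tty1
```

An empty runlevels field, as in the sysinit entry of the example file, comes back as an empty string.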
For each of the runlevels 0-6 there is an entry in
/etc/inittab that executes /etc/init.d/rc ?
where “?” is 0-6, as you can see in the following line from
the earlier example:
l2:2:wait:/etc/init.d/rc 2
So, what actually happens is that /etc/init.d/rc is called with the runlevel as a parameter.
The directory /etc contains several runlevel-specific
directories, which in turn contain runlevel-specific
symbolic links to scripts in /etc/init.d/. Those
directories are:
$ ls -d /etc/rc*
/etc/rc.boot /etc/rc1.d /etc/rc3.d /etc/rc5.d /etc/rcS.d
/etc/rc0.d /etc/rc2.d /etc/rc4.d /etc/rc6.d
As you can see, there also is a /etc/rc.boot
directory. This directory is obsolete and has been replaced by the
directory /etc/rcS.d. At boot time, the directory
/etc/rcS.d is scanned first and then, for
backwards compatibility, the /etc/rc.boot.
The name of the symbolic link either starts with an “S” or with a “K”.
Let's examine the /etc/rc2.d directory:
$ ls /etc/rc2.d
K20gpm S11pcmcia S20logoutd S20ssh S89cron
S10ipchains S12kerneld S20lpd S20xfs S91apache
S10sysklogd S14ppp S20makedev S22ntpdate S99gdm
S11klogd S20inetd S20mysql S89atd S99rmnologin
If the name of the symbolic link starts with a “K”, the script is called
with “stop” as a parameter to stop the process. This is the case for
K20gpm, so the command becomes K20gpm
stop. Let's find out what program or script is called:
$ ls -l /etc/rc2.d/K20gpm
lrwxrwxrwx 1 root root 13 Mar 23 2001 /etc/rc2.d/K20gpm -> ../init.d/gpm
So, K20gpm stop results in /etc/init.d/gpm stop. Let's see what happens with the “stop” parameter by examining part of the script:
#!/bin/sh
#
# Start Mouse event server
...
case "$1" in
  start)
    gpm_start
    ;;
  stop)
    gpm_stop
    ;;
  force-reload|restart)
    gpm_stop
    sleep 3
    gpm_start
    ;;
  *)
    echo "Usage: /etc/init.d/gpm {start|stop|restart|force-reload}"
    exit 1
    ;;
esac
In the case..esac the first parameter, $1, is examined and in case its value is “stop”, gpm_stop is executed.
On the other hand, if the name of the symbolic link starts with an “S”, the script is called with “start” as a parameter to start the process.
The scripts are executed in a lexical sort order of the filenames.
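The effect of the naming scheme can be previewed without running anything. The sketch below is not the real /etc/init.d/rc script. It only prints what would be called for a directory, relying on the shell sorting glob matches lexically, and it assumes that the “K” links are handled before the “S” links. The function name rc_plan is made up for the example.

```shell
#!/bin/sh
# rc_plan DIR: print the link names in DIR with the parameter each would get.
rc_plan() {
    dir=$1
    # Glob results are sorted, which gives the lexical order of the link names.
    for link in "$dir"/K*; do
        if [ -e "$link" ] || [ -L "$link" ]; then
            echo "${link##*/} stop"
        fi
    done
    for link in "$dir"/S*; do
        if [ -e "$link" ] || [ -L "$link" ]; then
            echo "${link##*/} start"
        fi
    done
    return 0
}

# Demonstration with a throw-away directory instead of /etc/rc2.d.
demo=$(mktemp -d)
touch "$demo/S20ssh" "$demo/K20gpm" "$demo/S10sysklogd"
rc_plan "$demo"
rm -rf "$demo"
```

For the demonstration directory this lists K20gpm with “stop” first, followed by S10sysklogd and S20ssh with “start”.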
Let's say we've got a daemon SomeDaemon, an
accompanying script /etc/init.d/SDscript and we
want SomeDaemon to be running when the system is
in runlevel 2 but not when the system is in runlevel 3.
As you've read earlier, this means that we need a symbolic link starting with an “S” for runlevel 2 and a symbolic link starting with a “K” for runlevel 3. We've also determined that the daemon SomeDaemon is to be started after S19someotherdaemon, which implies S20 and K80, since starting and stopping are symmetrical: what is started first is stopped last. This is accomplished with the following set of commands:
# cd /etc/rc2.d
# ln -s ../init.d/SDscript S20SomeDaemon
# cd /etc/rc3.d
# ln -s ../init.d/SDscript K80SomeDaemon
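The S20/K80 pairing can be computed rather than worked out by hand. The sketch assumes the convention that the start and stop numbers add up to 100, which matches the pair used above but is not a rule enforced by init. The function name kill_sequence is hypothetical, and the input is expected to be in the range 01-99.

```shell
#!/bin/sh
# kill_sequence NN: print the two-digit "K" number matching start number NN,
# assuming start and stop numbers add up to 100.
kill_sequence() {
    start=${1#0}    # drop one leading zero so "09" is not read as octal
    printf '%02d\n' $((100 - start))
}

kill_sequence 20   # prints: 80
```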
Should you wish to manually start, restart or stop a process, it is
good practice to use the appropriate script in /etc/init.d/,
e.g. /etc/init.d/gpm restart to
initiate the restart of the process.