Kernel Secrets

ArticleCategory: [Choose a category for your article]

KernelCorner

AuthorImage:[Here we need a little image from you]

TranslationInfo:[Author and translation history]

original in es Emiliano Ariel Lesende

es to en Gonzalo Garcia Agullo

AboutTheAuthor:[A small biography about the author]

Abstract:[Here you write a little summary]

This article is a short description of the Linux kernel.

ArticleIllustration:[This is the title picture for your article]

ArticleBody:[The article body]

Presentation

Welcome to the first in a series of articles about the secrets of the Linux kernel. You have probably looked at the kernel sources at some point. If so, you will have noticed that the initial couple of 100 KB compressed files has grown into more than 300 files containing more than 2 million lines of source code, taking up as much as 9 megabytes of compressed storage.

This series is intended for advanced programmers rather than beginners. You are of course free to read it anyway, and the author will do his best to answer any questions or doubts you send by e-mail.

New bugs are discovered and new patches are published almost every day. Nowadays it is almost impossible to understand the source code as a whole. It is written by many different programmers who try to keep a homogeneous coding style, but in practice the style varies from one author to another.

Linux: The Internet Operating System

Linux is a freely distributable operating system for the PC architecture and others. It is compatible with the POSIX 1003.1 standard and includes a large number of features from Unix System V and BSD 4.3. Many substantial parts of the Linux kernel that this series covers were written by Linus Torvalds, a Finnish computer science student. The first kernel was released in November 1991.

Main Features

Linux meets almost all the needs of a current Unix-based operating system:

Compiling the Kernel

Let's take a look at the kernel source code before studying the kernel itself.

Source tree structure: Linux kernel sources are commonly located under the /usr/src/linux directory, so directories are given relative to this location. As a result of porting to non-Intel architectures, the kernel tree was reorganized after version 1.0. Architecture-dependent code is located under the arch/ hierarchy. Code for Intel 386, 486, Pentium and Pentium Pro processors is under arch/i386. The arch/mips directory is for MIPS-based systems, arch/sparc for Sun SPARC-based platforms, arch/ppc for PowerPC/Power Macintosh systems, and so on. We'll concentrate on the Intel architecture, as it is the most widely used with Linux.

The Linux kernel is just a standard C program, with two important differences. First, the starting point for programs written in C is the main(int argc, char **argv) routine, whereas the Linux kernel uses start_kernel(void). Second, the program environment does not exist yet when the system is starting up and the kernel is about to be loaded. This means that a few things must be done before the first C routine is called. The assembler code that performs this task is located under the arch/i386/boot/ directory.

The appropriate assembler routine loads the kernel at absolute memory address 0x100000 (1 Mbyte), then installs the interrupt service routines, the global descriptor table and the interrupt descriptor table, which are used only during initialization. At this point the processor is switched to protected mode. The init/ directory contains everything needed to initialize the kernel. Here you find the start_kernel() routine, which initializes the kernel properly, taking into account all boot parameters passed to it. The first process is created without using system calls (the system itself is not loaded yet). This is the famous idle process, the one that uses processor time when no other process needs it.

The kernel/ and arch/i386/kernel/ directories contain, as their path names suggest, the main parts of the kernel. This is where the main system calls are located, along with other tasks including the time handler, the scheduler, the DMA manager, the interrupt handler and the signal controller.

Code handling system memory is located in mm/ and arch/i386/mm/. This area is devoted to allocating and releasing memory for processes. Memory paging is also implemented here.

The Virtual File System (VFS) is under the fs/ directory. Each supported file system format is located in its own subdirectory. The most important file systems are Ext2 and Proc. We'll take a detailed look at them later.

All operating systems require a set of drivers for hardware components. In the Linux kernel, these are located under drivers/.

Under ipc/ you will find the Linux implementation of the System V IPC.

Source code to implement several network protocols, sockets and internet domains is stored under net/.

Some standard C routines are implemented in lib/, so that kernel code can use familiar C programming idioms.

Loadable modules generated during kernel compilation are saved in modules/, which stays empty until the kernel is compiled for the first time.

Probably the most important directory for programmers is include/. Here you find all the C header files used specifically by the kernel. Kernel header files specific to Intel platforms are under include/asm-i386/.

Compiling: A new kernel is basically generated in just three steps:

We will go into detail about the background of these scripts, and how to modify them to introduce new configuration options, in upcoming articles.

I hope you enjoyed this article. Feel free to e-mail your comments, suggestions and criticisms to elesende@nextwork.net.



For more information: