SDI OS 06
For lack of a better name...

Contents:
  1. Introduction
  2. Installation
  3. Features
  4. Operating System
  5. Userspace Libraries
  6. Contact

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Introduction

This is a toy operating system developed during the System Design and
Implementation course 2006 at the University of Karlsruhe. It was designed
and written by Timo Bingmann, Matthias Braun, Torsten Geiger and Andreas
Maehler.

1.1 Code/Libraries used

This project wouldn't have been possible without code from the following
projects, which was available under open source licenses:
* L4
* sdios base
* dietlibc
* zlib
* libpng
* libjpeg
* SDL
* SDL_image
* supertux
* sdljump

1.2 Development Tools

Such a project would also not be possible without good development tools:
* Linux
* VMware
* All the good GNU development tools
* subversion
* A mediawiki wiki
* Eclipse CDT, xemacs, vim

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

2. Installation

2.1 Prerequisites

* The usual suspects: gcc, make, autoconf, binutils, objdump, ...
  We used gcc 4.1.x for development. You should make sure that libsupc++
  is in your library path!
* IDL4, version 1.0.2 is required
* L4 pistachio 0.4 kernel (you should consider applying the timer patch in
  the contrib directory to improve timing in VMware machines)

2.2 Building

Use the following commands to build and install sdios in INSTALLDIR:

    ./configure --prefix=INSTALLDIR
    make
    make install

2.3 Booting the OS

You need a multiboot compliant bootloader to boot SDIos. The bootloader has
to load the operating system servers and the ramdisk images as boot modules.
Development used the grub boot loader on a floppy disk, which fetched the
additional files from a tftp server on the network.

You have to make sure that the first modules loaded are, in this order:
kickstart, ia32-kernel, sigma0, root, locator. After that you can load the
remaining modules in any order.
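
The ordering rule above can be checked mechanically before booting. The
following is only a sketch and not part of SDI OS: boot_order_ok and
base_name are hypothetical helper names, and the sketch assumes the entries
are given in load order with the kernel line (kickstart) first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Required load order for the first five boot entries, as listed above. */
static const char *const required_prefix[] = {
    "kickstart", "ia32-kernel", "sigma0", "root", "locator"
};
#define REQUIRED_COUNT (sizeof required_prefix / sizeof required_prefix[0])

/* Return the last path component, e.g. "(nd)/tftpboot/boot/root" -> "root". */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Hypothetical helper: returns 1 if the entries (in load order, kernel line
 * first) begin with the required modules in the required order, else 0. */
int boot_order_ok(const char *const *entries, size_t count)
{
    if (count < REQUIRED_COUNT)
        return 0;
    for (size_t i = 0; i < REQUIRED_COUNT; i++) {
        if (strcmp(base_name(entries[i]), required_prefix[i]) != 0)
            return 0;
    }
    return 1;
}
```

Only the first five entries are compared; everything after them is accepted
in any order, which mirrors the rule stated above.
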

The configuration used during the presentation was the following:

    default 0
    timeout 0

    title SdiOS
    kernel=(nd)/tftpboot/boot/kickstart
    module=(nd)/tftpboot/boot/ia32-kernel
    module=(nd)/tftpboot/boot/sigma0
    module=(nd)/tftpboot/boot/root
    module=(nd)/tftpboot/boot/locator
    module=(nd)/tftpboot/boot/pci
    module=(nd)/tftpboot/boot/vmwarevideo
    module=(nd)/tftpboot/boot/minixfs
    module=(nd)/tftpboot/boot/disk.img
    module=(nd)/tftpboot/boot/console
    module=(nd)/tftpboot/boot/shell

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

3. Features

* 32 bit multi-tasking operating system
* Protected address spaces, paging support
* Text mode output to graphics card, supports a subset of ANSI escape
  sequences
* Virtual consoles with scrollback support
* Input from keyboard
* Ramdisk support (loaded as grub modules)
* Minix filesystem
* Uniform global namespace (used for tasks, services, filesystem)
* ELF loading
* PCI support
* VMware graphics card framebuffer driver
* Supports a subset of the POSIX API
* Ports of several gaming related libraries: SDL, zlib, jpeg6, png,
  SDL_image
* Command line tools: shell, cat, ls
* Ports of two games: supertux, sdljump

3.1 Not working yet

* Write access crashes the minixfs server
* The console sometimes stops working properly and only scrolls the last
  line
* ThreadIDs are not managed (we just assign new IDs and wrap around if too
  many have been assigned)

3.2 Could be improved

* libc write functions are not buffered yet, so functions like fputc, and
  especially fprintf, which uses fputc internally, produce one write IPC
  per character. This is inefficient!
* Heap management algorithm is only O(n)
* More POSIX functions could be implemented
* Console only understands a subset of ANSI escape sequences

3.3 Would be nice to have

* IDE block device driver
* A sound card driver
* More applications :)

3.4 Not a design goal

* No security
* No multiuser system (but still multitasking)
* No networking

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

4. Operating System

4.0 The boot process / roottask

First, all the code of the roottask and all loaded boot modules are pinned
in the roottask, so that the sigma0 server of L4 won't release the memory
to other parties.

The next step is to start the following code modules, which are all linked
into the roottask binary:
* Logger: thread in the address space of the roottask
* Sigma1 Pager: started as a thread in its own address space but with the
  roottask as backing pager. This way data/code segments are shared with
  the roottask, but pages mapped from sigma0 are protected from the threads
  in the roottask.
* Syscall Server: thread in the address space of the roottask
* Ramdisk: thread in the address space of the roottask
* Elfexec Init: thread in the address space of the roottask

Finally the roottask enters a loop and serves pagefault requests, so it can
be used as pager for the sigma1 server.

The rest of the bootup process is executed by the elfexec thread. It is
implemented as a thread because we need a running sigma1 pager to start
further tasks, but for the sigma1 pager to work the roottask must serve
pagefault IPCs.

The elfexec thread inspects the boot modules loaded by grub. If a module is
an ELF file, a new task is started (see ELF loading); if it is a minix
filesystem image, the module gets registered as a ramdisk.

4.1 Memory Management

The pager sigma1 starts by fetching all available anonymous memory from
sigma0 via the RPC protocol L4_Sigma0_GetAny.
This transfers ownership of all remaining conventional memory pages to
sigma1; it does not include special address ranges marked as bootloader,
architecture-specific or reserved in the KIP. Afterwards the roottask cannot
allocate more anonymous memory from sigma0, thus the roottask has no dynamic
heap (no working malloc). Sigma1's interface contains a function
GetPageGrant, which may be used by the root task (or any other task) if more
memory is required.

Since sigma1 itself has the same memory view as the root task, it too has no
working dynamic heap. Therefore sigma1 backs its dynamic data structures
with several slab allocators. Each slab allocator is a list of slab pages
from which fixed-size memory blocks can be allocated. Initially a few slab
pages from memory allocated within the data segment (which is reserved by
the roottask's memory pinning) are added to each slab allocator's pool. If a
slab allocator requires further memory, it reserves a page from the free
anonymous memory pool.

Sigma1 organises free pages in a buddy system. Free pages of any size
greater than 4096 can be allocated by breaking up larger pages if required.
The buddy system will also coalesce smaller adjacent pages into larger
blocks. The status of the buddy system can be viewed by reading
/task/freelist.

Each task that is managed by the sigma1 pager requires some specific
variables, which are held in the TaskEntry structure. It is allocated when a
new task is created, and the task list enables the pager to function as task
server.

Each TaskEntry holds the MappingList of the managed task. It contains a
sorted linked list describing the current page mappings. A memory range of
the managed address space is associated with an anonymous memory page of
equal size, which was retrieved from sigma0.

When a pagefault occurs in the managed task, the pager first looks into the
current mapping list and sends a corresponding MapItem if the address is
already mapped.
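
The lookup in the sorted mapping list can be sketched as follows. This is an
illustration only, not the pager's actual code: struct mapping, its field
names and find_mapping are hypothetical and merely follow the description
above (a sorted linked list of ranges, each backed by a page of equal size).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping entry: one contiguous range of the managed address
 * space, backed by an anonymous memory page of equal size. */
struct mapping {
    uintptr_t vaddr;        /* start of the range in the managed task    */
    uintptr_t size;         /* size of the range in bytes                */
    uintptr_t backing;      /* address of the backing page in the pager  */
    struct mapping *next;   /* next entry, sorted by ascending vaddr     */
};

/* Walk the sorted list and return the entry covering fault_addr, or NULL
 * if the address is not mapped yet. */
const struct mapping *find_mapping(const struct mapping *head,
                                   uintptr_t fault_addr)
{
    for (const struct mapping *m = head; m != NULL; m = m->next) {
        if (fault_addr < m->vaddr)
            return NULL;    /* list is sorted: no later entry can match */
        if (fault_addr - m->vaddr < m->size)
            return m;       /* fault_addr lies inside this range */
    }
    return NULL;
}
```

A non-NULL result corresponds to the "already mapped" case, in which the
pager would answer with a MapItem built from the entry. Because the list is
sorted, the walk can stop at the first entry that starts above the fault
address.
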
If the address is not yet mapped, the pager checks in which address range
the faulting address lies. The address space layout can be found in
src/pager/vmemlayout.txt. The fault address determines whether a new
zero-filled page of anonymous memory is allocated and returned, or whether
the pager panics and thus simulates a segmentation fault. Currently a
segfault stops the whole system; this could easily be changed to kill only
the faulting task.

The pager's interface contains a brk() call to change the upper limit of the
heap. This only changes which fault addresses lead to a segfault and which
lead to a fresh memory allocation.

The pager sigma1 also functions as task server, because it already contains
a full task list of managed address spaces. Therefore the ELF loader only
needs to call CreateTask on the sigma1 pager to create a new address space
and initial thread. When a task wishes to terminate (by calling exit), it
must call KillMe() on the sigma1 pager, which stops the thread's execution
and cleans up the address space. Because sigma1 does not run in the
roottask, it must use the syscall server for the privileged calls to
ThreadControl and SpaceControl.

The KillMe() function takes a return code as an argument. This return code
is saved in the TaskEntry and the task's status changes to zombie. A
different task can then call WaitFor() with corresponding parameters to get
this return code and clean up the zombie task. Linux coders will recognize
the semantics of wait() and waitpid().

Many internal structures of the sigma1 pager are visible in the name space
(file system) in the /task directory.

4.2 Elf Loading

The ELF loader is used to create a new task from an ELF image. The loading
function sdi_elfexec is implemented in libsdi, so any thread can create a
new task. Contrary to exec()'s semantics, sdi_elfexec does not replace the
currently running task. An ELF image can be started from memory or from the
file system.

First the sigma1 pager's function CreateTask is called to create a new empty
address space.
Then code and data from the ELF image are loaded into the new address space
by creating "shared" mappings. The creating thread calls sigma1's
GetSharedPage function to get an area of memory which will be placed into
the new task's address space at a given position. The call itself returns a
MapItem (mapped to 0x90000000), which is then filled with the code/data from
the ELF image. FreeSharedPage releases the mapping from the creator's
address space. Using multiple calls and copies, the whole ELF program image
is transferred into the new address space. The elfexec function completely
ignores rwx protection flags, and the pager does not implement them.

After the binary code is in place, the elfexec function constructs the stack
of the new task. On top of the stack it places the environment and command
line parameters in the same way as it is done on Linux. The environment
variables and command line parameters are copied from sdi_elfexec's
parameter list. In the end the three calling parameters of the new thread's
main() function are "pushed" onto the created stack. The stack is
transferred to the new address space in the same fashion as the binary code,
using GetSharedPage.

Finally the new task is kick-started by the pager using the StartTask
function.

4.3 Heap Management

4.4 Naming System

4.5 Blockdevices

Blockdevices are storage devices which are divided into blocks. A block is a
fixed-size array of bytes; each device can have a different but constant
blocksize. The only operations possible are reading and writing a block
(identified by its block number). Blockdevices are the base on which
filesystems are implemented. Currently there is only an implementation of a
ramdisk in the roottask, which allows reading from and writing to grub
modules.

Blockdevices implement the IF_BLOCKDEV interface.

4.6 Filesystems

4.7 Console

4.8 PCI driver, vmware graphics card framebuffer driver

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

5. Userspace Libraries

There is a set of libraries used to support user applications and servers:

5.1 libio

Came with SDI and allows reading from and writing to the serial port (where
you typically connect a terminal or a terminal emulator application like
minicom).

5.2 libsdi

This library contains convenience functions for working with SDI specific
features. It contains functions for loading and starting ELF files, stopping
the kernel in exceptional situations (panic), using the logging server and
resolving namespace paths, as well as inline assembly functions for
accessing ports.

5.3 libc

Implements a POSIX subset to allow easy creation and porting of user
applications. It features:
* Heap management
* Assert
* Functions for working with environment variables
* Nearly complete IO support (fopen, fprintf & friends)
* A few math functions
* opendir/readdir and limited stat support

Some parts of the libc code (like printf, scanf, most math functions and the
random number generator) were copied from dietlibc.

5.4 libstdc++

A very incomplete and not really conformant implementation of the standard
C++ library, featuring a string class, a vector class and an insert-only map
based on a red-black tree. Basically it implements all features used by
supertux :)

5.5 png

Straightforward port of the libpng library

5.6 zlib

Straightforward port of the zlib library

5.7 jpeg

Straightforward port of the jpeg library

5.8 SDL

A port of the SDL library (TODO: write more)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

6. Contact

... TODO ...