Fast and reliable:
Hard disk heads do not move much when reading or writing data.
Copies of important data structures are stored in multiple places on disk.
How does UNIX traverse directories and gain access to a file's data?
Consider the UNIX file path
/etc/passwd
It is interpreted by the system as follows:
/etc/passwd -> root directory becomes current inode
/etc/passwd -> directory entry "etc" is retrieved
"etc" entry found -> the "etc" inode becomes current
/etc/passwd -> directory entry "passwd" is retrieved
"passwd" entry found -> the "passwd" inode becomes current
This last inode indicates that "passwd" is a file.
file path -> unique inode -> file data
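The lookup steps above can be sketched with a toy in-memory inode table. This is only an illustration: the inode numbers, the `INODES` dictionary, and the `resolve` function are hypothetical and are not how a real kernel stores inodes.

```python
# Toy model: inode number -> inode. Directory inodes map entry names to
# inode numbers. All numbers and contents are made up for illustration.
ROOT_INODE = 2

INODES = {
    2: {"type": "dir", "entries": {".": 2, "..": 2, "etc": 11}},
    11: {"type": "dir", "entries": {".": 11, "..": 2, "passwd": 42}},
    42: {"type": "file", "data": b"root:x:0:0:root:/root:/bin/sh\n"},
}


def resolve(path):
    """Walk an absolute path one component at a time; return the inode number."""
    current = ROOT_INODE  # the root directory becomes the current inode
    for name in path.split("/"):
        if not name:  # skip empty strings produced by leading or repeated "/"
            continue
        inode = INODES[current]
        if inode["type"] != "dir":
            raise NotADirectoryError(name)
        if name not in inode["entries"]:
            raise FileNotFoundError(path)
        current = inode["entries"][name]  # the entry's inode becomes current
    return current


ino = resolve("/etc/passwd")
print(ino, INODES[ino]["type"])
```

Note that the file inode itself never stores its name. The name exists only in the parent directory's entries.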
Files are specified by a character string known as a path name, or simply a file path.
Each path name uniquely specifies a file or directory.
The operating system converts the path name to a unique inode.
The inode describes file data storage and its metadata.
Inode size is 128 bytes.
Inode contains the following metadata fields:
inode type: file, directory, or pipe (zero means that inode is free.)
locations of data blocks on disk
file size
ownership: user/group
access permission attributes: read/write/execute/directory search
file last access/modified timestamps
inode last modified timestamp (always changes when file is modified.)
NOTE: inode contains no information about file path!
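Most of these metadata fields can be inspected from user space. The sketch below assumes a UNIX-like system and uses Python's `os.stat`. The temporary file exists only to make the example self-contained.

```python
import os
import stat
import tempfile

# Create a small scratch file so the example does not depend on any real path.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

info = os.stat(path)
print("inode number:", info.st_ino)
print("is regular file:", stat.S_ISREG(info.st_mode))
print("permissions:", stat.filemode(info.st_mode))
print("size in bytes:", info.st_size)
print("owner uid/gid:", info.st_uid, info.st_gid)
print("last access:", info.st_atime)
print("last data modification:", info.st_mtime)
print("last inode change (on UNIX-like systems):", info.st_ctime)

os.remove(path)
```

None of the returned fields holds the file path, which matches the note above.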
Inode can be locked in support of exclusive right of a process to access a file.
Inode has a status indicating that pending modifications in memory require physical data save to disk.
Inode contains a list of data block numbers allocated on disk.
A data block is similar to a disk cluster on Windows.
File data blocks can be fragmented on disk.
Inode contains Table of Contents of Disk Blocks.
Each inode reserves fixed space for only 13 pointers to disk blocks.
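A fixed number of pointers still allows large files when some pointers are indirect. The sketch below assumes the classic UNIX split of 10 direct pointers plus one single, one double, and one triple indirect pointer, with 1,024-byte blocks and 4-byte block numbers. The split and the sizes are assumptions for illustration and are not stated in the text above.

```python
def max_file_size(block_size, pointer_size=4, direct=10):
    """Largest file addressable by `direct` direct pointers plus
    one single, one double, and one triple indirect pointer."""
    per_block = block_size // pointer_size  # block numbers that fit in one block
    blocks = direct + per_block + per_block**2 + per_block**3
    return blocks * block_size


size = max_file_size(1024)
print(size, "bytes, about", round(size / 2**30, 2), "GiB")
```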
A directory is stored as an ordinary file(*).
The directory is made of fixed-size directory entry records.
Each entry contains the following fields:
inode number (zero indicates a deleted object.)
entry name, such as "etc" or "passwd"
_______________________
(*) However, only the operating system can update the directory, thus ensuring its correct structure.
Each directory contains entries named "." and ".."
These two indicate inode numbers of
the current directory "."
the parent directory ".."
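Reading a directory made of fixed-size records could look like the sketch below. The 16-byte layout (a 2-byte inode number followed by a 14-byte name padded with NUL bytes) is an assumption borrowed from classic UNIX. `pack_entry` and `read_directory` are hypothetical helper names.

```python
import struct

# Assumed record layout: 2-byte inode number, 14-byte NUL-padded name.
ENTRY = struct.Struct("<H14s")


def pack_entry(ino, name):
    return ENTRY.pack(ino, name.encode("ascii"))  # struct pads the name with NULs


def read_directory(raw):
    """Return {name: inode number}, skipping deleted entries."""
    entries = {}
    for offset in range(0, len(raw), ENTRY.size):
        ino, name = ENTRY.unpack_from(raw, offset)
        if ino == 0:  # zero marks a deleted entry
            continue
        entries[name.rstrip(b"\0").decode("ascii")] = ino
    return entries


raw = b"".join([
    pack_entry(11, "."),        # the directory itself
    pack_entry(2, ".."),        # its parent
    pack_entry(0, "oldfile"),   # deleted entry, ignored when reading
    pack_entry(42, "passwd"),
])
listing = read_directory(raw)
print(listing)
```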
Linux uses Ext2 and Ext3 file systems:
Based on the UNIX File System (UFS):
Ext2/3 starts with a reserved area.
The remainder of the file system is divided into block groups.
All block groups are the same size, except the last one, whose size may be different.
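The division into block groups is simple arithmetic. The following sketch uses made-up numbers, and `block_group_sizes` is a hypothetical helper, not part of any ext2 tool.

```python
def block_group_sizes(total_blocks, reserved_blocks, blocks_per_group):
    """Split the blocks after the reserved area into block groups."""
    remaining = total_blocks - reserved_blocks
    full, last = divmod(remaining, blocks_per_group)
    sizes = [blocks_per_group] * full
    if last:  # the final group holds whatever is left over
        sizes.append(last)
    return sizes


sizes = block_group_sizes(total_blocks=20000, reserved_blocks=1, blocks_per_group=8192)
print(sizes)
```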
File and directory content is stored in data blocks.
Data block is a group of consecutive sectors.
File and directory metadata is stored in inodes.
Each block group has one inode table.
Inode table is a sequence of the inodes in this group.
Each inode has fixed size.
File name is stored in a directory entry structure:
Directory entries are located in the blocks allocated to the file's parent directory.
Directory entry contains
the name of the file
pointer to the file's inode entry, that is, the inode number in the inode table.
ExtX file system-wide information is found in:
the superblock of block group 0(*)
set of group descriptors for each block group in the partition.
_____________________
(*) backup copies of superblock and group descriptors set can be found at the beginning of other block groups of the ExtX partition.
If an ExtX file system contains a bootable Linux kernel, its partition can have boot code
(although on many systems it does not.)
Boot code knows exactly which blocks are allocated to the Linux kernel.
Boot code loads kernel into memory during boot.
Non-bootable ExtX file system will have no boot code.
If it exists, the boot code is in the 1,024 bytes before the superblock -- that is, in the first two sectors of the partition.
The boot code is executed after control is passed to it from the Master Boot Record (MBR) program in disk sector zero.
Many Linux systems do not have boot code in the file system with the kernel.
Instead, MBR's boot loader program knows in which disk blocks the kernel is located.
Therefore, no additional boot code in the file system partition is needed.
Located at the beginning of the file system:
1,024 bytes from the start of the file system
1,024 bytes in size (most of the bytes are not used.)
The superblock contains:
file system size
file system configuration
The superblock is similar to the boot sector (aka volume boot record) in NTFS and FAT file systems on Windows.
(However, the Linux/UNIX superblock does not have any boot code.)
Super block layout contains the following fields:
block size
file system size (the total number of blocks)
NOTE: there could be hidden data following the file system, which is called volume slack.
the number of blocks per block group
the number of reserved blocks before the first block group
number of free blocks
list of free blocks
the size of inode list (total number of inodes)
the number of inodes per block group
list of free inodes
Also, the superblock contains
volume label - can be used to identify the file system, e.g. rootfs
the last write time
the last mount time
the path where the file system was last mounted, e.g. /dev/hda5
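A few of these fields can be decoded with Python's `struct` module. The sketch below assumes the little-endian ext2 on-disk layout for the first 60 bytes of the superblock and the ext2 magic number 0xEF53. The field names are descriptive labels chosen here, and the image is synthetic rather than read from a real disk.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # 1,024 bytes from the start of the file system
SUPERBLOCK_SIZE = 1024

# Assumed ext2 layout of the first 60 bytes: thirteen 32-bit fields,
# then four 16-bit fields, all little-endian.
HEADER = struct.Struct("<13I4H")
FIELD_NAMES = (
    "inodes_count", "blocks_count", "reserved_for_root_count",
    "free_blocks_count", "free_inodes_count", "first_data_block",
    "log_block_size", "log_frag_size", "blocks_per_group",
    "frags_per_group", "inodes_per_group", "last_mount_time",
    "last_write_time", "mount_count", "max_mount_count", "magic", "state",
)
EXT2_MAGIC = 0xEF53


def parse_superblock(image):
    raw = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + SUPERBLOCK_SIZE]
    sb = dict(zip(FIELD_NAMES, HEADER.unpack_from(raw)))
    if sb["magic"] != EXT2_MAGIC:
        raise ValueError("not an ext2/ext3 superblock")
    sb["block_size"] = 1024 << sb["log_block_size"]  # size is stored as a shift
    return sb


# Synthetic image: 1,024 zero bytes (boot area) followed by a fake superblock.
values = (1024, 8192, 0, 8000, 1000, 1, 0, 0, 8192, 8192, 1024,
          0, 0, 0, 20, EXT2_MAGIC, 1)
image = bytes(SUPERBLOCK_OFFSET) + HEADER.pack(*values).ljust(SUPERBLOCK_SIZE, b"\0")

sb = parse_superblock(image)
print(sb["block_size"], sb["blocks_count"], sb["inodes_per_group"])
```

To try this on a real file system, `image` could be the first 2,048 bytes read from a partition or disk image file opened in binary mode.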
The operating system periodically saves the superblock to disk to assure its integrity.
The first block group is located in the block following the reserved area:
Group descriptors are located in the group descriptor table.
The group descriptor table is located in the block after the superblock.
Each group descriptor contains info about the corresponding block group.
NOTE: backup copies of the superblock and group descriptor tables exist throughout the file system in case the primary copies are damaged.
Super Block - 1 Block
Group Descriptor Table - n Blocks
Block Bitmap - 1 Block
Inode Bitmap - 1 Block
Inode Table - n Blocks
File Data Blocks - n Blocks
Group descriptor table is located in the block following the superblock.
Contains group descriptor entry for every block group in the file system.
Entry layout (Bytes/Description):
0–3 Starting block address of block bitmap
4–7 Starting block address of inode bitmap
8–11 Starting block address of inode table
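The three fields listed above can be read from each entry as follows. The 32-byte entry size is an assumption based on ext2, since only the first 12 bytes are described here. The table bytes are synthetic.

```python
import struct

DESCRIPTOR_SIZE = 32               # assumed size of one group descriptor entry
FIRST_FIELDS = struct.Struct("<3I")  # bytes 0-11: three 32-bit block addresses


def parse_group_descriptors(table, group_count):
    groups = []
    for g in range(group_count):
        block_bitmap, inode_bitmap, inode_table = FIRST_FIELDS.unpack_from(
            table, g * DESCRIPTOR_SIZE
        )
        groups.append({
            "block_bitmap": block_bitmap,
            "inode_bitmap": inode_bitmap,
            "inode_table": inode_table,
        })
    return groups


# Synthetic table describing two block groups.
table = b"".join(
    FIRST_FIELDS.pack(*addresses).ljust(DESCRIPTOR_SIZE, b"\0")
    for addresses in [(3, 4, 5), (8195, 8196, 8197)]
)
groups = parse_group_descriptors(table, 2)
print(groups[1])
```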
The block bitmap manages allocation status of the blocks in the group.
Starting block address specified in the group descriptor entry.
Block bitmap size in bytes can be calculated by dividing the number of blocks in the group by eight.
The block bitmap requires exactly one block of storage on disk.
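The bitmap size rule and a lookup of a single block's status can be sketched like this. The bit order (least significant bit of the first byte maps to the first block of the group) is an assumption, and the bitmap contents are made up.

```python
def bitmap_size_bytes(blocks_per_group):
    """One bit per block, so divide by eight (rounding up)."""
    return (blocks_per_group + 7) // 8


def is_allocated(bitmap, index):
    # Assumed bit order: bit 0 of byte 0 describes the first block in the group.
    return bool(bitmap[index // 8] & (1 << (index % 8)))


# With 8,192 blocks per group the bitmap needs 1,024 bytes,
# which fills exactly one 1,024-byte block.
size = bitmap_size_bytes(8192)
bitmap = bytes([0b00000111]) + bytes(size - 1)  # first three blocks in use
print(size, is_allocated(bitmap, 2), is_allocated(bitmap, 3))
```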
The inode bitmap manages allocation status of the inodes in the current block group.
Starting address of the inode bitmap is given in the group descriptor entry.
Inode bitmap size in bytes can be calculated by dividing the number of inodes per group by eight.
Inode table size can be calculated by multiplying the number of inodes per group by the size of each inode, which is 128 bytes.
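The size arithmetic for the inode table, and the step from an inode number to its place in that table, can be sketched as below. It assumes that inode numbers start at 1, as in ext2. The helper names and sample values are hypothetical.

```python
INODE_SIZE = 128  # bytes per inode


def inode_table_blocks(inodes_per_group, block_size):
    """Blocks needed to hold one group's inode table (rounded up)."""
    table_bytes = inodes_per_group * INODE_SIZE
    return (table_bytes + block_size - 1) // block_size


def locate_inode(ino, inodes_per_group):
    """Return (block group, index in that group's table, byte offset in the table)."""
    group, index = divmod(ino - 1, inodes_per_group)  # assumed: numbering starts at 1
    return group, index, index * INODE_SIZE


print(inode_table_blocks(1024, 1024))
print(locate_inode(2, 1024))
print(locate_inode(1025, 1024))
```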
Linux:
includes basic set of common concepts developed earlier for UNIX;
is based on single hierarchical tree structure;
has file system represented as one single entity.
Besides ext2/3, Linux supports many different file systems:
Minix, ISO 9660, UMSDOS, NFS, SMB, HPFS.
Minix was the first Linux file system.
Minix limitations:
64 MB partition size;
short file names.
ext was released in April 1992.
Removes major Minix limitations.
An elaborate extension of the Minix file system.
Maximum partition size is 2 GB.
Maximum file name size is 255 characters.
ext has no support for separate time stamps for:
file access
inode modification
data modification.
ext keeps an unsorted list of free blocks and inodes.
As a result, an ext file system becomes heavily fragmented.
ext was soon replaced by ext2.
ext2 was introduced in January 1993.
Extends features of ext as follows:
uses improved algorithms
greatly enhances speed
maintains additional time stamps.
Special field in the superblock keeps track of the file system status:
clean or dirty.
A file system marked dirty is automatically checked for errors.
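A check of this status field might look like the sketch below. The offset (byte 58 of the superblock) and the flag values are assumptions based on the ext2 on-disk layout, and `needs_check` is a hypothetical helper.

```python
import struct

STATE_OFFSET = 58      # assumed offset of the 2-byte state field in the superblock
STATE_CLEAN = 0x0001   # assumed flag: file system was unmounted cleanly
STATE_ERRORS = 0x0002  # assumed flag: errors were detected


def needs_check(superblock):
    (state,) = struct.unpack_from("<H", superblock, STATE_OFFSET)
    return not (state & STATE_CLEAN) or bool(state & STATE_ERRORS)


def fake_superblock(state):
    """Build a 1,024-byte superblock that is empty except for the state field."""
    raw = bytearray(1024)
    struct.pack_into("<H", raw, STATE_OFFSET, state)
    return bytes(raw)


print(needs_check(fake_superblock(STATE_CLEAN)))  # clean: no check needed
print(needs_check(fake_superblock(0)))            # clean flag missing: check
```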
ext2 maximum file system size is 4 TB
(1 terabyte is 1,024 gigabytes.)
ext2 is portable to other operating systems.
However, its major shortcoming is
the risk of file system corruption if a write is interrupted, because ext2 has no journaling.
ext3 -- the Third Extended Linux File System
It is a journaling version of ext2.
More often used with newer Linux systems.
Without the journal, an ext3 file system is a valid ext2 file system.
ext3 can be mounted and used as ext2.
All ext2 utilities work the same on ext3.