15 - Architect Korner.v File Management
Architect Korner
File Management
Software architects deal constantly with reading and writing data to databases,
files, devices and memory locations. Moving data in and out of files efficiently
and in a synchronised manner is crucial in this field. Files are important not only
from an operating system perspective, but also for application programming. This
article introduces how architects use files to build their applications, middleware
and entire software products on the Linux platform.
The word ‘file’ originates from the Latin filum, which means a thread to hold
loose papers, especially arranged for reference. But in computing science, the
word ‘file’ is used as a collection of data stored with one name, which is the
file name. Linux treats everything as a file. As an example, data, devices,
inter-process communication mechanisms, and links are all dealt with as files.
There are different types of files, namely regular or ordinary files, directory
files, device files, link files, FIFO and socket files. Device files are of two
types: character and block device files. There are two types of link files: soft
and hard [1] link files. In Linux, these files are arranged in an inverted tree
data structure, for easy maintenance.
How are these files stored on a storage medium like a hard disk, USB storage or
The major number is used to identify the device type, and the minor number is
used to identify the instance of that device type. In this case, the inode table
points to a switch table. There are two switch tables in Linux: the character and
block device switch tables. Using the major number, the switch table routes to
the corresponding device driver, and the device driver accesses the device. The
complete flow of a user process accessing a device is shown in Figure 6.
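To make the major/minor distinction concrete, here is a minimal C sketch that reads both numbers from a device-special file. The function name device_numbers is hypothetical, and the sketch assumes a Linux system with glibc, where the major() and minor() macros come from <sys/sysmacros.h>.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* major(), minor() on glibc (assumption) */
#include <sys/types.h>

/*
 * Report the device type and numbers behind a device-special file.
 * Returns 'c' for a character device, 'b' for a block device,
 * 0 if the path is not a device file, and -1 if stat() fails.
 * maj and min are only written for device files.
 */
int device_numbers(const char *path, unsigned int *maj, unsigned int *min)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;
    if (!S_ISCHR(st.st_mode) && !S_ISBLK(st.st_mode))
        return 0;

    /* st_rdev holds the device ID that a special file refers to */
    *maj = major(st.st_rdev);   /* identifies the device type / driver */
    *min = minor(st.st_rdev);   /* identifies the instance of that type */
    return S_ISCHR(st.st_mode) ? 'c' : 'b';
}
```

Called on /dev/null on a typical Linux system, this reports a character device with major number 1 and minor number 3.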
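Figure 5: Interaction between processes and VFS objects (normal files) [diagram: Per Process Table, Global File Table, Global Inode Table, and data blocks on the hard disk]
Linux system calls and strace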
Linux has approximately 338 system calls. The system call table is stored in
unistd.h (/usr/src/linux-2.6.34/arch/x86/include/asm/unistd_32.h), and each
system call is identified by a unique system-call number. The file- and
file-system-related system calls are given in Table 2, with their call numbers
and a brief description of each call. Table 2 is carried on the LFY CD bundled
with this issue.

Figure 6: A user process accesses a device-special file [diagram: a user program calling open("device", O_RDWR) reaches the device through the file descriptor table, file table, inode table, switch table, device driver and device controller]

Let's say you want to trace a system call. You want to find out which system
calls an executable program uses, how many times each system call is invoked,
and how much time it takes for each invocation. We can get all this information
from the strace command-line utility. For example, assume I want to create a
named pipe, i.e., a FIFO file, to communicate between two processes. To do this,
I'd run the mkfifo command. We can execute mkfifo via strace to obtain system
call coverage information, as shown in Figures 7 and 8.
Figure 8 shows the execution of strace -c -F mkfifo, and lists the information
in a different format. Using this, we can see which system calls were used by
mkfifo, how many times they were used, and how much time each call took. This
reveals that mkfifo calls mknod to create a file; indeed, mkfifo is just a
wrapper around the mknod system call.

Accessing normal/ordinary files
Linux uses system calls to access a file. Let's begin with the open system call,
which is used to open a file, and which returns a file descriptor for access to
the file. This file descriptor is an integer, and its value normally starts at 3,
since file descriptors 0, 1 and 2 are assigned for the standard input, standard
output and standard error streams.
Figure 10: Non-shared lock (mandatory file locking) [diagram: processes issuing read( ) and write( ) calls on the records of one file, with Lock, Unlock and STOP markers]
Figure 11: Shared lock (mandatory file locking) [diagram: one process issuing write( ) and three processes issuing read( ) on Records 1 to 4 of a file, with Lock, Unlock and STOP markers]
Figure 12: Shared lock (advisory or record locking) [diagram: processes issuing read( ) and write( ) on Records 1 to 5, each record bracketed by its own "Lock record n" and "Unlock record n" steps]

fcntl implementation:
    struct flock lock;
    lock.l_type = F_WRLCK;
    lock.l_whence = SEEK_SET;
    lock.l_start = nth record;
    lock.l_len = sizeof (record);
    lock.l_pid = getpid( );
    fcntl ( fd, F_SETLKW, &lock );
    ... critical section ...
    lock.l_type = F_UNLCK;

flock implementation:
    flock ( fd, LOCK_EX );
    ... critical section ...
    flock ( fd, LOCK_UN );

    flock ( fd, LOCK_SH );
    ... critical section ...
    flock ( fd, LOCK_UN );

Figure 13: Sketch of file-locking implementation
Let's look at the members of the struct:
1. l_type specifies the lock type, and is one of the following: read lock
   (F_RDLCK), write lock (F_WRLCK) or unlock a locked file (F_UNLCK). A read
   lock will allow many readers to access the file at the same time, but no
   writer is allowed. A write lock will allow only one writer at a time, and
   excludes readers.
2. l_whence specifies the reference point from which l_start is measured. If,
   for example, you want to measure from the starting location of a file, then
   pass SEEK_SET, or if the file is already open and you subsequently want to
   set the lock from the current cursor position, then pass SEEK_CUR. If you
   want to measure from the end of the file, then pass SEEK_END.
3. l_start is the starting offset of the region to lock, relative to l_whence;
   it could be, say, the 10th byte.
4. l_len specifies how many bytes to lock from that starting point, and so
   determines where the locked region ends; it could extend, say, to the 100th
   byte.
5. l_pid is the process ID of the process that holds the lock (it is filled in
   by F_GETLK).

If the process that is holding the lock exits abnormally, then the kernel will
release the lock, and the next deserving process can acquire the lock.

If we want to lock the entire file, we set the l_start and l_len fields to 0
(with l_whence set to SEEK_SET). To improve locking granularity, however, each
process can lock a specific record, or range of bytes in a file, by setting
appropriate values in the l_start and l_len fields. Figure 13 explains how to
approach this design of locking a specific record or byte, by the suitable
setting of the l_start and l_len fields.

The locking and unlocking is done by calling the fcntl system call. Its syntax is:

int fcntl (int fd, int cmd, struct flock *lock);

We have three commands: F_SETLKW, F_SETLK and F_GETLK to acquire, release or
check the availability of the lock. The first two commands are used to acquire
or release the lock, but the difference between them is that F_SETLKW will wait
till the lock is released, if it's held by another process, but F_SETLK will
return with an error if the lock is not available. F_GETLK will just check if
the lock is available.

File-locking implementations: flock
The usage of flock is simpler than fcntl. On the flip side, flock lacks many
features that a developer may need. Use flock just to lock or unlock, by passing
a suitable operation flag. The flock command manages locks within shell scripts,
or at the command line. The flock system call is used to apply or remove an
advisory lock on an open file. Information on the flock system call is provided
below.
Syntax:

int flock (int fd, int operation);

Available operations:
• LOCK_SH (shared lock; more than one process can hold)
• LOCK_EX (exclusive lock; only one process at a time)
• LOCK_NB (don’t block when locking)
• LOCK_UN (remove an existing lock)

Examples of file-locking implementations of both fcntl and flock types are shown
in Figure 13. To sum up, let's look at the procedure for lock usage:
a. Declare a lock
b. Initialise the lock
c. Lock the critical section
d. Unlock the critical section

In this article, we have looked at how a file is accessed on the Linux platform,
and its complete sequence. We showed different file-related system calls, and
discussed how they work. We explained different file-synchronisation mechanisms
available on Linux, and the procedures to access different kinds of files. This
information is valuable to software architects who create entire software
products on the Linux platform.

Acknowledgement
The authors would like to acknowledge Krishna Sudhakaran for his valuable
assistance in the preparation of the view graphs.
References
1. Kernel Corner: Starting with Linux Device Drivers, by Dr B. Thangaraju, LINUX For You, May 2003, p 83
2. Linux supported file systems: http://www.linux-tutorial.info/modules.php?name=MContent&pageid=243
3. List of file systems: http://en.wikipedia.org/wiki/List_of_file_systems
4. A comparison of file systems: http://en.wikipedia.org/wiki/Comparison_of_file_systems
5. Ext4 fs features: http://kernelnewbies.org/Ext4
6. Understanding the Linux Kernel, (3rd Edition) by Daniel P. Bovet, Marco Cesati, O'Reilly Publications, 2005
7. extundelete details: http://extundelete.sourceforge.net/
8. gparted documentation: http://gparted.sourceforge.net/documentation.php
9. Kernel Corner: Examining Process Information, by Dr B. Thangaraju, LINUX For You, January 2004, pgs 84-87
10. Kernel Corner: Interfacing the proc file system with a kernel module, by Dr B. Thangaraju, LINUX For You, February 2004, pgs 90-92
11. Kernel Corner: Linux Kernel Locking Mechanisms for Kernel Programming, by Dr B. Thangaraju, LINUX For You, September 2003, pgs 81-83
12. Basics of System V Semaphore, by Dr B. Thangaraju, LINUX For You, July 2006, pgs 86-89
13. The Intricacies of System V Semaphore, by V. Shobana and Dr B. Thangaraju, LINUX For You, April 2009, pgs 40-43