Date: Sat, 28 Aug 1999 12:52:42 +0930
From: Greg Lehey <grog@lemis.com>
To: FreeBSD Committers <cvs-committers@FreeBSD.org>, FreeBSD Hackers <hackers@FreeBSD.org>
Subject: locking revisited
Message-ID: <19990828125241.G13904@freebie.lemis.com>

After all the stuff that has been said on the last locking thread, I
think it's better to restate the case than follow up.  It's obvious
from the messages in the last thread that a number of otherwise clever
people have little understanding or knowledge of the concepts of file
locking.  I'm appending a hastily worked-over version of the section
about locking from "Porting UNIX Software".

Here's a summary of what I've been trying to say:

All systems which do more than one thing at a time need file locking
at some time or another.  Since it involves cooperation between
potentially unrelated processes, it's an obvious kernel function.

Any "solution" requiring cooperation between processes isn't really a
solution.  As a result, I don't consider advisory locking to be real
locking: it's just a kludge.

FreeBSD is one of the few operating systems which doesn't have
kernel-level locking.  If we want to emulate other systems correctly,
we *must* have mandatory locking.  This includes SCO UNIX, System V.4
and Linux.  I suspect it also includes Microsoft.

All this doesn't leave too much room for arguments about whether
locking works or not: it works on all platforms except FreeBSD, and
that's only because FreeBSD doesn't implement locking.

As a result, I argue that we should implement locking.  The question
is: how?  I'd suggest three methods which can be individually enabled
via sysctls:

- System V style.  We need this for compatibility with System V.  The
  choice of mandatory or advisory locking depends on the file
  permissions.

- Only mandatory locking.  fcntl works as before, but locks are always
  mandatory, not advisory.  I'm sure that this won't be popular, at
  least initially, but if you don't like it, you don't have to use it.

- Via separate calls to fcntl.
  fcntl currently has the following command values:

  #define F_DUPFD         0       /* duplicate file descriptor */
  #define F_GETFD         1       /* get file descriptor flags */
  #define F_SETFD         2       /* set file descriptor flags */
  #define F_GETFL         3       /* get file status flags */
  #define F_SETFL         4       /* set file status flags */
  #define F_GETOWN        5       /* get SIGIO/SIGURG proc/pgrp */
  #define F_SETOWN        6       /* set SIGIO/SIGURG proc/pgrp */
  #define F_GETLK         7       /* get record locking information */
  #define F_SETLK         8       /* set record locking information */
  #define F_SETLKW        9       /* F_SETLK; wait if blocked */

  We could add an F_SETMANDLOCK or some such.

Any thoughts?

Greg
--
See complete headers for address, home page and phone numbers
finger grog@lemis.com for PGP public key

Attachment: locking.txt

File locking
____________

The Seventh Edition did not originally allow programs to coordinate
concurrent access to a file.  If two users both had a file open for
modification at the same time, it was almost impossible to prevent
disaster.  This is an obvious disadvantage, and all modern versions of
UNIX supply some form of file locking.

Before we look at the functions that are available, it's a good idea to
consider the various kinds of lock.  There seem to be two of
everything.  First, the granularity is of interest:

o File locking applies to the whole file.

o Range locking applies only to a range of byte offsets.  This is
  sometimes misleadingly called record locking.

With file locking, no other process can access the file when a lock is
applied.  With range locking, multiple locks can coexist as long as
their ranges don't overlap.  Secondly, there are two types of lock:

o Advisory locks do not actually prevent access to the file.  They work
  only if every participating process ensures that it locks the file
  before accessing it.
  If the file is already locked, the process blocks until it gains the
  lock.

o Mandatory locks prevent (block) read and write access to the file,
  but do not stop it from being removed or renamed.  Many editors do
  just this, so even mandatory locking has its limitations.

Finally, there are also two ways in which locks cooperate with each
other:

o Exclusive locks allow no other locks that overlap the range.  This is
  the only way to perform file locking, and it implies that only a
  single process can access the file at a time.  These locks are also
  called write locks.

o Shared locks allow other shared locks to coexist with them.  Their
  main purpose is to prevent an exclusive lock from being applied.  In
  combination with mandatory range locking, a write is not permitted to
  a range covered by a shared lock.  These locks are also called read
  locks.

There are five different kinds of file or record locking in common use:

o Lock files, also called dot locking, is a primitive workaround used
  by communication programs such as uucp and getty.  It is independent
  of the system platform, but since it is frequently used we'll look at
  it briefly.  It implements advisory file locking.

o After the initial release of the Seventh Edition, a file locking
  package using the system call locking was introduced.  It is still in
  use today on XENIX systems.  It implements mandatory range locking.

o BSD systems have the system call flock.  It implements advisory file
  locking.

o System V, POSIX.1, and more recent versions of BSD support range
  locking via the fcntl system call.  BSD and POSIX.1 systems provide
  only advisory locking.  System V supplies a choice of advisory or
  mandatory locking, depending on the file permissions.  If you need to
  rewrite locking code, this is the method you should use.

o System V also supplies range locking via the lockf library call.
  Again, it supplies a choice of advisory or mandatory locking,
  depending on the file permissions.

The decision between advisory and mandatory locking in System V depends
on the file permissions and not on the call to fcntl or lockf.  The
setgid bit is used for this purpose.  Normally, in executables, the
setgid bit specifies that the executable should assume the effective
group ID of its owner group when execed.  On files that do not have
group execute permission, it specifies mandatory locking if it is set,
and advisory locking if it is not set.  For example,

o A file with permissions 0764 (rwxrw-r--) will be locked with advisory
  locking, since its permissions include neither group execute nor
  setgid.

o A file with permissions 0774 (rwxrwxr--) will be locked with advisory
  locking, since its permissions don't include setgid.

o A file with permissions 02774 (rwxrwsr--) will be locked with
  advisory locking, since its permissions include both group execute
  and setgid.

o A file with permissions 02764 will be locked with mandatory locking,
  since it has the setgid bit set, but group execute is not set.  If
  you list the permissions of this file with ls -l, you get rwxrwlr--
  on a System V system, but many versions of ls, including BSD and GNU
  versions, will list rwxrwSr--.

Lock files
__________

Lock files are the traditional method that uucp uses for locking serial
lines.  Serial lines are typically used either for dialing out, for
example with uucp, or dialing in, which is handled by a program of the
getty family.  Some kind of synchronization is needed to ensure that
both of these programs don't try to access the line at the same time.
The other forms of locking we describe only apply to disk files, so we
can't use them.  Instead, uucp and getty create lock files.  A typical
lock file will have a name like /var/spool/uucp/LCK..ttyb, and for some
reason these double periods in the name have led to the term dot
locking.
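
To make the lock file convention concrete, here is a minimal sketch in
C of how a program might claim such a file.  It is not taken from uucp
or getty: the function name claim_lock_file is hypothetical, the PID is
written as 10 ASCII digits plus a newline (only one of the formats in
use), and the file is created with O_CREAT | O_EXCL so that checking
and creating happen in one step, which is a deliberate departure from
the traditional check-then-create sequence.  The takeover of a stale
lock is still open to a race.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any uucp source): try to claim the
 * lock file `path`.  Returns 0 if we now own the lock, -1 if another
 * live process owns it or an error occurred.
 * Assumption: the PID is stored as 10 ASCII digits plus a newline.
 */
int claim_lock_file(const char *path)
{
    char buf[32];
    int attempt;

    assert(path != NULL);
    for (attempt = 0; attempt < 2; attempt++) {
        /* O_EXCL makes "does it exist? if not, create it" atomic. */
        int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd >= 0) {
            int len = snprintf(buf, sizeof buf, "%10ld\n", (long) getpid());
            int ok = write(fd, buf, (size_t) len) == (ssize_t) len;
            close(fd);
            if (!ok)
                unlink(path);       /* don't leave a half-written lock */
            return ok ? 0 : -1;
        }
        if (errno != EEXIST)
            return -1;

        /* The file exists: read the owner's PID and see if it is alive. */
        long owner = 0;
        FILE *fp = fopen(path, "r");
        if (fp != NULL) {
            if (fscanf(fp, "%ld", &owner) != 1)
                owner = 0;          /* unreadable contents count as stale */
            fclose(fp);
        }
        if (owner > 0 && (kill((pid_t) owner, 0) == 0 || errno == EPERM))
            return -1;              /* owner still exists: line is locked */

        unlink(path);               /* stale lock: remove it, retry once */
    }
    return -1;
}
```

A caller would release the line again by removing the file with unlink.
Programs sharing a line must still agree on the file name and the PID
format, which this sketch simply assumes.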

The locking algorithm is straightforward: if a process wants to access
a serial line /dev/ttyb, it looks for a file /var/spool/uucp/LCK..ttyb.
If it finds it, it checks the contents, which specify the process ID of
the owner, and checks if the owner still exists.  If it does, the file
is locked, and the process can't access the serial line.  If the file
doesn't exist, or if the owner no longer exists, the process creates
the file if necessary and puts its own process ID in the file.

Although the algorithm is straightforward, the naming conventions are
anything but standardized.  When porting software from other platforms,
it is absolutely essential that all programs using dot locking agree on
the lock file name and its format.  Let's look at the lock file names
for the device /dev/ttyb, which is major device number 29, minor device
number 1.  The ls -l listing looks like:

$ ls -l /dev/ttyb
crw-rw-rw-   1 root     sys       29,  1 Feb 25  1995 /dev/ttyb

The following table describes common conventions:

System     | Name                           | PID format
-----------+--------------------------------+-----------------
4.3BSD     | /usr/spool/uucp/LCK..ttyb      | binary, 4 bytes
4.4BSD     | /var/spool/uucp/LCK..ttyb      | binary, 4 bytes
System V.3 | /usr/spool/uucp/LCK..ttyb      | ASCII, 10 bytes
System V.4 | /var/spool/uucp/LK.032.029.001 | ASCII, 10 bytes

A couple of points to note are:

o The digits in the lock file name for System V.4 are the major device
  number of the disk on which /dev is located (32), the major device
  number of the serial device (29), and the minor device number of the
  serial device (1).

o Some systems, such as SCO, have multiple names for terminal lines,
  depending on the characteristics which the line should exhibit.  For
  example, /dev/tty1a refers to a line when running without modem
  control signals, and /dev/tty1A refers to the same line when running
  with modem control signals.
  Clearly only one of these names can be in use at a time: by
  convention, the lock file name for both devices is
  /usr/spool/uucp/LCK..tty1a.

o The locations of the lock files vary considerably.  Apart from those
  in the table, other possibilities are /etc/locks/LCK..ttyb,
  /usr/spool/locks/LCK..ttyb, and /usr/spool/uucp/LCK/LCK..ttyb.

o Still other methods exist.  See the file policy.h in the Taylor uucp
  distribution for further discussion.

Lock files are unreliable.  It is quite possible for two processes to
go through this algorithm at the same time, both find that the lock
file doesn't exist, both create it, and both put their process ID in
it.  The result is not what you want.  Lock files should only be used
when there is really no alternative.

locking system call
___________________

locking comes from the original implementation introduced during the
Seventh Edition.  It is still available in XENIX.  It implements
mandatory range locking.

int locking (int fd, int mode, long size);

locking locks a block of data of length size bytes, starting at the
current position in the file.  mode can have one of the following
values:

Parameter | Meaning
----------+------------------------------------------------------------
LK_LOCK   | Obtain an exclusive lock for the specified block.  If any
          | part is not available, sleep until it becomes available.
LK_NBLCK  | Obtain an exclusive lock for the specified block.  If any
          | part is not available, the request fails, and errno is set
          | to EACCES.
LK_NBRLCK | Obtain a shared lock for the specified block.  If any part
          | is not available, the request fails, and errno is set to
          | EACCES.
LK_RLCK   | Obtain a shared lock for the specified block.  If any part
          | is not available, sleep until it becomes available.
LK_UNLCK  | Unlock a previously locked block of data.

Figure 1: locking operation codes

flock
_____

flock is the weakest of all the lock functions.  It provides only
advisory file locking.
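
What "advisory" means in practice can be seen in a small sketch.  It
uses the flock call and the LOCK_EX, LOCK_NB and LOCK_UN flags whose
declarations follow; the function name flock_is_only_advisory is
hypothetical.  The sketch assumes a system where two separate opens of
the same file are treated as independent lock holders, as is the case
on BSD and Linux.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical demonstration helper.  Opens `path` twice, takes an
 * exclusive flock lock through the first descriptor, and reports what
 * happens on the second one:
 *   bit 0 set: a non-blocking lock attempt was refused (EWOULDBLOCK)
 *   bit 1 set: a plain write() succeeded although the file was locked
 * Returns -1 if the setup itself fails.
 */
int flock_is_only_advisory(const char *path)
{
    int result = 0;
    int fd1, fd2;

    assert(path != NULL);
    fd1 = open(path, O_RDWR | O_CREAT, 0600);
    if (fd1 < 0)
        return -1;
    fd2 = open(path, O_RDWR);       /* a second, independent open */
    if (fd2 < 0 || flock(fd1, LOCK_EX) != 0) {
        if (fd2 >= 0)
            close(fd2);
        close(fd1);
        return -1;
    }

    /* A cooperating process asks first and is turned away... */
    if (flock(fd2, LOCK_EX | LOCK_NB) == -1 && errno == EWOULDBLOCK)
        result |= 1;
    /* ...but one that doesn't bother to ask can write anyway. */
    if (write(fd2, "x", 1) == 1)
        result |= 2;

    flock(fd1, LOCK_UN);
    close(fd2);
    close(fd1);
    return result;
}
```

On such a system the function returns 3: the lock request is refused,
yet the write goes through, because nothing forces a process to ask for
the lock before touching the file.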

#include <sys/file.h>

(defined in sys/file.h)
#define LOCK_SH         1       /* shared lock */
#define LOCK_EX         2       /* exclusive lock */
#define LOCK_NB         4       /* don't block when locking */
#define LOCK_UN         8       /* unlock */

int flock (int fd, int operation);

flock applies or removes a lock on fd.  By default, if a lock cannot be
granted, the process blocks until the lock is available.  If you set
the flag LOCK_NB, flock returns immediately with errno set to
EWOULDBLOCK if the lock cannot be granted.

fcntl locking
_____________

fcntl is a function that can perform various functions on open files.
A number of these functions perform advisory record locking, and System
V also offers the option of mandatory locking.  All locking functions
operate on a struct flock:

struct flock
  {
  short l_type;         /* lock type: read/write, etc. */
  short l_whence;       /* type of l_start */
  off_t l_start;        /* starting offset */
  off_t l_len;          /* len = 0 means until end of file */
  long  l_sysid;        /* Only SVR4 */
  pid_t l_pid;          /* lock owner */
  };

In this structure,

o l_type specifies the type of the lock, listed below:

  value   | Function
  --------+-------------------------------------
  F_RDLCK | Acquire a read or shared lock.
  F_WRLCK | Acquire a write or exclusive lock.
  F_UNLCK | Clear the lock.

  Figure 2: flock.l_type values

o The offset is specified in the same way as a file offset is specified
  to lseek: flock->l_whence may be set to SEEK_SET (offset is from the
  beginning of the file), SEEK_CUR (offset is relative to the current
  position) or SEEK_END (offset is relative to the current end of file
  position).

All fcntl lock operations use this struct, which is passed to fcntl as
the arg parameter.  For example, to perform the operation F_FOOLK, you
would write:

  struct flock flock;
  error = fcntl (myfile, F_FOOLK, &flock);

The following fcntl operations relate to locking:

o F_GETLK gets information on any current lock on the file.
  When calling, you set the fields flock->l_type, flock->l_whence,
  flock->l_start, and flock->l_len to the value of a lock that you want
  to set.  If a lock that would cause a lock request to block already
  exists, flock is overwritten with information about the lock.  The
  field flock->l_whence is set to SEEK_SET, and flock->l_start is set
  to the offset in the file.  flock->l_pid is set to the pid of the
  process that owns the lock.  If the lock can be granted,
  flock->l_type is set to F_UNLCK and the rest of the structure is left
  unchanged.

o F_SETLK tries to set a lock (flock->l_type set to F_RDLCK or F_WRLCK)
  or to reset a lock (flock->l_type set to F_UNLCK).  If a lock cannot
  be obtained, fcntl returns with errno set to EACCES (System V) or
  EAGAIN (BSD and POSIX).

o F_SETLKW works like F_SETLK, except that if the lock cannot be
  obtained, the process blocks until it can be obtained.

o System V.4 has a further function, F_FREESP, which uses the struct
  flock, but in fact has nothing to do with file locking: it frees the
  space defined by flock->l_whence, flock->l_start, and flock->l_len.
  The data in this part of the file is physically removed, a read
  access returns EOF, and a write access writes new data.  The only
  reason this operation uses the struct flock (and the reason we
  discuss it here) is because struct flock has suitable members to
  describe the area that needs to be freed.  Many file systems allow
  data to be freed only if the end of the region corresponds with the
  end of file, in which case the call can be replaced with ftruncate.

lockf
_____

lockf is a library function supplied only with System V.  Like fcntl,
it implements advisory or mandatory range locking based on the file
permissions.  In some systems, it is implemented in terms of fcntl.  It
supports only exclusive locks:

#include <unistd.h>

int lockf (int fd, int function, long size);

The functions are similar to those supplied by fcntl.
The function parameter specifies the operation, as shown below.

value   | Function
--------+--------------------------------------------
F_ULOCK | Unlock the range.
F_LOCK  | Acquire exclusive lock.
F_TLOCK | Lock if possible, otherwise return status.
F_TEST  | Check range for other locks.

Figure 3: lockf functions

lockf does not specify a start offset for the range to be locked.  This
is always the current position in the file--you need to use lseek to
get there if you are not there already.  The following code fragments
are roughly equivalent:

  flock.l_type = F_WRLCK;       /* lockf only supports write locks */
  flock.l_whence = SEEK_SET;
  flock.l_start = filepos;      /* this was set elsewhere */
  flock.l_len = reclen;         /* the length to set */
  error = fcntl (myfile, F_SETLKW, &flock);   /* block, like F_LOCK */

...and

  lseek (myfile, filepos, SEEK_SET);  /* Seek the correct place in the file */
  error = lockf (myfile, F_LOCK, reclen);

Which locking scheme?
_____________________

As we've seen, file locking is a can of worms.  Many portable software
packages offer you a choice of locking mechanisms, and your system may
supply a number of them.  Which do you take?  Here are some rules of
thumb:

o fcntl locking is the best choice, as long as your system and the
  package agree on what it means.  On System V.3 and V.4, fcntl locking
  offers the choice of mandatory or advisory locking, whereas on other
  systems it only offers advisory locking.  If your package expects to
  be able to set mandatory locking, and you're running, say, 4.4BSD,
  the package may not work correctly.  If this happens, you may have to
  choose flock locking instead.

o If your system doesn't have fcntl locking, you will almost certainly
  have either flock or lockf locking instead.  If the package supports
  it, use it.  Pure BSD systems don't support lockf, but some versions
  simulate it.
  Since lockf can also be used to require mandatory locking, it's
  better to use flock on BSD systems and lockf on System V systems.

o You'll probably not come across any packages which support the
  locking system call.  If you do, and your system supports it, it's
  not a bad choice.

o If all else fails, use lock files.  This is a very poor option,
  though--it's probably a better idea to consider a more modern kernel.
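
Since the first rule of thumb recommends fcntl locking, a minimal
sketch of advisory range locking with fcntl may help.  The function
names set_range_lock and range_lock_owner are hypothetical wrappers,
not part of any system interface; they only fill in a struct flock as
described in the fcntl section and pass it to F_SETLK, F_SETLKW or
F_GETLK.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: apply `type` (F_RDLCK, F_WRLCK or F_UNLCK) to
 * `len` bytes starting at absolute offset `start`.  If `wait` is
 * nonzero the call blocks until the lock is granted (F_SETLKW);
 * otherwise it fails at once if the lock is unavailable (F_SETLK).
 * Returns whatever fcntl returns: 0 on success, -1 on failure.
 */
int set_range_lock(int fd, short type, off_t start, off_t len, int wait)
{
    struct flock fl = {0};

    assert(fd >= 0);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;     /* l_start counts from start of file */
    fl.l_start = start;
    fl.l_len = len;             /* 0 would mean "until end of file" */
    return fcntl(fd, wait ? F_SETLKW : F_SETLK, &fl);
}

/*
 * Hypothetical query: returns the PID of a process holding a lock that
 * would block a write lock on the range, 0 if there is none, and -1 on
 * error.  Locks held by the calling process never count as conflicts.
 */
pid_t range_lock_owner(int fd, off_t start, off_t len)
{
    struct flock fl = {0};

    assert(fd >= 0);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;
    return fl.l_type == F_UNLCK ? 0 : fl.l_pid;
}
```

A process would typically call set_range_lock (fd, F_WRLCK, pos, len, 1)
before updating a record and set_range_lock (fd, F_UNLCK, pos, len, 0)
afterwards.  On BSD and POSIX.1 systems these locks remain advisory, so
they protect the data only if every program touching the file does the
same.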