locking rules:
all may block
-             BKL   i_sem(inode)   i_zombie(inode)
-lookup:      yes   yes            no
-create:      yes   yes            yes
-link:        yes   yes            yes
-mknod:       yes   yes            yes
-mkdir:       yes   yes            yes
-unlink:      yes   yes            yes
-rmdir:       yes   yes            yes             (see below)
-rename:      yes   yes (both)     yes (both)      (see below)
-readlink:    no    no             no
-follow_link: no    no             no
-truncate:    yes   yes            no              (see below)
-setattr:     yes   if ATTR_SIZE   no
-permssion:   yes   no             no
-getattr:                                          (see below)
-revalidate:  no                                   (see below)
- Additionally, ->rmdir() has i_zombie on victim and so does ->rename()
-in case when target exists and is a directory.
- ->rename() on directories has (per-superblock) ->s_vfs_rename_sem.
+             BKL   i_sem(inode)
+lookup:      yes   yes
+create:      yes   yes
+link:        yes   yes
+mknod:       yes   yes
+mkdir:       yes   yes
+unlink:      yes   yes (both)
+rmdir:       yes   yes (both)     (see below)
+rename:      yes   yes (all)      (see below)
+readlink:    no    no
+follow_link: no    no
+truncate:    yes   yes            (see below)
+setattr:     yes   if ATTR_SIZE
+permission:  yes   no
+getattr:                          (see below)
+revalidate:  no                   (see below)
+setxattr:    DOCUMENT_ME
+getxattr:    DOCUMENT_ME
+removexattr: DOCUMENT_ME
+ Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_sem on
+victim.
+ cross-directory ->rename() has (per-superblock) ->s_vfs_rename_sem.
->revalidate(), it may be called both with and without the i_sem
-on dentry->d_inode. VFS never calls it with i_zombie on dentry->d_inode,
-but watch for other methods directly calling this one...
+on dentry->d_inode.
->truncate() is never called directly - it's a callback, not a
method. It's called by vmtruncate() - library function normally used by
->setattr(). Locking information above applies to that call (i.e. is
passed).
->getattr() is currently unused.
+See Documentation/filesystems/directory-locking for more detailed discussion
+of the locking scheme for directory operations.
+
--------------------------- super_operations ---------------------------
prototypes:
void (*read_inode) (struct inode *);
--- /dev/null
+ Locking scheme used for directory operations is based on two
+kinds of locks - per-inode (->i_sem) and per-filesystem (->s_vfs_rename_sem).
+
+ For our purposes all operations fall in 5 classes:
+
+1) read access. Locking rules: caller locks directory we are accessing.
+
+2) object creation. Locking rules: same as above.
+
+3) object removal. Locking rules: caller locks parent, finds victim,
+locks victim and calls the method.
+
+4) rename() that is _not_ cross-directory. Locking rules: caller locks
+the parent, finds source and target, if target already exists - locks it
+and then calls the method.
+
+5) cross-directory rename. The trickiest in the whole bunch. Locking
+rules:
+ * lock the filesystem
+ * lock parents in "ancestors first" order.
+ * find source and target.
+ * if old parent is equal to or is a descendent of target
+ fail with -ENOTEMPTY
+ * if new parent is equal to or is a descendent of source
+ fail with -EINVAL
+ * if target exists - lock it.
+ * call the method.
+
+
+The rules above obviously guarantee that all directories that are going to be
+read, modified or removed by method will be locked by caller.
+
+
+If no directory is its own ancestor, the scheme above is deadlock-free.
+Proof:
+
+ First of all, at any moment we have a partial ordering of the
+objects - A < B iff A is an ancestor of B.
+
+ That ordering can change. However, the following is true:
+
+(1) if operation different from cross-directory rename holds lock on A and
+ attempts to acquire lock on B, A will remain the parent of B until we
+ acquire the lock on B. (Proof: only cross-directory rename can change
+ the parent of object and it would have to lock the parent).
+
+(2) if cross-directory rename holds the lock on filesystem, order will not
+ change until rename acquires all locks. (Proof: other cross-directory
+ renames will be blocked on filesystem lock and we don't start changing
+ the order until we had acquired all locks).
+
+ Now consider the minimal deadlock. Each process is blocked on
+attempt to acquire some lock and already holds at least one lock. Let's
+consider the set of contended locks. First of all, filesystem lock is
+not contended, since any process blocked on it is not holding any locks.
+Thus all processes are blocked on ->i_sem.
+
+ Any contended object is either held by cross-directory rename or
+has a child that is also contended. Indeed, suppose that it is held by
+operation other than cross-directory rename. Then the lock this operation
+is blocked on belongs to child of that object due to (1).
+
+ It means that one of the operations is cross-directory rename.
+Otherwise the set of contended objects would be infinite - each of them
+would have a contended child and we had assumed that no object is its
+own descendent. Moreover, there is exactly one cross-directory rename
+(see above).
+
+ Consider the object blocking the cross-directory rename. One of
+its descendents is locked by cross-directory rename (otherwise we would again
+have an infinite set of contended objects). But that means
+that cross-directory rename is taking locks out of order. Due to (2) the
+order hadn't changed since we had acquired filesystem lock. But locking
+rules for cross-directory rename guarantee that we do not try to acquire
+lock on descendent before the lock on ancestor. Contradiction. I.e.
+deadlock is impossible. Q.E.D.
+
+
+ These operations are guaranteed to avoid loop creation. Indeed,
+the only operation that could introduce loops is cross-directory rename.
+Since the only new (parent, child) pair added by rename() is (new parent,
+source), such loop would have to contain these objects and the rest of it
+would have to exist before rename(). I.e. at the moment of loop creation
+rename() responsible for that would be holding filesystem lock and new parent
+would have to be equal to or a descendent of source. But that means that
+new parent had been equal to or a descendent of source since the moment when
+we had acquired filesystem lock and rename() would fail with -EINVAL in that
+case.
+
+ While this locking scheme works for arbitrary DAGs, it relies on
+ability to check that directory is a descendent of another object. Current
+implementation assumes that directory graph is a tree. This assumption is
+also preserved by all operations (cross-directory rename on a tree that would
+not introduce a cycle will leave it a tree and link() fails for directories).
+
+ Notice that "directory" in the above == "anything that might have
+children", so if we are going to introduce hybrid objects we will need
+either to make sure that link(2) doesn't work for them or to make changes
+in is_subdir() that would make it work even in presence of such beasts.
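
The ordering decision for cross-directory rename (class 5 above) can be modelled outside the kernel. What follows is a minimal userspace sketch, not the kernel implementation: struct node, struct lock_plan and plan_rename_locks() are hypothetical names introduced only for illustration, and no locks are actually taken. It merely computes which parent would be locked first ("ancestors first") and which child of the ancestor lies on the path to the other parent, so that a caller could compare source and target against it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry: only the parent link matters here. */
struct node {
	struct node *parent;	/* the root points at itself */
	const char *name;
};

/* Hypothetical result type: the order in which the parents would be locked. */
struct lock_plan {
	struct node *first;	/* parent to lock first */
	struct node *second;	/* parent to lock second, NULL if same as first */
	struct node *trap;	/* child of the ancestor on the path to the
				 * other parent, NULL if neither is an ancestor */
	int need_fs_lock;	/* 1 for a cross-directory rename */
};

/* Decide the locking order for two parents that live in the same tree. */
static struct lock_plan plan_rename_locks(struct node *p1, struct node *p2)
{
	struct lock_plan plan = { p1, NULL, NULL, 0 };
	struct node *p;

	assert(p1 != NULL && p2 != NULL);

	/* Same directory: one per-inode lock, no filesystem lock. */
	if (p1 == p2)
		return plan;

	plan.need_fs_lock = 1;

	/* Walk up from p1; if we meet p2, then p2 is the ancestor. */
	for (p = p1; p->parent != p; p = p->parent) {
		if (p->parent == p2) {
			plan.first = p2;
			plan.second = p1;
			plan.trap = p;
			return plan;
		}
	}

	/* Walk up from p2; if we meet p1, then p1 is the ancestor. */
	for (p = p2; p->parent != p; p = p->parent) {
		if (p->parent == p1) {
			plan.first = p1;
			plan.second = p2;
			plan.trap = p;
			return plan;
		}
	}

	/* Unrelated directories: either order works under the fs lock. */
	plan.first = p1;
	plan.second = p2;
	return plan;
}
```

With a tree holding /a/b and /c, planning for parents b and the root puts the root first and b second, and records a as the child on the path between them. For the unrelated pair b and c no such child is recorded.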
break;
case 3: root = dget(file->f_vfsmnt->mnt_sb->s_root);
down(&root->d_inode->i_sem);
- down(&root->d_inode->i_zombie);
kill_node(e);
- up(&root->d_inode->i_zombie);
up(&root->d_inode->i_sem);
dput(root);
break;
if (IS_ERR(dentry))
goto out;
- down(&root->d_inode->i_zombie);
-
err = -EEXIST;
if (dentry->d_inode)
goto out2;
mntput(mnt);
err = 0;
out2:
- up(&root->d_inode->i_zombie);
dput(dentry);
out:
up(&root->d_inode->i_sem);
case 2: enabled = 1; break;
case 3: root = dget(file->f_vfsmnt->mnt_sb->s_root);
down(&root->d_inode->i_sem);
- down(&root->d_inode->i_zombie);
while (!list_empty(&entries))
kill_node(list_entry(entries.next, Node, list));
- up(&root->d_inode->i_zombie);
up(&root->d_inode->i_sem);
dput(root);
default: return res;
INIT_LIST_HEAD(&inode->i_dirty_data_buffers);
INIT_LIST_HEAD(&inode->i_devices);
sema_init(&inode->i_sem, 1);
- sema_init(&inode->i_zombie, 1);
spin_lock_init(&inode->i_data.i_shared_lock);
}
* hopefully we will be able to get rid of that wart in 2.5. So far only
* XEmacs seems to be relying on it...
*/
+/*
+ * [Sep 2001 AV] Single-semaphore locking scheme (kudos to David Holland)
+ * implemented. Let's see if raised priority of ->s_vfs_rename_sem gives
+ * any extra contention...
+ */
/* In order to reduce some races, while at the same time doing additional
* checking and hopefully speeding things up, we copy filenames to the
return retval;
}
-int vfs_create(struct inode *dir, struct dentry *dentry, int mode)
+/*
+ * p1 and p2 should be directories on the same fs.
+ */
+struct dentry *lock_rename(struct dentry *p1, struct dentry *p2)
{
- int error;
+ struct dentry *p;
- mode &= S_IALLUGO;
- mode |= S_IFREG;
+ if (p1 == p2) {
+ down(&p1->d_inode->i_sem);
+ return NULL;
+ }
+
+ down(&p1->d_inode->i_sb->s_vfs_rename_sem);
+
+ for (p = p1; p->d_parent != p; p = p->d_parent) {
+ if (p->d_parent == p2) {
+ down(&p2->d_inode->i_sem);
+ down(&p1->d_inode->i_sem);
+ return p;
+ }
+ }
+
+ for (p = p2; p->d_parent != p; p = p->d_parent) {
+ if (p->d_parent == p1) {
+ down(&p1->d_inode->i_sem);
+ down(&p2->d_inode->i_sem);
+ return p;
+ }
+ }
+
+ down(&p1->d_inode->i_sem);
+ down(&p2->d_inode->i_sem);
+ return NULL;
+}
+
+void unlock_rename(struct dentry *p1, struct dentry *p2)
+{
+ up(&p1->d_inode->i_sem);
+ if (p1 != p2) {
+ up(&p2->d_inode->i_sem);
+ up(&p1->d_inode->i_sb->s_vfs_rename_sem);
+ }
+}
+
+int vfs_create(struct inode *dir, struct dentry *dentry, int mode)
+{
+ int error = may_create(dir, dentry);
- down(&dir->i_zombie);
- error = may_create(dir, dentry);
if (error)
- goto exit_lock;
+ return error;
- error = -EACCES; /* shouldn't it be ENOSYS? */
if (!dir->i_op || !dir->i_op->create)
- goto exit_lock;
+ return -EACCES; /* shouldn't it be ENOSYS? */
DQUOT_INIT(dir);
+
+ mode &= S_IALLUGO;
+ mode |= S_IFREG;
lock_kernel();
error = dir->i_op->create(dir, dentry, mode);
unlock_kernel();
-exit_lock:
- up(&dir->i_zombie);
if (!error)
inode_dir_notify(dir, DN_CREATE);
return error;
int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
{
- int error = -EPERM;
+ int error = may_create(dir, dentry);
- down(&dir->i_zombie);
- if ((S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD))
- goto exit_lock;
-
- error = may_create(dir, dentry);
if (error)
- goto exit_lock;
+ return error;
+
+ if ((S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD))
+ return -EPERM;
- error = -EPERM;
if (!dir->i_op || !dir->i_op->mknod)
- goto exit_lock;
+ return -EPERM;
DQUOT_INIT(dir);
lock_kernel();
error = dir->i_op->mknod(dir, dentry, mode, dev);
unlock_kernel();
-exit_lock:
- up(&dir->i_zombie);
if (!error)
inode_dir_notify(dir, DN_CREATE);
return error;
int vfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
{
- int error;
+ int error = may_create(dir, dentry);
- down(&dir->i_zombie);
- error = may_create(dir, dentry);
if (error)
- goto exit_lock;
+ return error;
- error = -EPERM;
if (!dir->i_op || !dir->i_op->mkdir)
- goto exit_lock;
+ return -EPERM;
DQUOT_INIT(dir);
mode &= (S_IRWXUGO|S_ISVTX);
lock_kernel();
error = dir->i_op->mkdir(dir, dentry, mode);
unlock_kernel();
-
-exit_lock:
- up(&dir->i_zombie);
if (!error)
inode_dir_notify(dir, DN_CREATE);
return error;
int vfs_rmdir(struct inode *dir, struct dentry *dentry)
{
- int error;
+ int error = may_delete(dir, dentry, 1);
- error = may_delete(dir, dentry, 1);
if (error)
return error;
DQUOT_INIT(dir);
- double_down(&dir->i_zombie, &dentry->d_inode->i_zombie);
+ down(&dentry->d_inode->i_sem);
d_unhash(dentry);
if (IS_DEADDIR(dir))
error = -ENOENT;
if (!error)
dentry->d_inode->i_flags |= S_DEAD;
}
- double_up(&dir->i_zombie, &dentry->d_inode->i_zombie);
+ up(&dentry->d_inode->i_sem);
if (!error) {
inode_dir_notify(dir, DN_DELETE);
d_delete(dentry);
int vfs_unlink(struct inode *dir, struct dentry *dentry)
{
- int error;
+ int error = may_delete(dir, dentry, 0);
- down(&dir->i_zombie);
- error = may_delete(dir, dentry, 0);
- if (!error) {
- error = -EPERM;
- if (dir->i_op && dir->i_op->unlink) {
- DQUOT_INIT(dir);
- if (d_mountpoint(dentry))
- error = -EBUSY;
- else {
- lock_kernel();
- error = dir->i_op->unlink(dir, dentry);
- unlock_kernel();
- if (!error)
- d_delete(dentry);
- }
- }
+ if (error)
+ return error;
+
+ if (!dir->i_op || !dir->i_op->unlink)
+ return -EPERM;
+
+ DQUOT_INIT(dir);
+
+ dget(dentry);
+ down(&dentry->d_inode->i_sem);
+ if (d_mountpoint(dentry))
+ error = -EBUSY;
+ else {
+ lock_kernel();
+ error = dir->i_op->unlink(dir, dentry);
+ unlock_kernel();
+ if (!error)
+ d_delete(dentry);
}
- up(&dir->i_zombie);
+ up(&dentry->d_inode->i_sem);
+ dput(dentry);
+
if (!error)
inode_dir_notify(dir, DN_DELETE);
+
return error;
}
int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
{
- int error;
+ int error = may_create(dir, dentry);
- down(&dir->i_zombie);
- error = may_create(dir, dentry);
if (error)
- goto exit_lock;
+ return error;
- error = -EPERM;
if (!dir->i_op || !dir->i_op->symlink)
- goto exit_lock;
+ return -EPERM;
DQUOT_INIT(dir);
lock_kernel();
error = dir->i_op->symlink(dir, dentry, oldname);
unlock_kernel();
-
-exit_lock:
- up(&dir->i_zombie);
if (!error)
inode_dir_notify(dir, DN_CREATE);
return error;
int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry)
{
- struct inode *inode;
+ struct inode *inode = old_dentry->d_inode;
int error;
- down(&dir->i_zombie);
- error = -ENOENT;
- inode = old_dentry->d_inode;
if (!inode)
- goto exit_lock;
+ return -ENOENT;
error = may_create(dir, new_dentry);
if (error)
- goto exit_lock;
+ return error;
- error = -EXDEV;
if (dir->i_sb != inode->i_sb)
- goto exit_lock;
+ return -EXDEV;
/*
* A link to an append-only or immutable file cannot be created.
*/
- error = -EPERM;
if (IS_APPEND(inode) || IS_IMMUTABLE(inode))
- goto exit_lock;
+ return -EPERM;
if (!dir->i_op || !dir->i_op->link)
- goto exit_lock;
+ return -EPERM;
DQUOT_INIT(dir);
lock_kernel();
error = dir->i_op->link(old_dentry, dir, new_dentry);
unlock_kernel();
-
-exit_lock:
- up(&dir->i_zombie);
if (!error)
inode_dir_notify(dir, DN_CREATE);
return error;
* story.
* c) we have to lock _three_ objects - parents and victim (if it exists).
* And that - after we got ->i_sem on parents (until then we don't know
- * whether the target exists at all, let alone whether it is a directory
- * or not). Solution: ->i_zombie. Taken only after ->i_sem. Always taken
- * on link creation/removal of any kind. And taken (without ->i_sem) on
- * directory that will be removed (both in rmdir() and here).
+ * whether the target exists). Solution: try to be smart with locking
+ * order for inodes. We rely on the fact that tree topology may change
+ * only under ->s_vfs_rename_sem _and_ that parent of the object we
+ * move will be locked. Thus we can rank directories by the tree
+ * (ancestors first) and rank all non-directories after them.
+ * That works since everybody except rename does "lock parent, lookup,
+ * lock child" and rename is under ->s_vfs_rename_sem.
+ * HOWEVER, it relies on the assumption that any object with ->lookup()
+ * has no more than 1 dentry. If "hybrid" objects will ever appear,
+ * we'd better make sure that there's no link(2) for them.
* d) some filesystems don't support opened-but-unlinked directories,
* either because of layout or because they are not ready to deal with
* all cases correctly. The latter will be fixed (taking this sort of
* stuff into VFS), but the former is not going away. Solution: the same
* trick as in rmdir().
* e) conversion from fhandle to dentry may come in the wrong moment - when
- * we are removing the target. Solution: we will have to grab ->i_zombie
+ * we are removing the target. Solution: we will have to grab ->i_sem
* in the fhandle_to_dentry code. [FIXME - current nfsfh.c relies on
* ->i_sem on parents, which works but leads to some truely excessive
* locking].
int vfs_rename_dir(struct inode *old_dir, struct dentry *old_dentry,
struct inode *new_dir, struct dentry *new_dentry)
{
- int error;
+ int error = 0;
struct inode *target;
- if (old_dentry->d_inode == new_dentry->d_inode)
- return 0;
-
- error = may_delete(old_dir, old_dentry, 1);
- if (error)
- return error;
-
- if (new_dir->i_sb != old_dir->i_sb)
- return -EXDEV;
-
- if (!new_dentry->d_inode)
- error = may_create(new_dir, new_dentry);
- else
- error = may_delete(new_dir, new_dentry, 1);
- if (error)
- return error;
-
- if (!old_dir->i_op || !old_dir->i_op->rename)
- return -EPERM;
-
/*
* If we are going to change the parent - check write permissions,
* we'll need to flip '..'.
*/
- if (new_dir != old_dir) {
+ if (new_dir != old_dir)
error = permission(old_dentry->d_inode, MAY_WRITE);
- }
+
if (error)
return error;
- DQUOT_INIT(old_dir);
- DQUOT_INIT(new_dir);
- down(&old_dir->i_sb->s_vfs_rename_sem);
- error = -EINVAL;
- if (is_subdir(new_dentry, old_dentry))
- goto out_unlock;
- /* Don't eat your daddy, dear... */
- /* This also avoids locking issues */
- if (old_dentry->d_parent == new_dentry)
- goto out_unlock;
target = new_dentry->d_inode;
- if (target) { /* Hastur! Hastur! Hastur! */
- triple_down(&old_dir->i_zombie,
- &new_dir->i_zombie,
- &target->i_zombie);
+ if (target) {
+ down(&target->i_sem);
d_unhash(new_dentry);
- } else
- double_down(&old_dir->i_zombie,
- &new_dir->i_zombie);
- if (IS_DEADDIR(old_dir)||IS_DEADDIR(new_dir))
- error = -ENOENT;
- else if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry))
+ }
+ if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry))
error = -EBUSY;
else
error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry);
if (target) {
if (!error)
target->i_flags |= S_DEAD;
- triple_up(&old_dir->i_zombie,
- &new_dir->i_zombie,
- &target->i_zombie);
+ up(&target->i_sem);
if (d_unhashed(new_dentry))
d_rehash(new_dentry);
dput(new_dentry);
- } else
- double_up(&old_dir->i_zombie,
- &new_dir->i_zombie);
-
+ }
if (!error)
d_move(old_dentry,new_dentry);
-out_unlock:
- up(&old_dir->i_sb->s_vfs_rename_sem);
return error;
}
int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
struct inode *new_dir, struct dentry *new_dentry)
{
+ struct inode *target;
int error;
- if (old_dentry->d_inode == new_dentry->d_inode)
- return 0;
+ dget(new_dentry);
+ target = new_dentry->d_inode;
+ if (target)
+ down(&target->i_sem);
+ if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry))
+ error = -EBUSY;
+ else
+ error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry);
+ if (!error) {
+ /* The following d_move() should become unconditional */
+ if (!(old_dir->i_sb->s_type->fs_flags & FS_ODD_RENAME)) {
+ d_move(old_dentry, new_dentry);
+ }
+ }
+ if (target)
+ up(&target->i_sem);
+ dput(new_dentry);
+ return error;
+}
- error = may_delete(old_dir, old_dentry, 0);
+int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
+ struct inode *new_dir, struct dentry *new_dentry)
+{
+ int error;
+ int is_dir = S_ISDIR(old_dentry->d_inode->i_mode);
+
+ if (old_dentry->d_inode == new_dentry->d_inode)
+ return 0;
+
+ error = may_delete(old_dir, old_dentry, is_dir);
if (error)
return error;
- if (new_dir->i_sb != old_dir->i_sb)
- return -EXDEV;
-
if (!new_dentry->d_inode)
error = may_create(new_dir, new_dentry);
else
- error = may_delete(new_dir, new_dentry, 0);
+ error = may_delete(new_dir, new_dentry, is_dir);
if (error)
return error;
if (!old_dir->i_op || !old_dir->i_op->rename)
return -EPERM;
+ if (IS_DEADDIR(old_dir)||IS_DEADDIR(new_dir))
+ return -ENOENT;
DQUOT_INIT(old_dir);
DQUOT_INIT(new_dir);
- double_down(&old_dir->i_zombie, &new_dir->i_zombie);
- if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry))
- error = -EBUSY;
- else
- error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry);
- double_up(&old_dir->i_zombie, &new_dir->i_zombie);
- if (error)
- return error;
- /* The following d_move() should become unconditional */
- if (!(old_dir->i_sb->s_type->fs_flags & FS_ODD_RENAME)) {
- d_move(old_dentry, new_dentry);
- }
- return 0;
-}
-int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
- struct inode *new_dir, struct dentry *new_dentry)
-{
- int error;
- if (S_ISDIR(old_dentry->d_inode->i_mode))
+ if (is_dir)
error = vfs_rename_dir(old_dir,old_dentry,new_dir,new_dentry);
else
error = vfs_rename_other(old_dir,old_dentry,new_dir,new_dentry);
int error = 0;
struct dentry * old_dir, * new_dir;
struct dentry * old_dentry, *new_dentry;
+ struct dentry * trap;
struct nameidata oldnd, newnd;
if (path_init(oldname, LOOKUP_PARENT, &oldnd))
if (newnd.last_type != LAST_NORM)
goto exit2;
- double_lock(new_dir, old_dir);
+ trap = lock_rename(new_dir, old_dir);
old_dentry = lookup_hash(&oldnd.last, old_dir);
error = PTR_ERR(old_dentry);
if (newnd.last.name[newnd.last.len])
goto exit4;
}
+ /* source should not be ancestor of target */
+ error = -EINVAL;
+ if (old_dentry == trap)
+ goto exit4;
new_dentry = lookup_hash(&newnd.last, new_dir);
error = PTR_ERR(new_dentry);
if (IS_ERR(new_dentry))
goto exit4;
+ /* target should not be an ancestor of source */
+ error = -ENOTEMPTY;
+ if (new_dentry == trap)
+ goto exit5;
lock_kernel();
error = vfs_rename(old_dir->d_inode, old_dentry,
new_dir->d_inode, new_dentry);
unlock_kernel();
+exit5:
dput(new_dentry);
exit4:
dput(old_dentry);
exit3:
- double_up(&new_dir->d_inode->i_sem, &old_dir->d_inode->i_sem);
+ unlock_rename(new_dir, old_dir);
exit2:
path_release(&newnd);
exit1:
return -ENOTDIR;
err = -ENOENT;
- down(&nd->dentry->d_inode->i_zombie);
+ down(&nd->dentry->d_inode->i_sem);
if (IS_DEADDIR(nd->dentry->d_inode))
goto out_unlock;
}
spin_unlock(&dcache_lock);
out_unlock:
- up(&nd->dentry->d_inode->i_zombie);
+ up(&nd->dentry->d_inode->i_sem);
return err;
}
goto out;
err = -ENOENT;
- down(&nd->dentry->d_inode->i_zombie);
+ down(&nd->dentry->d_inode->i_sem);
if (IS_DEADDIR(nd->dentry->d_inode))
goto out1;
out2:
spin_unlock(&dcache_lock);
out1:
- up(&nd->dentry->d_inode->i_zombie);
+ up(&nd->dentry->d_inode->i_sem);
out:
up_write(&current->namespace->sem);
if (!err)
user_nd.dentry = dget(current->fs->root);
read_unlock(&current->fs->lock);
down_write(&current->namespace->sem);
- down(&old_nd.dentry->d_inode->i_zombie);
+ down(&old_nd.dentry->d_inode->i_sem);
error = -EINVAL;
if (!check_mnt(user_nd.mnt))
goto out2;
path_release(&root_parent);
path_release(&parent_nd);
out2:
- up(&old_nd.dentry->d_inode->i_zombie);
+ up(&old_nd.dentry->d_inode->i_sem);
up_write(&current->namespace->sem);
path_release(&user_nd);
path_release(&old_nd);
nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
struct svc_fh *tfhp, char *tname, int tlen)
{
- struct dentry *fdentry, *tdentry, *odentry, *ndentry;
+ struct dentry *fdentry, *tdentry, *odentry, *ndentry, *trap;
struct inode *fdir, *tdir;
int err;
/* cannot use fh_lock as we need deadlock protective ordering
* so do it by hand */
- double_down(&tdir->i_sem, &fdir->i_sem);
+ trap = lock_rename(tdentry, fdentry);
ffhp->fh_locked = tfhp->fh_locked = 1;
fill_pre_wcc(ffhp);
fill_pre_wcc(tfhp);
err = -ENOENT;
if (!odentry->d_inode)
goto out_dput_old;
+ err = -EINVAL;
+ if (odentry == trap)
+ goto out_dput_old;
ndentry = lookup_one_len(tname, tdentry, tlen);
err = PTR_ERR(ndentry);
if (IS_ERR(ndentry))
goto out_dput_old;
-
+ err = -ENOTEMPTY;
+ if (ndentry == trap)
+ goto out_dput_new;
#ifdef MSNFS
if ((ffhp->fh_export->ex_flags & NFSEXP_MSNFS) &&
}
+ out_dput_new:
dput(ndentry);
out_dput_old:
dput(odentry);
out_nfserr:
*/
fill_post_wcc(ffhp);
fill_post_wcc(tfhp);
- double_up(&tdir->i_sem, &fdir->i_sem);
+ unlock_rename(tdentry, fdentry);
ffhp->fh_locked = tfhp->fh_locked = 0;
-
+
out:
return err;
}
if (!file->f_op || !file->f_op->readdir)
goto out;
down(&inode->i_sem);
- down(&inode->i_zombie);
res = -ENOENT;
if (!IS_DEADDIR(inode)) {
lock_kernel();
res = file->f_op->readdir(file, buf, filler);
unlock_kernel();
}
- up(&inode->i_zombie);
up(&inode->i_sem);
out:
return res;
unsigned long i_blocks;
unsigned long i_version;
struct semaphore i_sem;
- struct semaphore i_zombie;
struct inode_operations *i_op;
struct file_operations *i_fop; /* former ->i_op->default_file_ops */
struct super_block *i_sb;
extern int vfs_unlink(struct inode *, struct dentry *);
extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *);
+extern struct dentry *lock_rename(struct dentry *, struct dentry *);
+extern void unlock_rename(struct dentry *, struct dentry *);
+
/*
* File types
*/
extern int inode_change_ok(struct inode *, struct iattr *);
extern int inode_setattr(struct inode *, struct iattr *);
-/*
- * Common dentry functions for inclusion in the VFS
- * or in other stackable file systems. Some of these
- * functions were in linux/fs/ C (VFS) files.
- *
- */
-
-/*
- * Locking the parent is needed to:
- * - serialize directory operations
- * - make sure the parent doesn't change from
- * under us in the middle of an operation.
- *
- * NOTE! Right now we'd rather use a "struct inode"
- * for this, but as I expect things to move toward
- * using dentries instead for most things it is
- * probably better to start with the conceptually
- * better interface of relying on a path of dentries.
- */
-static inline struct dentry *lock_parent(struct dentry *dentry)
-{
- struct dentry *dir = dget(dentry->d_parent);
-
- down(&dir->d_inode->i_sem);
- return dir;
-}
-
-static inline struct dentry *get_parent(struct dentry *dentry)
-{
- return dget(dentry->d_parent);
-}
-
-static inline void unlock_dir(struct dentry *dir)
-{
- up(&dir->d_inode->i_sem);
- dput(dir);
-}
-
-/*
- * Whee.. Deadlock country. Happily there are only two VFS
- * operations that does this..
- */
-static inline void double_down(struct semaphore *s1, struct semaphore *s2)
-{
- if (s1 != s2) {
- if ((unsigned long) s1 < (unsigned long) s2) {
- struct semaphore *tmp = s2;
- s2 = s1; s1 = tmp;
- }
- down(s1);
- }
- down(s2);
-}
-
-/*
- * Ewwwwwwww... _triple_ lock. We are guaranteed that the 3rd argument is
- * not equal to 1st and not equal to 2nd - the first case (target is parent of
- * source) would be already caught, the second is plain impossible (target is
- * its own parent and that case would be caught even earlier). Very messy.
- * I _think_ that it works, but no warranties - please, look it through.
- * Pox on bloody lusers who mandated overwriting rename() for directories...
- */
-
-static inline void triple_down(struct semaphore *s1,
- struct semaphore *s2,
- struct semaphore *s3)
-{
- if (s1 != s2) {
- if ((unsigned long) s1 < (unsigned long) s2) {
- if ((unsigned long) s1 < (unsigned long) s3) {
- struct semaphore *tmp = s3;
- s3 = s1; s1 = tmp;
- }
- if ((unsigned long) s1 < (unsigned long) s2) {
- struct semaphore *tmp = s2;
- s2 = s1; s1 = tmp;
- }
- } else {
- if ((unsigned long) s1 < (unsigned long) s3) {
- struct semaphore *tmp = s3;
- s3 = s1; s1 = tmp;
- }
- if ((unsigned long) s2 < (unsigned long) s3) {
- struct semaphore *tmp = s3;
- s3 = s2; s2 = tmp;
- }
- }
- down(s1);
- } else if ((unsigned long) s2 < (unsigned long) s3) {
- struct semaphore *tmp = s3;
- s3 = s2; s2 = tmp;
- }
- down(s2);
- down(s3);
-}
-
-static inline void double_up(struct semaphore *s1, struct semaphore *s2)
-{
- up(s1);
- if (s1 != s2)
- up(s2);
-}
-
-static inline void triple_up(struct semaphore *s1,
- struct semaphore *s2,
- struct semaphore *s3)
-{
- up(s1);
- if (s1 != s2)
- up(s2);
- up(s3);
-}
-
-static inline void double_lock(struct dentry *d1, struct dentry *d2)
-{
- double_down(&d1->d_inode->i_sem, &d2->d_inode->i_sem);
-}
-
-static inline void double_unlock(struct dentry *d1, struct dentry *d2)
-{
- double_up(&d1->d_inode->i_sem,&d2->d_inode->i_sem);
- dput(d1);
- dput(d2);
-}
-
#endif /* __KERNEL__ */
#endif /* _LINUX_FS_H */
EXPORT_SYMBOL(vfs_fstat);
EXPORT_SYMBOL(vfs_stat);
EXPORT_SYMBOL(vfs_lstat);
+EXPORT_SYMBOL(lock_rename);
+EXPORT_SYMBOL(unlock_rename);
EXPORT_SYMBOL(generic_read_dir);
EXPORT_SYMBOL(generic_file_llseek);
EXPORT_SYMBOL(remote_llseek);