Date:	Wed, 25 Aug 2010 14:03:50 +0900
From:	"J. R. Okajima" <hooanon05@...oo.co.jp>
To:	Valerie Aurora <vaurora@...hat.com>
Cc:	Neil Brown <neilb@...e.de>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Miklos Szeredi <miklos@...redi.hu>,
	Jan Blunck <jblunck@...e.de>,
	Christoph Hellwig <hch@...radead.org>,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 14/39] union-mount: Union mounts documentation


Valerie Aurora:
> No, that's not a sufficient description and leaves open questions
> about all sorts of deadlocks and race conditions.  For example,
> inotify events occur while holding locks only on one layer.  You
> obviously need to lock the top layer to update the inheritance and
> parent-child relationships.  Now you are locking the lower layer first
> and the top layer second, which is the reverse of the usual order.

I don't agree about the deadlock and the race condition.
When a user modifies the dir hierarchy directly on the layer while
aufs_rename() is running, aufs will detect it after lock_rename().
It behaves like this.
- decide the layer where the actual rename operates. create the dir
  hierarchy on it if necessary.
- lock_rename() for the layer
- call ->rename()
or
- if the file being renamed exists on the lower readonly layer, aufs
  will copy it up to the upper writable layer under the rename target
  name. In this case, ->rename() is not called.

If a user changes the dir hierarchy directly on the layer before
aufs_rename(), then the notify event tells aufs about it and aufs gets
the latest hierarchy.

If it happens before lock_rename() in aufs_rename(), aufs verifies the
relationship between the target child and the locked dir. If it differs,
aufs returns EBUSY. Of course, lock_rename() follows the "ancestors first"
order described in Documentation/filesystems/directory-locking.
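
The "lock, then verify, else EBUSY" step can be sketched in userspace C.
This is a minimal model under stated assumptions, not the aufs
implementation: struct toy_node and both functions are hypothetical, and
a pthread mutex stands in for the directory lock on the layer.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Toy stand-in for a directory entry on a layer; not a kernel type. */
struct toy_node {
	struct toy_node *parent;	/* current parent on the layer */
	pthread_mutex_t lock;		/* stands in for the dir lock */
};

static void toy_node_init(struct toy_node *n, struct toy_node *parent)
{
	n->parent = parent;
	pthread_mutex_init(&n->lock, NULL);
}

/*
 * Lock the dir we believe is the parent, then re-check that belief.
 * Returns 0 with dir->lock held, or -EBUSY with no lock held.
 */
static int toy_lock_and_verify(struct toy_node *dir, struct toy_node *child)
{
	assert(dir != NULL && child != NULL);
	pthread_mutex_lock(&dir->lock);
	if (child->parent != dir) {
		/* someone moved the child before we got the lock */
		pthread_mutex_unlock(&dir->lock);
		return -EBUSY;
	}
	return 0;
}
```

The check runs only after the lock is held, so a hierarchy change that
slipped in before the lock shows up as a parent mismatch instead of
being acted on.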


> around on the lower layer is safe.  In general, your first task is to
> show a global lock ordering to prove lack of deadlocks (which I don't
> think you should spend time on because most VFS experts think it is
> impossible to do with two read-write layers).

Since you may not read this anymore and other people don't seem to
be interested in aufs, it may not be meaningful to write down the
locking in aufs. But I will try.

First,
- since aufs is an FS, it has its own super_block, dentry and inode.
- super_block, dentry and inode in aufs each have private data which
  contains an rwsem.
- the locking order for these rwsems is child-first.
- aufs specifies FS_RENAME_DOES_D_MOVE.

locking order in aufs_rename
+ down_read() for aufs sb
  protects sb from branch-add, delete.
+ two down_write()s for src and dest child
  protects them from other processes in aufs.
+ down_write() for the dst_parent.
+ decide the layer where we will operate, by comparing the index of
  layers where the targets exist and the layer attribute (ro, rw).
+ copyup the dest dir hierarchy if necessary, by repeating
  - dget_parent(), down/up_read() for the parent (in aufs)
  - mutex_lock() for the dir (on the layer) to mkdir the non-existing
    child dir on the layer and verify the parent-child relationship.
  - mkdir and setattr on the layer.
  - mutex_unlock() the dir on the layer.
+ test they are rename-able
  if it is a dir, it must be empty (logically) or must not have children
  on the multiple branches.
+ if src_parent and dst_parent differ, down_write both. up_write for
  dst_parent may be necessary to keep the "child-first" rule in aufs.

(from here the "sub-VFS" characteristic of aufs appears)
+ lock_rename() on the layer
  and verify every relationship between child and parent.
+ test the src_child is deletable.
+ test the dst_child is add-able or deletable if it exists.
+ vfs_rename() on the layer or copyup src_child as a dst_child name.
+ unlock_rename() on the layer

(return to aufs world)
+ d_drop() dst_child if necessary.
+ d_move()
+ up_write() for src_parent and dst_parent
+ up_write() for src_child and dst_child
+ up_read() for aufs sb
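
The nesting of the order above can be checked with a toy lock tracker.
This is a simplified sketch, not aufs code: struct toy_tracker and its
functions are hypothetical, the two parents are collapsed into one
step, and the copyup and test steps are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_HELD 8

/* Hypothetical tracker: remembers which named locks are held, in order. */
struct toy_tracker {
	const char *held[TOY_MAX_HELD];
	size_t depth;
};

static void toy_acquire(struct toy_tracker *t, const char *name)
{
	assert(t->depth < TOY_MAX_HELD);
	t->held[t->depth++] = name;
}

/* Returns 0 if name is the most recently acquired lock, -1 otherwise. */
static int toy_release(struct toy_tracker *t, const char *name)
{
	if (t->depth == 0 || strcmp(t->held[t->depth - 1], name) != 0)
		return -1;
	t->depth--;
	return 0;
}

/* Walks the (simplified) order; returns 0 if every release nested. */
static int toy_rename_order(struct toy_tracker *t)
{
	int bad = 0;

	toy_acquire(t, "aufs sb (read)");
	toy_acquire(t, "children (write)");
	toy_acquire(t, "parents (write)");
	toy_acquire(t, "layer lock_rename");
	/* vfs_rename() or copyup would run here, on the layer */
	bad |= toy_release(t, "layer lock_rename");
	/* back in aufs: d_drop() and d_move() would run here */
	bad |= toy_release(t, "parents (write)");
	bad |= toy_release(t, "children (write)");
	bad |= toy_release(t, "aufs sb (read)");
	return bad ? -1 : 0;
}
```

The tracker only confirms that the locks on the layer are taken and
dropped strictly inside the aufs locks; it says nothing about ordering
between concurrent processes.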

Strictly speaking, there are more things which aufs_rename() handles,
such as inode attributes, whiteout, opaque-dir, internal pointers to the
object on the layer, and the temporary dir-name. But they are essentially
unrelated to the locking order, so I didn't describe them.


Thank you for reading this long mail.


J. R. Okajima