Message-Id: <1389362251-8128-1-git-send-email-tj@kernel.org>
Date:	Fri, 10 Jan 2014 08:57:17 -0500
From:	Tejun Heo <tj@...nel.org>
To:	gregkh@...uxfoundation.org
Cc:	linux-kernel@...r.kernel.org, schwidefsky@...ibm.com,
	heiko.carstens@...ibm.com, stern@...land.harvard.edu,
	JBottomley@...allels.com, bhelgaas@...gle.com
Subject: [PATCHSET v3 driver-core-next] kernfs, sysfs, driver-core: implement synchronous self-removal

Hello,

This is v3 of the kernfs self-removal patchset.  The v2 posting
mistakenly sent out a slightly old set of patches, hence v3.  Sorry
about the noise.  Changes from the first take[L] are:

* Patches reordered so that more trivial ones are in the front.

* Deactivation split into a separate stage instead of being part of
  unlinking.  Deactivation is now exposed as kernfs API via four new
  functions - kernfs_{de|re}activate[_self]().  These functions can
  nest and allow implementation of "deactivate, lock subsys, try
  removal, unlock subsys, reactivate" sequence where removal may or
  may not succeed.  This will be used to convert cgroup to kernfs.
  (prototype seems happy with this API)

* As this means that deactivation can be temporary,
  kernfs_get_active() is updated to block if the node is deactivated
  but not removed.

* kernfs_remove_self() is now implemented using the new deactivation
  API.  Its behavior remains the same.
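
To make the nesting and the "deactivate, lock subsys, try removal,
unlock subsys, reactivate" sequence above concrete, here is a toy
user-space model.  It is only a sketch: struct toy_node, the toy_*()
helpers and the deact_depth counter are made-up names for
illustration and are not the kernfs code or its data layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; names and fields are hypothetical, not kernfs. */
struct toy_node {
	int deact_depth;	/* outstanding (nested) deactivations */
	bool removed;		/* set once removal has succeeded */
};

enum toy_access { TOY_OK, TOY_BLOCK, TOY_GONE };

static void toy_deactivate(struct toy_node *n)
{
	n->deact_depth++;
}

static void toy_reactivate(struct toy_node *n)
{
	assert(n->deact_depth > 0);
	n->deact_depth--;
}

/*
 * What a new accessor would see: fail if the node is removed, block
 * if it is only (temporarily) deactivated, proceed otherwise.
 */
static enum toy_access toy_get_active(const struct toy_node *n)
{
	if (n->removed)
		return TOY_GONE;
	if (n->deact_depth > 0)
		return TOY_BLOCK;
	return TOY_OK;
}

/*
 * deactivate -> lock subsys -> try removal -> unlock subsys -> reactivate.
 * subsys_busy stands in for whatever makes the removal fail under the
 * subsystem lock; the lock itself is elided in this model.
 */
static bool toy_try_remove(struct toy_node *n, bool subsys_busy)
{
	bool ok;

	toy_deactivate(n);
	ok = !subsys_busy;
	if (ok)
		n->removed = true;
	toy_reactivate(n);
	return ok;
}
```

In this model a failed toy_try_remove() leaves the node usable again,
while a successful one makes every later toy_get_active() fail rather
than block.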

Original patch description follows.

kernfs / sysfs implement the "sever" semantic for userland accesses.
When a node is removed, no further userland operations are allowed and
the in-flight ones are drained before removal is finished.  This makes
policing post-mortem userland accesses trivial for its users.
Unfortunately, this comes with a drawback: a node which tries to
delete itself through one of its own userland operations deadlocks,
because removal wants to drain the active access that the operation
itself is running on top of.
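
The deadlock can be illustrated with a small user-space model.  This
is a sketch under assumptions, not kernfs code: the active counter and
toy_drain_completes() are hypothetical names, and "draining" is
modelled simply as waiting for the counter to reach zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: 'active' counts in-flight userland operations on a node. */
struct toy_node {
	int active;
};

/*
 * Simulate a draining removal.  Every other operation eventually
 * finishes and drops its reference.  If the remover is itself running
 * inside an operation on this node, its own reference cannot drop,
 * because that operation is the one waiting for the drain.
 * Returns true if the drain can complete.
 */
static bool toy_drain_completes(struct toy_node *n, bool caller_is_active_op)
{
	int own = caller_is_active_op ? 1 : 0;

	assert(n->active >= own);
	/* All other operations run to completion and drop their refs. */
	n->active = own;
	/* The drain is done only once no active reference remains. */
	return n->active == 0;
}
```

A remover coming from outside the node sees the count reach zero; a
remover running inside one of the node's own operations never does.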

This currently is worked around in the sysfs layer using
sysfs_schedule_callback() which punts the actual removal to a work
item.  While making the operation asynchronous kinda works, it's a bit
cumbersome to use and its behavior isn't quite correct as the caller
has no way of telling when or even whether the operation is actually
complete.  If such self-removal is followed by another operation which
expects the removed name to be available, there's no way to make the
second operation reliable - e.g. something like "echo 1 > asdf/delete;
echo asdf > create_new_child" can't work properly.

This patchset improves the kernfs removal path and implements
kernfs_remove_self(), which is to be called from an on-going kernfs
operation and removes the node that operation is running on.  The
function can be called concurrently; only one caller returns %true and
all others wait until the winner's file operation is complete (not the
kernfs_remove_self() call itself but the enclosing file operation
which invoked the function).  This ensures that if there are multiple
concurrent "echo 1 > asdf/delete", all of them finish only after the
whole store_delete() method is complete.
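
The "only one caller wins" part can be sketched in user space with a
C11 compare-and-exchange.  This is only an illustration of the
semantic, assuming a single claim flag per node; toy_node,
toy_node_init() and toy_remove_self() are hypothetical names and this
is not how kernfs_remove_self() is implemented.  The waiting of the
losers is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy model only; not the kernfs data structure. */
struct toy_node {
	atomic_int claimed;	/* 0: nobody removing yet, 1: claimed */
};

static void toy_node_init(struct toy_node *n)
{
	atomic_init(&n->claimed, 0);
}

/*
 * Returns true for exactly one caller, the one that flips the flag
 * from 0 to 1.  In the design described above the losers would sleep
 * until the winner's enclosing file operation has finished; here they
 * just return false.
 */
static bool toy_remove_self(struct toy_node *n)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&n->claimed, &expected, 1);
}
```

However many callers race on the same node, the compare-and-exchange
succeeds for one of them only, so the actual teardown is done once.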

kernfs_remove_self() is exposed to upper layers through
sysfs_remove_file_self() and device_remove_file_self().  The existing
users of device_schedule_callback() are converted to use remove_self
and the now-unused async mechanism is removed.

This patchset contains the following 14 patches.

 0001-kernfs-fix-get_active-failure-handling-in-kernfs_seq.patch
 0002-kernfs-replace-kernfs_node-u.completion-with-kernfs_.patch
 0003-kernfs-remove-KERNFS_ACTIVE_REF-and-add-kernfs_lockd.patch
 0004-kernfs-remove-KERNFS_REMOVED.patch
 0005-kernfs-restructure-removal-path-to-fix-possible-prem.patch
 0006-kernfs-invoke-kernfs_unmap_bin_file-directly-from-__.patch
 0007-kernfs-remove-kernfs_addrm_cxt.patch
 0008-kernfs-make-kernfs_get_active-block-if-the-node-is-d.patch
 0009-kernfs-implement-kernfs_-de-re-activate-_self.patch
 0010-kernfs-sysfs-driver-core-implement-kernfs_remove_sel.patch
 0011-pci-use-device_remove_file_self-instead-of-device_sc.patch
 0012-scsi-use-device_remove_file_self-instead-of-device_s.patch
 0013-s390-use-device_remove_file_self-instead-of-device_s.patch
 0014-sysfs-driver-core-remove-unused-sysfs-device-_schedu.patch

0001 fixes -ENODEV failure handling in kernfs.  I *think* this could
be the fix for the issue Sasha reported with trinity fuzzing.  Sasha,
would it be possible to confirm whether the issue is reproducible with
this patch applied?

0002 replaces kernfs_node->u.completion with a hierarchy-wide
wait_queue_head.  This will be used to fix concurrent removal
behavior.

0003-0004 simplify the removal path to prepare for restructuring.

0005 fixes premature completion of node removal when multiple removers
are competing.  This shouldn't matter for the existing sysfs users.

0006-0007 clean up the removal path.  The size of kernfs_node gets
reduced by one pointer.

0008-0010 implement kernfs_{de|re}activate[_self](),
kernfs_remove_self() and friends.

0011-0014 convert the existing users of device_schedule_callback() to
device_remove_file_self() and remove the now-unused async mechanism.

After the changes, kernfs_node shrinks by a pointer.  Unfortunately,
the addition of the deactivation API makes LOC go up by more than a
hundred lines.  Oh well....

The patchset is on top of the current driver-core-next eb4c69033fd1
("Revert "kobject: introduce kobj_completion"") and also available in
the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-kernfs-suicide

diffstat follows.

 arch/s390/include/asm/ccwgroup.h |    1 
 arch/s390/pci/pci_sysfs.c        |   18 -
 drivers/base/core.c              |   50 +--
 drivers/pci/pci-sysfs.c          |   24 -
 drivers/s390/block/dcssblk.c     |   14 
 drivers/s390/cio/ccwgroup.c      |   26 +
 drivers/scsi/scsi_sysfs.c        |   15 -
 fs/kernfs/dir.c                  |  585 ++++++++++++++++++++++++++-------------
 fs/kernfs/file.c                 |   62 +++-
 fs/kernfs/kernfs-internal.h      |   17 -
 fs/kernfs/symlink.c              |    6 
 fs/sysfs/file.c                  |  115 +------
 include/linux/device.h           |   13 
 include/linux/kernfs.h           |   24 +
 include/linux/sysfs.h            |   16 -
 15 files changed, 555 insertions(+), 431 deletions(-)

Thanks.

--
tejun

[L] git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-kernfs-suicide