Message-Id: <1273840782-5920-1-git-send-email-walken@google.com>
Date:	Fri, 14 May 2010 05:39:32 -0700
From:	Michel Lespinasse <walken@...gle.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>,
	David Howells <dhowells@...hat.com>,
	Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mike Waychison <mikew@...gle.com>,
	Suleiman Souhlal <suleiman@...gle.com>,
	Ying Han <yinghan@...gle.com>,
	Michel Lespinasse <walken@...gle.com>
Subject: [PATCH 00/10] V2: rwsem changes + down_read_unfair() proposal

I would like to solicit comments on the following changes, which are
against 2.6.34-rc7 with 91af708 (from the V1 proposal) already applied.

The motivation for this change was some cluster monitoring software we
use at Google, which reads /proc/<pid>/maps files for all running
processes. When the machines are under load, the mmap_sem is often
acquired for reads for long periods of time since do_page_fault() holds
it while doing disk accesses, and fair queueing behavior often results
in the monitoring software making little progress. By introducing
unfair behavior in a few selected places, we are able to let the
monitoring software make progress without impacting performance for
the rest of the system.
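
To illustrate the difference between fair and unfair read acquisition
described above, here is a minimal userspace sketch. It is not the code
from this series: struct model_rwsem and the model_* helpers are
hypothetical names, and the attempts are try-style (they return false
instead of sleeping) to keep the model small. It assumes that "fair"
means a reader queues behind any waiting writer, while "unfair" means a
reader only waits for a writer that actually holds the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct rw_semaphore. */
struct model_rwsem {
	int active_readers;	/* readers currently holding the lock */
	bool writer_active;	/* true while a writer holds the lock */
	int queued_writers;	/* writers waiting for readers to drain */
};

/* Fair attempt: refuse if a writer holds the lock or is queued for it. */
static bool model_try_read_fair(struct model_rwsem *sem)
{
	if (sem->writer_active || sem->queued_writers > 0)
		return false;
	sem->active_readers++;
	return true;
}

/* Unfair attempt: only an active writer blocks; queued writers are bypassed. */
static bool model_try_read_unfair(struct model_rwsem *sem)
{
	if (sem->writer_active)
		return false;
	sem->active_readers++;
	return true;
}

/* Release one read hold. */
static void model_up_read(struct model_rwsem *sem)
{
	assert(sem->active_readers > 0);
	sem->active_readers--;
}
```

In this model, a long-running page fault corresponds to active_readers
staying above zero while a writer sits in the queue: the fair attempt
fails for every new reader, while the unfair attempt still succeeds.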

In general, I've made sure to implement this proposal without touching
the rwsem fast paths. Also, the first 8 patches of this series should
be of general applicability even if the down_read_unfair() changes are
not taken, as they address minor issues such as situations where reader
threads can get blocked at the head of the waiting list even though
the rwsem is currently owned for reads.
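
The blocked-reader situation mentioned above can be pictured with a
small userspace sketch. This is only an illustration under stated
assumptions, not the logic in lib/rwsem.c: enum model_waiter and
model_wakeable_readers() are hypothetical names, and the model assumes
that when no writer holds the lock, the run of readers at the head of
the wait list (up to the first queued writer) could be granted the lock
immediately.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical waiter kinds for a toy wait list. */
enum model_waiter { MODEL_READER, MODEL_WRITER };

/*
 * Return how many waiters at the head of the queue could share the lock
 * right away: the leading run of readers, or none if a writer holds it.
 */
static size_t model_wakeable_readers(const enum model_waiter *queue,
				     size_t len, int writer_active)
{
	size_t n = 0;

	assert(queue != NULL || len == 0);
	if (writer_active)
		return 0;
	while (n < len && queue[n] == MODEL_READER)
		n++;
	return n;
}
```

A nonzero result while the lock is owned for reads corresponds to
readers that are waiting needlessly at the head of the list.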

Changes since v1:
- Keep the active count check when trying to wake readers in the up_xxxx()
  slow path (I had suppressed it in v1). However, I did try to lighten the
  check (this is patch 3 of the series).
- Added a privilege check before making use of unfair behavior in
  /proc/<pid>/exe and /proc/<pid>/maps files.
- Applied David Howells' many small suggestions (I hope I did not miss any).

Michel Lespinasse (10):
  x86 rwsem: minor cleanups
  rwsem: fully separate code pathes to wake writers vs readers
  rwsem: lighter active count checks when waking up readers
  rwsem: let RWSEM_WAITING_BIAS represent any number of waiting threads
  rwsem: wake queued readers when writer blocks on active read lock
  rwsem: smaller wrappers around rwsem_down_failed_common
  generic rwsem: implement down_read_unfair
  rwsem: down_read_unfair infrastructure support
  x86 rwsem: down_read_unfair implementation
  Use down_read_unfair() for /sys/<pid>/exe and /sys/<pid>/maps files

 arch/x86/include/asm/rwsem.h   |   70 ++++++++++++-----
 arch/x86/lib/rwsem_64.S        |   14 +++-
 arch/x86/lib/semaphore_32.S    |   21 +++++-
 fs/proc/base.c                 |    2 +-
 fs/proc/task_mmu.c             |    2 +-
 fs/proc/task_nommu.c           |    2 +-
 include/linux/capability.h     |    1 +
 include/linux/rwsem-spinlock.h |   10 ++-
 include/linux/rwsem.h          |   13 +++
 kernel/rwsem.c                 |   31 ++++++++
 lib/rwsem-spinlock.c           |   10 ++-
 lib/rwsem.c                    |  160 ++++++++++++++++++++++++++--------------
 12 files changed, 247 insertions(+), 89 deletions(-)

