linux-kernel - Re: [PATCH 1/2] brw_mutex: big read-write mutex


Message-ID: <20121024030845.GT4291@dastard>
Date:	Wed, 24 Oct 2012 14:08:45 +1100
From:	Dave Chinner <david@...morbit.com>
To:	Mikulas Patocka <mpatocka@...hat.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Oleg Nesterov <oleg@...hat.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Anton Arapov <anton@...hat.com>, linux-kernel@...r.kernel.org,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 1/2] brw_mutex: big read-write mutex

On Fri, Oct 19, 2012 at 06:54:41PM -0400, Mikulas Patocka wrote:
> 
> 
> On Fri, 19 Oct 2012, Peter Zijlstra wrote:
> 
> > > Yes, I tried this approach - it involves doing LOCK instruction on read 
> > > lock, remembering the cpu and doing another LOCK instruction on read 
> > > unlock (which will hopefully be on the same CPU, so no cacheline bouncing 
> > > happens in the common case). It was slower than the approach without any 
> > > LOCK instructions (43.3 seconds for the implementation with 
> > > per-cpu LOCKed access, 42.7 seconds for this implementation without atomic 
> > > instruction; the benchmark involved doing 512-byte direct-io reads and 
> > > writes on a ramdisk with 8 processes on 8-core machine).
> > 
> > So why is that a problem? Surely that's already tons better than what
> > you've currently got.
> 
> Percpu rw-semaphores do not improve performance at all. I put them there 
> to avoid performance regression, not to improve performance.
> 
> All Linux kernels have a race condition - when you change the block size of a 
> block device and you read or write the device at the same time, a crash 
> may happen. This bug has always been there. Recently, this bug started to 
> cause major trouble - multiple high profile business sites report crashes 
> because of this race condition.
>
> You can fix this race by using a read lock around I/O paths and write lock 
> around block size changing, but normal rw semaphores cause cache line 
> bouncing when taken for read by multiple processors and the resulting I/O 
> performance degradation is measurable.

This doesn't sound like a new problem.  Hasn't this global access,
single modifier exclusion problem been solved before in the VFS?
e.g. mnt_want_write()/mnt_make_readonly()
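
For readers unfamiliar with that pattern, here is a rough userspace
sketch of the general idea: readers bump a counter in their own slot
and then check a "writer pending" flag, while the single modifier sets
the flag and then sums all the slots. It is not the kernel's code. The
names brw_sketch, NSLOTS and the *_try_enter()/*_exit() helpers are
hypothetical, fixed slots stand in for per-CPU data, and C11 atomics are
used for every access, so the sketch shows the structure of the scheme
rather than the cost of the no-LOCK variant discussed above.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NSLOTS 8                /* hypothetical stand-in for the CPU count */

/* One reader counter per slot, each on its own cache line so that
 * readers using different slots do not bounce a shared line. */
struct brw_slot {
	_Alignas(64) atomic_long count;
};

struct brw_sketch {
	struct brw_slot readers[NSLOTS];
	atomic_bool write_hold;  /* set while a modifier wants exclusion */
};

static void brw_init(struct brw_sketch *l)
{
	for (int i = 0; i < NSLOTS; i++)
		atomic_init(&l->readers[i].count, 0);
	atomic_init(&l->write_hold, false);
}

/* Read side (the I/O path in this thread's terms): publish the reader
 * first, then look for a modifier. The sequentially consistent
 * operations guarantee that a racing reader and modifier cannot both
 * miss each other. */
static bool reader_try_enter(struct brw_sketch *l, unsigned slot)
{
	atomic_fetch_add(&l->readers[slot].count, 1);
	if (atomic_load(&l->write_hold)) {
		/* A modifier is active or pending: back out. */
		atomic_fetch_sub(&l->readers[slot].count, 1);
		return false;
	}
	return true;
}

static void reader_exit(struct brw_sketch *l, unsigned slot)
{
	atomic_fetch_sub(&l->readers[slot].count, 1);
}

/* Write side (the block size change): raise the flag, then check that
 * no reader is inside by summing every slot. */
static bool writer_try_enter(struct brw_sketch *l)
{
	bool expected = false;
	long total = 0;

	if (!atomic_compare_exchange_strong(&l->write_hold, &expected, true))
		return false;    /* another modifier already holds the flag */

	for (int i = 0; i < NSLOTS; i++)
		total += atomic_load(&l->readers[i].count);

	if (total != 0) {
		/* Readers are still active: give up instead of waiting. */
		atomic_store(&l->write_hold, false);
		return false;
	}
	return true;
}

static void writer_exit(struct brw_sketch *l)
{
	atomic_store(&l->write_hold, false);
}
```

The try-style interface keeps the sketch short; a real lock would make
readers wait for the flag to clear and make the modifier wait for the
counters to drain instead of failing.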

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com