Date:	Fri, 19 Oct 2012 19:40:51 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Mikulas Patocka <mpatocka@...hat.com>
Cc:	Oleg Nesterov <oleg@...hat.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Anton Arapov <anton@...hat.com>, linux-kernel@...r.kernel.org,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH 1/2] brw_mutex: big read-write mutex

On Fri, 2012-10-19 at 11:32 -0400, Mikulas Patocka wrote:

> So if you can do an alternative implementation without RCU, show it.

Uhm... no, that's not how it works. You don't just push through crap like
this and then demand that someone else do it better.

But using preempt_{disable,enable} and synchronize_sched() would be
better (for PREEMPT_RCU), although it wouldn't fix anything fundamental.
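
For concreteness, a minimal sketch of what that substitution could look
like (all names here are illustrative, not taken from the patch; assumes
<linux/percpu.h>, <linux/atomic.h>, <linux/rcupdate.h>):

struct brw_sched_sketch {
	unsigned int __percpu	*read_count;
	atomic_t		writer_pending;
};

/* fast path: no LOCK instruction, just a preempt-disabled per-cpu inc */
static bool brw_sched_read_trylock(struct brw_sched_sketch *brw)
{
	bool ok;

	preempt_disable();
	ok = !atomic_read(&brw->writer_pending);
	if (ok)
		__this_cpu_inc(*brw->read_count);
	preempt_enable();
	return ok;		/* false: take the slow, blocking path */
}

/* writer side, before waiting for the per-cpu counts to drain */
static void brw_sched_begin_write(struct brw_sched_sketch *brw)
{
	atomic_set(&brw->writer_pending, 1);
	/*
	 * synchronize_sched() waits for all preempt-disabled regions, so
	 * any reader that missed writer_pending has finished its
	 * increment by the time this returns.
	 */
	synchronize_sched();
}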

>  The 
> goal is - there should be no LOCK instructions on the read path and as 
> few barriers as possible.

Fine goal, although somewhat arch specific. Also note that there's a
relation between atomics and memory barriers: one isn't necessarily worse
than the other, since both require synchronization of sorts.

> > So did you consider keeping the inc/dec on the same per-cpu variable?
> > Yes this adds a potential remote access to dec and requires you to use
> > atomics, but I would not be surprised if the inc/dec were mostly on the
> > same cpu most of the times -- which might be plenty fast for what you
> > want.
> 
> Yes, I tried this approach - it involves doing a LOCK instruction on read 
> lock, remembering the cpu and doing another LOCK instruction on read 
> unlock (which will hopefully be on the same CPU, so no cacheline bouncing 
> happens in the common case). It was slower than the approach without any 
> LOCK instructions (43.3 seconds for the implementation with per-cpu 
> LOCKed access, 42.7 seconds for this implementation without atomic 
> instructions; the benchmark involved doing 512-byte direct-io reads and 
> writes on a ramdisk with 8 processes on an 8-core machine).

So why is that a problem? Surely that's already tons better than what
you've currently got. Also, uncontended LOCK is something all x86 vendors
keep optimizing; they'll have to if they want to keep adding CPUs.
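
For reference, the shape of that LOCKed per-cpu variant (again
illustrative names, not Mikulas' actual code):

struct brw_cnt_sketch {
	atomic_t __percpu	*count;
};

static int brw_cnt_read_lock(struct brw_cnt_sketch *brw)
{
	int cpu = get_cpu();

	/* LOCK inc, but on a cacheline that is almost always local */
	atomic_inc(per_cpu_ptr(brw->count, cpu));
	put_cpu();
	return cpu;		/* caller remembers where it locked */
}

static void brw_cnt_read_unlock(struct brw_cnt_sketch *brw, int cpu)
{
	/* usually still the same cpu, so no cacheline bouncing */
	atomic_dec(per_cpu_ptr(brw->count, cpu));
}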

> > If you've got coherent per-cpu counts, you can better do the
> > waitqueue/wake condition for write_down.
> 
> synchronize_rcu() is way slower than msleep(1) - so I don't see a reason 
> why it should be complicated to avoid msleep(1).

It's not about being slow; a polling write side is just fscking ugly. Also,
if you're already polling, that *_sync() is bloody pointless.
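
A rough sketch of the non-polling write side (illustrative names again;
the struct is assumed to carry per-cpu reader counts like the sketches
above plus a waitqueue and a writer flag):

struct brw_wq_sketch {
	atomic_t __percpu	*count;		/* per-cpu reader counts */
	atomic_t		writer_pending;
	wait_queue_head_t	write_waitq;
};

static bool brw_wq_no_readers(struct brw_wq_sketch *brw)
{
	int cpu, sum = 0;

	for_each_possible_cpu(cpu)
		sum += atomic_read(per_cpu_ptr(brw->count, cpu));
	return sum == 0;
}

static void brw_wq_write_lock(struct brw_wq_sketch *brw)
{
	atomic_set(&brw->writer_pending, 1);
	synchronize_sched();	/* new readers now block instead of incrementing */
	wait_event(brw->write_waitq, brw_wq_no_readers(brw));
}

/*
 * The read-unlock path then does, after its atomic_dec():
 *
 *	if (atomic_read(&brw->writer_pending))
 *		wake_up(&brw->write_waitq);
 */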

> > It might also make sense to do away with the mutex, there's no point in
> > serializing the wakeups in the p->locked case of down_read.
> 
> The advantage of a mutex is that it is already protected against 
> starvation. If I replace the mutex with a wait queue and retry, there is 
> no starvation protection.

Which starvation? Writer-writer ordering? What stops you from adding a
list there yourself? Also, writers had better be rare for this thing, so
who gives a crap?

> > Furthermore,
> > p->locked seems a complete duplicate of the mutex state, so removing the
> > mutex also removes that duplication.
> 
> We could replace if (p->locked) with if (mutex_is_locked(p->mtx))

Quite so.. 

You're also still lacking lockdep annotations...
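
For what it's worth, those annotations would be roughly along these lines
(a sketch only; dep_map, name and key are assumed fields/parameters, and
the exact hooks depend on how the final structure ends up looking):

	/* init path: */
	lockdep_init_map(&brw->dep_map, name, key, 0);

	/* read lock / unlock: */
	rwsem_acquire_read(&brw->dep_map, 0, 0, _RET_IP_);
	rwsem_release(&brw->dep_map, 1, _RET_IP_);

	/* write lock / unlock: */
	rwsem_acquire(&brw->dep_map, 0, 0, _RET_IP_);
	rwsem_release(&brw->dep_map, 1, _RET_IP_);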
