linux-kernel - Re: [PATCH RESEND v2 1/1] percpu_rw_semaphore: reimplement to not block the readers unnecessarily

Message-ID: <20121112183814.GF2518@linux.vnet.ibm.com>
Date:	Mon, 12 Nov 2012 10:38:14 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Mikulas Patocka <mpatocka@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...e.hu>,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Anton Arapov <anton@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RESEND v2 1/1] percpu_rw_semaphore: reimplement to not
 block the readers unnecessarily

On Sun, Nov 11, 2012 at 04:45:09PM +0100, Oleg Nesterov wrote:
> On 11/09, Paul E. McKenney wrote:
> >
> > On Fri, Nov 09, 2012 at 07:10:48PM +0100, Oleg Nesterov wrote:
> > >
> > > 	static bool xxx(brw)
> > > 	{
> > > 		down_write(&brw->rw_sem);
> >
> > 		down_write_trylock()
> >
> > As you noted in your later email.  Presumably you return false if
> > the attempt to acquire it fails.
> 
> Yes, yes, thanks.
> 
> > > But first we should do other changes, I think. IMHO we should not do
> > > synchronize_sched() under mutex_lock() and this will add (a bit) more
> > > complications. We will see.
> >
> > Indeed, that does put considerable delay on the writers.  There is always
> > synchronize_sched_expedited(), I suppose.
> 
> I am not sure about synchronize_sched_expedited() (at least unconditionally),
> but: only the 1st down_write() needs synchronize_sched(), and up_write() does
> not need to sleep in synchronize_sched() at all.
> 
> To simplify, let's ignore the fact that the writers need to serialize with
> each other. IOW, the pseudo-code below is obviously dead wrong and racy,
> just to illustrate the idea.
> 
> 1. We remove brw->writer_mutex and add "atomic_t writers_ctr".
> 
>    update_fast_ctr() uses atomic_read(&brw->writers_ctr) == 0 instead
>    of !mutex_is_locked().
> 
> 2. down_write() does
> 
> 	if (atomic_inc_return(&brw->writers_ctr) == 1) {
> 		// first writer
> 		synchronize_sched();
> 		...
> 	} else {
> 		... XXX: wait for percpu_up_write() from the first writer ...
> 	}
> 
> 3. up_write() does
> 
> 	if (atomic_add_unless(&brw->writers_ctr, -1, 1)) {
> 		// not the last writer: counter was > 1 and has been decremented
> 		... wake up XXX writers above ...
> 		return;
> 	} else {
> 		// the last writer
> 		call_rcu_sched( func => { atomic_dec(&brw->writers_ctr) } );
> 	}

Agreed, an asynchronous callback can be used to switch the readers
back onto the fastpath.  Of course, as you say, getting it all working
will require some care.  ;-)
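
For illustration only, the counter transitions in steps 2 and 3 above can
be modeled in user space.  The sketch below is a hypothetical C11 model
meant to be exercised single-threaded, not kernel code: brw_model,
model_init(), model_down_write(), model_up_write(), model_rcu_callback(),
model_fast_path_allowed() and dec_unless_one() are names invented here, a
plain counter stands in for synchronize_sched(), and the caller is assumed
to run the callback that call_rcu_sched() would invoke later.  Like the
pseudo-code, it ignores serialization between writers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model of the writers_ctr idea; not kernel code. */
struct brw_model {
	atomic_int writers_ctr;	/* 0 means readers may use the fast path */
	int grace_periods;	/* counts simulated synchronize_sched() calls */
};

static void model_init(struct brw_model *brw)
{
	atomic_init(&brw->writers_ctr, 0);
	brw->grace_periods = 0;
}

/* Decrement *v unless it is 1; returns true if the decrement happened. */
static bool dec_unless_one(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 1) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old - 1))
			return true;
	}
	return false;
}

/* Returns true if the caller is the first writer (and paid for a grace period). */
static bool model_down_write(struct brw_model *brw)
{
	if (atomic_fetch_add(&brw->writers_ctr, 1) == 0) {
		brw->grace_periods++;	/* stand-in for synchronize_sched() */
		return true;
	}
	return false;	/* real code would wait for the first writer here */
}

/* Returns true if the caller is the last writer and must queue the callback. */
static bool model_up_write(struct brw_model *brw)
{
	if (dec_unless_one(&brw->writers_ctr))
		return false;	/* real code would wake up a waiting writer */
	return true;		/* counter stays at 1 until the callback runs */
}

/* Stand-in for the function passed to call_rcu_sched(). */
static void model_rcu_callback(struct brw_model *brw)
{
	int old = atomic_fetch_sub(&brw->writers_ctr, 1);

	assert(old >= 1);
	(void)old;
}

static bool model_fast_path_allowed(struct brw_model *brw)
{
	return atomic_load(&brw->writers_ctr) == 0;
}
```

In this model, with two overlapping writers only the first one pays for
the simulated grace period, and readers get back onto the fast path only
after the deferred callback drops the counter from 1 to 0.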

> Once again, this all is racy, but hopefully the idea is clear:
> 
> 	- down_write(brw) sleeps in synchronize_sched() only if brw
> 	  has already switched back to fast-path-mode
> 
> 	- up_write() never sleeps in synchronize_sched(), it uses
> 	  call_rcu_sched() or wakes up the next writer.
> 
> Of course I am not sure this is all worth the trouble, this should be discussed.
> (and, cough, I'd like to add the multi-writers mode which I'm afraid nobody
> will like) But I am not going to even try to do this until the current patch
> is applied, I need it to fix the bug in uprobes and I think the current code
> is "good enough". These changes can't help to speed up the readers, and the
> writers are slow/rare anyway.

Probably best to wait for multi-writers until there is a measurable need,
to be sure!  ;-)

							Thanx, Paul
