Message-ID: <20121108134247.GB23425@redhat.com>
Date:	Thu, 8 Nov 2012 14:42:47 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Mikulas Patocka <mpatocka@...hat.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...e.hu>,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Anton Arapov <anton@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/1] percpu_rw_semaphore: reimplement to not block
	the readers unnecessarily

On 11/07, Mikulas Patocka wrote:
>
> On Wed, 7 Nov 2012, Oleg Nesterov wrote:
>
> > On 11/07, Mikulas Patocka wrote:
> > >
> > > It looks sensible.
> > >
> > > Here I'm sending an improvement of the patch - I changed it so that there
> > > are not two-level nested functions for the fast path and so that both
> > > percpu_down_read and percpu_up_read use the same piece of code (to reduce
> > > cache footprint).
> >
> > IOW, the only change is that you eliminate "static update_fast_ctr()"
> > and fold it into down/up_read which takes the additional argument.
> >
> > Honestly, personally I do not think this is better, but I won't argue.
> > I agree with everything but I guess we need the ack from Paul.
>
> If you look at generated assembly (for x86-64), the footprint of my patch
> is 78 bytes shared for both percpu_down_read and percpu_up_read.
>
> The footprint of your patch is 62 bytes for update_fast_ctr, 46 bytes for
> percpu_down_read and 20 bytes for percpu_up_read.

Still, I think the code looks cleaner this way, and personally I think
this is more important. Plus, this lessens the footprint for the caller,
although I agree this is minor.

Please send the incremental patch if you wish, I won't argue. But note
that with the lockdep annotations (and I'll send the patch soon) the
code will look even worse. Either you need another "if (val > 0)" check
or you need to add rwsem_acquire_read/rwsem_release into .h.

And if you make this change, please also update the comments; they still
refer to update_fast_ctr(), which you folded into down_up ;)

Oleg.
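
For readers following the thread, the two code shapes under discussion can
be sketched roughly as below. This is a single-threaded userspace toy, not
the code from either patch: struct toy_sem, its fields, and the toy_*
function names are hypothetical stand-ins. Only the name update_fast_ctr()
and the "if (val > 0)" check come from the messages above. Real per-CPU
counters, preemption control, blocking and wakeups are all omitted.

```c
#include <stdbool.h>

/* Hypothetical toy state; not the kernel's percpu_rw_semaphore layout. */
struct toy_sem {
    bool writer_active; /* stands in for "a writer holds or wants the lock" */
    int fast_ctr;       /* stands in for the per-CPU fast-path counter */
    int slow_ctr;       /* stands in for the shared slow-path counter */
};

/* Shape A: a shared static helper plus two thin wrappers. */
static bool update_fast_ctr(struct toy_sem *s, int val)
{
    if (s->writer_active)
        return false;   /* fast path refused, caller takes the slow path */
    s->fast_ctr += val;
    return true;
}

void toy_down_read_a(struct toy_sem *s)
{
    if (update_fast_ctr(s, +1))
        return;
    s->slow_ctr++;      /* slow path for acquire (blocking omitted) */
}

void toy_up_read_a(struct toy_sem *s)
{
    if (update_fast_ctr(s, -1))
        return;
    s->slow_ctr--;      /* slow path for release (wakeup omitted) */
}

/* Shape B: one folded function that takes the direction as an argument. */
void toy_down_up_read_b(struct toy_sem *s, int val)
{
    if (!s->writer_active) {
        s->fast_ctr += val;
        return;
    }
    /* The slow paths differ by direction, hence a check on val. */
    if (val > 0)
        s->slow_ctr++;
    else
        s->slow_ctr--;
}
```

In this toy both shapes update the counters identically. The disagreement in
the thread is about code size versus readability, and about where additional
per-direction work (such as the lockdep annotations mentioned above) would
have to go in the folded variant.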
