linux-kernel - Re: [PATCH 04/21] Generic percpu refcounting

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20130529011135.GA2874@mtj.dyndns.org>
Date:	Wed, 29 May 2013 10:11:35 +0900
From:	Tejun Heo <tj@...nel.org>
To:	Kent Overstreet <koverstreet@...gle.com>
Cc:	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-aio@...ck.org, akpm@...ux-foundation.org,
	Zach Brown <zab@...hat.com>, Felipe Balbi <balbi@...com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Mark Fasheh <mfasheh@...e.com>,
	Joel Becker <jlbec@...lplan.org>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Jens Axboe <axboe@...nel.dk>,
	Asai Thambi S P <asamymuthupa@...ron.com>,
	Selvan Mani <smani@...ron.com>,
	Sam Bradshaw <sbradshaw@...ron.com>,
	Jeff Moyer <jmoyer@...hat.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	Benjamin LaHaise <bcrl@...ck.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH 04/21] Generic percpu refcounting

Yo,

On Tue, May 28, 2013 at 04:47:28PM -0700, Kent Overstreet wrote:
> > It'd be great if that is explained clearly in more intuitive way.  The
> > only actual explanation above is "modular arithmatic is commutative"
> > which is a very compact way to put it and I really think it deserves
> > an easier explanation.
> 
> I'm not sure I know of any good way of explaining it intuitively, but
> here's this at least...
> 
>  * (More precisely: because moduler arithmatic is commutative the sum of all the
>  * pcpu_count vars will be equal to what it would have been if all the gets and
>  * puts were done to a single integer, even if some of the percpu integers
>  * overflow or underflow).

Yeah, that's much better.

> And we can't do more puts than there have been gets - because the sum
> can't be negative. So the most puts() we can do at any given time is the
> real count, or sum of the percpu ref and atomic_t.
> 
> Therefore, the amount the atomic_t can go negative is bounded by the
> maximum value of the refcount.

Ah, okay, I thought you were collecting the percpu counters directly
into the global counter.  You're staging it into a temp counter and
then adding it into the global counter after the summing is complete.
Yeap, that should be fine then.  It'd be worthwhile to document the
importance of not adding it directly to the global counter.

> > I probably should have made it clearer.  Sorry about that.  tryget()
> > is fine.  I was curious about count() as it's always a bit dangerous a
> > query interface which is racy and can return something unexpected like
> > false zero or underflowed refcnt.
> 
> Yeah, it is, it was intended just for the module code where it's only
> used for the value lsmod shows.

Let's document so then and limit the range returned.  We require the
refcnt to be alive and it'd be a good way to both protect from and
deter creative usages.

> > Let's just have percpu_ref_kill(ref, release) which puts the base ref
> > and invokes release whenever it's done.
> 
> Release has to be stored in struct percpu_ref() so it can be invoked
> after a call_rcu() (percpu_ref_kill -> call_rcu() ->
> percpu_ref_kill_rcu() -> percpu_ref_put()) so I'm passing it to
> percpu_ref_init(), but yeah.

Yeah, I'm a bit torn about where to put the release function.  For me,
as we have an API which is dedicated to killing a refcnt, it does make
sense to put it there but it's really in the realm of bikeshedding so
choose whatever you wanna choose.

Thanks!

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/