lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Tue, 29 Jan 2013 14:06:42 -0800
From:	Tejun Heo <tj@...nel.org>
To:	Kent Overstreet <koverstreet@...gle.com>
Cc:	Oleg Nesterov <oleg@...hat.com>, srivatsa.bhat@...ux.vnet.ibm.com,
	rusty@...tcorp.com.au, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] generic dynamic per cpu refcounting

Hey, Kent.

On Tue, Jan 29, 2013 at 01:45:37PM -0800, Kent Overstreet wrote:
> It would definitely be cleaner if the global counter was also 32 bits (I
> probably should've done it that way at first) but it works with it being
> bigger _provided the sum of the percpu counters is sign extended_ when
> collecting them up at the end.
> 
> With the bias though we kind of want the global counter to be bigger...
> the bias needs to fit in the global counter.

Yeah, the size being 64bit is fine as long as the summation and the
result is done in 32bits.

> > I don't know.  I think it's a correctness issue.  If you
> > over/underflow into global counter, you need buffer space to be
> > (practically) correct. 
> 
> Not quite sure what you mean... 

If you compute w/o sign extension, it becomes buggy.

> It doesn't really make sense to use more than 32 bits for the global sum
> (ignoring the bias value... that complicates it) but it does work, with
> the sign extending. I haven't figured out a good way of explaining why
> though... I should probably look at what happens without it again. (I
> missed the sign extending when I first coded it, had to debug that).

I think it would be easier to understand if we use only the lower
32bits for the actual counting.

> > There really is no point in adding complexity which isn't needed.  It
> > slows down the code unnecessarily (however minute) and adds
> > maintenance overhead.  Why do that when there's no need?
> 
> Well, like I said I want this to be a drop in replacement for atomic_t
> anyone can use without having to think about the potential overhead.
> 
> If it really does turn out there aren't enough uses for it to be worth
> it I'll be ok with ripping out the dynamic alloc... but the slowdown is
> pretty damn minute (one instruction in the fast path) and the additional
> complexity is fairly minor and quite self contained so I'm just not
> seeing the urgency.

I really don't think this can be drop-in replacement given the
significant overhead on base ref put and you build up complexity with
use cases not the other way around.  When you're looking at each
specific implementation, it might not look like much but it really
sucks to have a lot of unneeded / unused complexities around.  It
raises maintenance overhead for no good reason.  Like you said, it's
not like extending the design is difficult if we find out that the
dynamic allocation is actually useful, and there just is no reason to
do it the other way around.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ