linux-kernel - Re: [PATCH v2 8/9] atomic,x86: Alternative atomic_*

Message-ID: <YbeC9ySoLlfKOZPq@elver.google.com>
Date:   Mon, 13 Dec 2021 18:29:27 +0100
From:   Marco Elver <elver@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     will@...nel.org, boqun.feng@...il.com,
        linux-kernel@...r.kernel.org, x86@...nel.org, mark.rutland@....com,
        keescook@...omium.org, hch@...radead.org,
        torvalds@...ux-foundation.org, axboe@...nel.dk
Subject: Re: [PATCH v2 8/9] atomic,x86: Alternative atomic_*_overflow() scheme

On Mon, Dec 13, 2021 at 05:43PM +0100, Peter Zijlstra wrote:
> On Fri, Dec 10, 2021 at 05:16:26PM +0100, Peter Zijlstra wrote:
> > Shift the overflow range from [0,INT_MIN] to [-1,INT_MIN], this allows
> > optimizing atomic_inc_overflow() to use "jle" to detect increment
> > from free-or-negative (with -1 being the new free and its increment
> > being 0 which sets ZF).
> > 
> > This then obviously changes atomic_dec*_overflow() since it must now
> > detect the 0->-1 transition rather than the 1->0. Luckily this is
> > reflected in the carry flag (since we need to borrow to decrement 0).
> > However this means decrement must now use the SUB instruction with a
> > literal, since DEC doesn't set CF.
> > 
> > This then gives the following primitives:
> > 
> > [-1, INT_MIN]					[0, INT_MIN]
> > 
> > inc()						inc()
> > 	lock inc %[var]					mov       $-1, %[reg]
> > 	jle	error-free-or-negative			lock xadd %[reg], %[var]
> > 							test      %[reg], %[reg]
> > 							jle	  error-zero-or-negative
> > 
> > dec()                                           dec()
> > 	lock sub $1, %[var]				lock dec %[var]
> > 	jc	error-to-free				jle	error-zero-or-negative
> > 	jl	error-from-negative
> > 
> > dec_and_test()                                  dec_and_test()
> > 	lock sub $1, %[var]				lock dec %[var]
> > 	jc	do-free					jl	error-from-negative
> > 	jl	error-from-negative			je	do-free
> > 
> > Make sure to set ATOMIC_OVERFLOW_OFFSET to 1 such that other code
> > interacting with these primitives can re-center 0.
> 
> So Marco was expressing doubt about this exact interface for the
> atomic_*_overflow() functions, since it's extremely easy to get the
> whole ATOMIC_OVERFLOW_OFFSET thing wrong.
> 
> Since the current ops are strictly those that require inline asm, the
> interface is fairly incomplete, which forces anybody who's going to use
> these to provide whatever is missing, e.g.
> atomic_inc_not_zero_overflow().
> 
> Another proposal had the user supply the offset as a compile time
> constant to the function itself, raising a build-bug for any unsupported
> offset. This would ensure the caller is at least aware of any non-zero
> offset... still not going to really be dummy proof either.

In the spirit of making the interface harder to misuse, this would at
least ensure that non-refcount_t code that wants to use
atomic_*_overflow() is 100% aware of the offset, which I think is half
of the issue.

The other half is code using the actual values, and ensuring it's offset
correctly. This might also be an issue in e.g. refcount_t, if someone
wants to modify or extend it, although it's easy enough to audit and
review in such central data structures as refcount_t.
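
To make the "using the actual values" half concrete, here is a rough
userspace model of the [-1, INT_MIN] scheme using C11 atomics instead of
inline asm. All sketch_* names are made up for illustration, and
SKETCH_OVERFLOW_OFFSET only stands in for ATOMIC_OVERFLOW_OFFSET; this
is not the code from the patch:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the patch's ATOMIC_OVERFLOW_OFFSET (1 on x86). */
#define SKETCH_OVERFLOW_OFFSET 1

enum sketch_dec_result {
	SKETCH_DEC_OK,		/* count is still positive */
	SKETCH_DEC_FREE,	/* last reference dropped (raw 0 -> -1) */
	SKETCH_DEC_ERROR,	/* decrement from a negative raw value */
};

/* Store a logical count in its raw, offset representation. */
static inline void sketch_set(atomic_int *v, int logical)
{
	atomic_store(v, logical - SKETCH_OVERFLOW_OFFSET);
}

/* Re-center the raw value so that callers see the logical count. */
static inline int sketch_read(atomic_int *v)
{
	return atomic_load(v) + SKETCH_OVERFLOW_OFFSET;
}

/*
 * Models "lock inc; jle error-free-or-negative": the increment is an
 * error if the old raw value was -1 (free) or negative.
 */
static inline bool sketch_inc(atomic_int *v)
{
	int old = atomic_fetch_add(v, 1);

	return old >= 0;
}

/* Models "lock sub $1; jc do-free; jl error-from-negative". */
static inline enum sketch_dec_result sketch_dec_and_test(atomic_int *v)
{
	int old = atomic_fetch_sub(v, 1);

	if (old == 0)		/* borrow, i.e. CF set: 0 -> -1 */
		return SKETCH_DEC_FREE;
	if (old < 0)		/* was already free or saturated */
		return SKETCH_DEC_ERROR;
	return SKETCH_DEC_OK;
}
```

Any code that calls atomic_load() on such a variable directly, instead
of going through something like sketch_read(), sees a value that is off
by one.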

> Alternatively we could provide a more complete set of ops and/or a whole
> new type, but... I'm not sure about that either.
> 
> I suppose I can try and do something like refcount_overflow_t and
> implement the whole current refcount API in terms of that. Basically
> everywhere we currently do refcount_warn_saturate() would become goto
> label.
> 
> And then refcount_t could be a thin wrapper on top of that. But urgh...
> lots of work, very little gain.
> 
> So what do we do? Keep things as is, and think about it again once we
> got the first bug in hand, preemptively add a few ops or go completely
> overboard?
> 
> Obviously I'm all for keeping things as is (less work for this lazy
> bastard etc..)

I think an entirely new type might be overkill, but at the very least
the interface should be designed such that it either

	A. makes it impossible not to notice that atomic_*_overflow()
	   works in terms of offsets, or
	B. does not expose this detail at all.

#A can be achieved by supplying offsets to atomic_*_overflow(). #B can
be achieved with new wrapper types -- however, if we somehow ensure that
refcount_t remains the only user of atomic_*_overflow(), I'd consider
refcount_t a wrapper type already, so no need to add more.
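
A minimal sketch of what #A could look like, again as a userspace model
with C11 atomics. The sketch_* names and SKETCH_ARCH_OFFSET are
hypothetical, and the kernel would presumably use BUILD_BUG_ON() or
similar rather than _Static_assert; the point is only that a caller
passing the wrong (or no) offset fails to build:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical: the only offset this "architecture" supports. */
#define SKETCH_ARCH_OFFSET 1

/*
 * Break the build unless the caller passes the supported offset as a
 * compile-time constant. The _Static_assert lives in a struct body so
 * that the check can be used inside an expression.
 */
#define SKETCH_CHECK_OFFSET(offset)					\
	((void)sizeof(struct {						\
		_Static_assert((offset) == SKETCH_ARCH_OFFSET,		\
			       "unsupported overflow offset");		\
		int unused;						\
	}))

/* Raw op: error if the old raw value was -1 (free) or negative. */
static inline bool sketch_inc_overflow_raw(atomic_int *v)
{
	int old = atomic_fetch_add(v, 1);

	return old >= 0;
}

/* Callers must spell out the offset they assume at every call site. */
#define sketch_inc_overflow(v, offset)					\
	(SKETCH_CHECK_OFFSET(offset), sketch_inc_overflow_raw(v))
```

With this, sketch_inc_overflow(&v, 1) compiles, whereas
sketch_inc_overflow(&v, 0) trips the static assertion. It does nothing
for code that reads or initializes the raw value without going through
the macro.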

Regarding the interface, it'd be nice if it could be made harder to
misuse, but I don't know how much it'll buy over what it is right now,
since we don't even know if there'll be other users of this yet.

But here are some more issues I just thought of:

	1. A minor issue is inspecting raw values, like in register
	   dumps. refcount_t will now look different on x86 vs. other
	   architectures.

	2. A potentially larger issue is if some code kmalloc()s some
	   structs containing refcount_t, and relies on __GFP_ZERO
	   (kzalloc()) to initialize their data, assuming that a
	   freshly initialized refcount_t contains 0.

I think #1 is a cosmetic issue, which we might be able to live with.

However, I have absolutely no idea how we can audit or even prevent #2
from happening. With #2 in mind, and given that C has no way of
enforcing any kind of "constructors", the interface and implementation
we end up with is going to result in nearly impossible-to-debug issues
sooner or later.
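
To illustrate #2, here is a small userspace model where calloc() stands
in for kzalloc(). The struct, the sketch_* helpers and
SKETCH_OVERFLOW_OFFSET are all made up for the example; it only assumes
that the raw value is stored with an offset of 1 as in the [-1, INT_MIN]
scheme:

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for ATOMIC_OVERFLOW_OFFSET on x86. */
#define SKETCH_OVERFLOW_OFFSET 1

struct sketch_obj {
	atomic_int refs_raw;	/* raw value is the logical count minus 1 */
	int payload;
};

/* Logical count as seen through the offset-aware accessor. */
static inline int sketch_obj_refs(struct sketch_obj *o)
{
	return atomic_load(&o->refs_raw) + SKETCH_OVERFLOW_OFFSET;
}

/* Zeroing allocation with no explicit refcount initialization. */
static inline struct sketch_obj *sketch_obj_zalloc(void)
{
	return calloc(1, sizeof(struct sketch_obj));
}

/* Allocation with an explicit "constructor" for the count. */
static inline struct sketch_obj *sketch_obj_alloc(int refs)
{
	struct sketch_obj *o = calloc(1, sizeof(*o));

	if (o)
		atomic_store(&o->refs_raw, refs - SKETCH_OVERFLOW_OFFSET);
	return o;
}
```

In this model an object from sketch_obj_zalloc() reports a count of 1
rather than 0, so code that assumed "zeroed means no references" changes
behaviour silently, and nothing in C forces callers to use
sketch_obj_alloc() instead.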

Thanks,
-- Marco