linux-kernel - Re: Finding open-coded workarounds for 1/2-byte cmpxchg()?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <d0c0e4c6-0791-448b-90ec-0bb4edcb8518@paulmck-laptop>
Date: Tue, 9 Apr 2024 12:25:38 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Julia Lawall <julia.lawall@...ia.fr>
Cc: Arnd Bergmann <arnd@...db.de>, linux-kernel@...r.kernel.org
Subject: Re: Finding open-coded workarounds for 1/2-byte cmpxchg()?

On Fri, Apr 05, 2024 at 04:18:29PM -0700, Paul E. McKenney wrote:
> On Sat, Apr 06, 2024 at 01:00:35AM +0200, Julia Lawall wrote:
> > 
> > 
> > On Thu, 4 Apr 2024, Paul E. McKenney wrote:
> > 
> > > Hello, Julia!
> > >
> > > I hope that things are going well for you and yours.
> > >
> > > TL;DR: Would you or one of your students be interested in looking for
> > > some interesting code patterns involving cmpxchg?  If such patterns exist,
> > > we would either need to provide fixes or to drop support for old systems.
> > >
> > > If this would be of interest, please read on!
> > >
> > > Arnd (CCed) and I are looking for open-coded emulations for one-byte
> > > and two-byte cmpxchg().  Such emulations might be attempting to work
> > > around the fact that not all architectures support those sizes, being
> > > as they are only required to support four-byte cmpxchg() and, if they
> > > are 64-bit architectures, eight-byte cmpxchg().
> > >
> > > There is a one-byte emulation in RCU (kernel/rcu/tasks.h), which looks
> > > like this:
> > >
> > > ------------------------------------------------------------------------
> > >
> > > u8 rcu_trc_cmpxchg_need_qs(struct task_struct *t, u8 old, u8 new)
> > > {
> > > 	union rcu_special ret;
> > > 	union rcu_special trs_old = READ_ONCE(t->trc_reader_special);
> > > 	union rcu_special trs_new = trs_old;
> > >
> > > 	if (trs_old.b.need_qs != old)
> > > 		return trs_old.b.need_qs;
> > > 	trs_new.b.need_qs = new;
> > > 	ret.s = cmpxchg(&t->trc_reader_special.s, trs_old.s, trs_new.s);
> > > 	return ret.b.need_qs;
> > > }
> > >
> > > ------------------------------------------------------------------------
> > >
> > > An additional issue is posed by these, also in kernel/rcu/tasks.h:
> > >
> > > ------------------------------------------------------------------------
> > >
> > > 	if (trs.b.need_qs == (TRC_NEED_QS_CHECKED | TRC_NEED_QS)) {
> > >
> > > 	return smp_load_acquire(&t->trc_reader_special.b.need_qs);
> > >
> > > 	smp_store_release(&t->trc_reader_special.b.need_qs, v);
> > >
> > > ------------------------------------------------------------------------
> > >
> > > The additional issue is that these statements assume that each CPU
> > > architecture has single-byte load and store instructions, which some of
> > > the older Alpha systems do not.  Fortunately for me, Arnd was already
> > > thinking in terms of removing support for these systems.
> > >
> > > But there are additional systems that do not support 16-bit loads and
> > > stores.  So if there is a 16-bit counterpart to rcu_trc_cmpxchg_need_qs()
> > > on a quantity that is also subject to 16-bit loads or stores, either
> > > that function needs adjustment or a few more ancient systems need to
> > > lose their Linux-kernel support.
> > >
> > > Again, is looking for this sort of thing something that you or one of
> > > your students would be interested in?
> > 
> > Hello,
> > 
> > I tried, but without much success.  The following looks a little bit
> > promising, eg the use of the variable name "want", but it's not clear that
> > the rest of the context fits the pattern.
> 
> Thank you for digging into this!!!
> 
> > diff -u -p /home/julia/linux/net/sunrpc/xprtsock.c
> > /tmp/nothing/net/sunrpc/xprtsock.c
> > --- /home/julia/linux/net/sunrpc/xprtsock.c
> > +++ /tmp/nothing/net/sunrpc/xprtsock.c
> > @@ -690,12 +690,9 @@ xs_read_stream(struct sock_xprt *transpo
> >  		if (ret <= 0)
> >  			goto out_err;
> >  		transport->recv.offset = ret;
> > -		if (transport->recv.offset != want)
> > -			return transport->recv.offset;
> 
> Agreed, though you are quite right that ->recv.copied and ->recv.offset
> are different lengths.  But yes, as you sugggest below, there must be
> a cmpxchg() of some type (cmpxchg(), cmpxchg_acquire(), ...) in the mix
> somewhere.  Also, the cmpxchg() must be applied to a pointer to either
> a 32-bit or a 64-bit quantity, but the change must be 16 bits (or 8 bits).
> 
> > The semantic patch in question was:
> > 
> > @r@
> > expression olde;
> > idexpression old;
> > @@
> > 
> > if (olde != old) { ... return olde; }
> > 
> > @@
> > expression newe != r.olde;
> > idexpression nw;
> > expression r.olde;
> > idexpression r.old;
> > @@
> > 
> > *if (olde != old) { ... return olde; }
> > ...
> > *newe = nw;
> > ...
> > *return newe;
> > 
> > The semantic patch doesn't include the cmpxchg.  I wasn't sure if that
> > would always be present, or in what form.
> 
> It would be, but I am having trouble characterizing exactly what the
> pattern would look like beyond "emulating a 16-bit cmpxchg() using either
> a 32-bit cmpxchg() or a 64-bit cmpxchg()".  :-(
> 
> Thank you again, and something to think more about.

I took the crude approach of looking at all of the cmpxchg*() invocations,
discarding those that were clearly not an issue.  Here are the close calls
that I found:

o	drivers/misc/genwqe/card_ddcb.c enqueue_ddcb() does work against
	a union that has 32-bit, 16-bit, and eight-bit members, but as
	far as I can see the ->icrc_16 member is not used.  But this
	might be an accident waiting to happen.  Or maybe this driver
	is used only by architectures with a full set of cmpxchg sizes.

	The 8-bit ->hsi and ->shi fields are used for debug output, which
	should be harmless.

	Ditto __genwqe_purge_ddcb() that same file.

o	drivers/platform/surface/aggregator/controller.c ssh_seq_next()
	does an 8-bit cmpxchg().

o	drivers/platform/surface/aggregator/controller.c ssh_rqid_next()
	does a 16-bit cmpxchg().

o	kernel/locking/qspinlock_paravirt.h pv_wait_node() does
	an 8-bit cmpxchg().  As does pv_kick_node() in that same
	file.  And __pv_queued_spin_unlock().  And, as Arnd noted,
	trylock_clear_pending() and pv_hybrid_queued_unfair_trylock()
	in that same file do 16-bit cmpxchg_acquire().

o	net/rxrpc/io_thread.c rxrpc_input_packet_on_conn() is strange
	in that it supplies 16-bit old and new fields to a cmpxchg()
	of a 32-bit quantity, but it is quite possible that this would
	be a 16-bit quantity if permitted in core code.  This is in an
	rxrpc_connection structure.

There are no doubt some false negatives omitted from this list, but
there are a few places that use or would like to use 8-bit and 16-bit
cmpxchg*().

							Thanx, Paul