Date: Fri, 5 Apr 2024 16:18:29 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Julia Lawall <julia.lawall@...ia.fr>
Cc: Arnd Bergmann <arnd@...db.de>, linux-kernel@...r.kernel.org
Subject: Re: Finding open-coded workarounds for 1/2-byte cmpxchg()?

On Sat, Apr 06, 2024 at 01:00:35AM +0200, Julia Lawall wrote:
> 
> 
> On Thu, 4 Apr 2024, Paul E. McKenney wrote:
> 
> > Hello, Julia!
> >
> > I hope that things are going well for you and yours.
> >
> > TL;DR: Would you or one of your students be interested in looking for
> > some interesting code patterns involving cmpxchg?  If such patterns exist,
> > we would either need to provide fixes or to drop support for old systems.
> >
> > If this would be of interest, please read on!
> >
> > Arnd (CCed) and I are looking for open-coded emulations for one-byte
> > and two-byte cmpxchg().  Such emulations might be attempting to work
> > around the fact that not all architectures support those sizes, being
> > as they are only required to support four-byte cmpxchg() and, if they
> > are 64-bit architectures, eight-byte cmpxchg().
> >
> > There is a one-byte emulation in RCU (kernel/rcu/tasks.h), which looks
> > like this:
> >
> > ------------------------------------------------------------------------
> >
> > u8 rcu_trc_cmpxchg_need_qs(struct task_struct *t, u8 old, u8 new)
> > {
> > 	union rcu_special ret;
> > 	union rcu_special trs_old = READ_ONCE(t->trc_reader_special);
> > 	union rcu_special trs_new = trs_old;
> >
> > 	if (trs_old.b.need_qs != old)
> > 		return trs_old.b.need_qs;
> > 	trs_new.b.need_qs = new;
> > 	ret.s = cmpxchg(&t->trc_reader_special.s, trs_old.s, trs_new.s);
> > 	return ret.b.need_qs;
> > }
> >
> > ------------------------------------------------------------------------
> >
> > An additional issue is posed by these, also in kernel/rcu/tasks.h:
> >
> > ------------------------------------------------------------------------
> >
> > 	if (trs.b.need_qs == (TRC_NEED_QS_CHECKED | TRC_NEED_QS)) {
> >
> > 	return smp_load_acquire(&t->trc_reader_special.b.need_qs);
> >
> > 	smp_store_release(&t->trc_reader_special.b.need_qs, v);
> >
> > ------------------------------------------------------------------------
> >
> > The additional issue is that these statements assume that each CPU
> > architecture has single-byte load and store instructions, which some of
> > the older Alpha systems do not.  Fortunately for me, Arnd was already
> > thinking in terms of removing support for these systems.
> >
> > But there are additional systems that do not support 16-bit loads and
> > stores.  So if there is a 16-bit counterpart to rcu_trc_cmpxchg_need_qs()
> > on a quantity that is also subject to 16-bit loads or stores, either
> > that function needs adjustment or a few more ancient systems need to
> > lose their Linux-kernel support.
> >
> > Again, is looking for this sort of thing something that you or one of
> > your students would be interested in?
> 
> Hello,
> 
> I tried, but without much success.  The following looks a little bit
> promising, e.g. the use of the variable name "want", but it's not clear that
> the rest of the context fits the pattern.

Thank you for digging into this!!!

> diff -u -p /home/julia/linux/net/sunrpc/xprtsock.c /tmp/nothing/net/sunrpc/xprtsock.c
> --- /home/julia/linux/net/sunrpc/xprtsock.c
> +++ /tmp/nothing/net/sunrpc/xprtsock.c
> @@ -690,12 +690,9 @@ xs_read_stream(struct sock_xprt *transpo
>  		if (ret <= 0)
>  			goto out_err;
>  		transport->recv.offset = ret;
> -		if (transport->recv.offset != want)
> -			return transport->recv.offset;

Agreed, though you are quite right that ->recv.copied and ->recv.offset
are different lengths.  But yes, as you suggest below, there must be
a cmpxchg() of some type (cmpxchg(), cmpxchg_acquire(), ...) in the mix
somewhere.  Also, the cmpxchg() must be applied to a pointer to either
a 32-bit or a 64-bit quantity, but the change must be 16 bits (or 8 bits).
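To make that constraint concrete, here is a minimal userspace sketch of one
shape such code might take: a 32-bit compare-and-exchange in which only an
8-bit field changes, selected by mask and shift.  It uses C11 <stdatomic.h>
rather than the kernel's cmpxchg(), and the function name, the "shift"
parameter, and the retry loop are all assumptions made for illustration,
not code taken from the kernel tree.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): compare-and-exchange the 8-bit
 * field that starts "shift" bits above bit 0 of a 32-bit word.  Returns
 * the field value that was observed, in the style of cmpxchg().
 */
static uint8_t emulated_cmpxchg_u8(_Atomic uint32_t *word, unsigned int shift,
				   uint8_t old, uint8_t new)
{
	uint32_t mask = (uint32_t)0xff << shift;
	uint32_t cur = atomic_load(word);
	uint32_t nxt;
	uint8_t seen;

	for (;;) {
		seen = (uint8_t)((cur & mask) >> shift);
		if (seen != old)
			return seen;	/* Field mismatch: report what was seen. */
		/* Only the selected 8 bits differ between cur and nxt. */
		nxt = (cur & ~mask) | ((uint32_t)new << shift);
		if (atomic_compare_exchange_weak(word, &cur, nxt))
			return old;
		/* On failure cur holds the freshly observed word; retry. */
	}
}
```

A search pattern would presumably need to recognize the mask-and-shift
arithmetic feeding the wider compare-and-exchange, which is part of what
makes the general case hard to pin down.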

> The semantic patch in question was:
> 
> @r@
> expression olde;
> idexpression old;
> @@
> 
> if (olde != old) { ... return olde; }
> 
> @@
> expression newe != r.olde;
> idexpression nw;
> expression r.olde;
> idexpression r.old;
> @@
> 
> *if (olde != old) { ... return olde; }
> ...
> *newe = nw;
> ...
> *return newe;
> 
> The semantic patch doesn't include the cmpxchg.  I wasn't sure if that
> would always be present, or in what form.

It would be, but I am having trouble characterizing exactly what the
pattern would look like beyond "emulating a 16-bit cmpxchg() using either
a 32-bit cmpxchg() or a 64-bit cmpxchg()".  :-(
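For reference, a 16-bit analogue of rcu_trc_cmpxchg_need_qs() might look
something like the userspace sketch below, which overlays two 16-bit
fields on a 32-bit word through a union.  This is only a sketch under
stated assumptions: the union pair32 type, its field names, and the
function name are hypothetical, C11 <stdatomic.h> stands in for the
kernel's cmpxchg(), and the retry loop is a choice made here that the
RCU function quoted above does not make.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical container: two 16-bit fields sharing one 32-bit word. */
union pair32 {
	struct {
		uint16_t a;
		uint16_t b;
	} h;
	uint32_t w;
};

/*
 * Hypothetical emulation of a 16-bit compare-and-exchange on field "a",
 * built on a 32-bit compare-and-exchange of the containing word.
 * Returns the value of "a" that was observed.
 */
static uint16_t emulated_cmpxchg_u16_a(_Atomic uint32_t *word,
				       uint16_t old, uint16_t new)
{
	union pair32 cur, nxt;

	cur.w = atomic_load(word);
	for (;;) {
		if (cur.h.a != old)
			return cur.h.a;	/* Early exit, as in the RCU code. */
		nxt = cur;
		nxt.h.a = new;		/* Field "b" is carried over unchanged. */
		if (atomic_compare_exchange_weak(word, &cur.w, nxt.w))
			return old;
		/* cur.w now holds the observed word; recheck and retry. */
	}
}
```

The early "if (field != old) return field;" test followed by a store into
a copy and a full-width compare-and-exchange is the part that resembles
the RCU function, but as the mask-and-shift style shows, the same effect
can be written without any union at all.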

Thank you again, and something to think more about.

							Thanx, Paul
