linux-kernel - Re: [PATCH 00/13] [RFC] Rust support


Message-ID: <YH0yCTgL0raKrmYg@hirez.programming.kicks-ass.net>
Date:   Mon, 19 Apr 2021 09:32:25 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     Wedson Almeida Filho <wedsonaf@...gle.com>, ojeda@...nel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        rust-for-linux@...r.kernel.org, linux-kbuild@...r.kernel.org,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 00/13] [RFC] Rust support

On Sat, Apr 17, 2021 at 04:51:58PM +0200, Paolo Bonzini wrote:
> On 16/04/21 09:09, Peter Zijlstra wrote:
> > Well, the obvious example would be seqlocks. C11 can't do them
> 
> Sure it can.  C11 requires annotating with (the equivalent of) READ_ONCE all
> reads of seqlock-protected fields, but the memory model supports seqlocks
> just fine.

How does that help?

IIRC there are two problems, one on each side of the lock. On the write
side we have:

	seq++;
	smp_wmb();
	X = r;
	Y = r;
	smp_wmb();
	seq++;

Which C11 simply cannot do right because it doesn't have wmb. You end up
having to use seq_cst for the first wmb or make both X and Y (on top of
the last seq) a store-release; both options are sub-optimal.
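
For illustration only, here is a minimal sketch of the second option (store-release on X, Y and the last seq) in C11 <stdatomic.h>. The struct seq_pair type and the seq_pair_write() helper are hypothetical names, not anything from the kernel or the patch series, and a single writer is assumed.

```c
#include <stdatomic.h>

/* Hypothetical two-field record guarded by a sequence counter. */
struct seq_pair {
	atomic_uint seq;
	atomic_int x;
	atomic_int y;
};

/* Assumes a single writer; concurrent writers would need an outer lock. */
static void seq_pair_write(struct seq_pair *p, int r)
{
	unsigned int s = atomic_load_explicit(&p->seq, memory_order_relaxed);

	/* First seq++: the counter becomes odd, so readers will retry. */
	atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);

	/*
	 * C11 has no store-store fence, so instead of the first smp_wmb()
	 * each data store is made a release. That orders it after the seq
	 * store above, at the cost of a stronger barrier per store.
	 */
	atomic_store_explicit(&p->x, r, memory_order_release);
	atomic_store_explicit(&p->y, r, memory_order_release);

	/* Second smp_wmb() plus seq++, folded into one release store. */
	atomic_store_explicit(&p->seq, s + 2, memory_order_release);
}
```

On a weakly ordered machine every release store typically costs more than a plain store behind a single wmb, which is the sub-optimal part.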

On the read side you get:

	do {
	  s = seq;
	  smp_rmb();
	  r1 = X;
	  r2 = Y;
	  smp_rmb();
	} while ((s&1) || seq != s);

And then you get into trouble with the last barrier: the first seq load
can be a load-acquire, which orders the loads of X and Y after it, but
you then need them to happen before the second seq load, for which you
need seq_cst, or you make X and Y load-acquire. Again, not optimal.
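
For illustration only, a minimal sketch of the load-acquire variant of the read side in C11 <stdatomic.h>. The struct seq_pair type and the seq_pair_read() helper are hypothetical names, and the sketch assumes the writer publishes X, Y and the final seq with store-release so that the acquire loads have something to pair with.

```c
#include <stdatomic.h>

/* Hypothetical two-field record guarded by a sequence counter. */
struct seq_pair {
	atomic_uint seq;
	atomic_int x;
	atomic_int y;
};

/* Retries until it observes a stable, even sequence count. */
static void seq_pair_read(struct seq_pair *p, int *r1, int *r2)
{
	unsigned int s;

	do {
		/* Acquire: the data loads below cannot move before this. */
		s = atomic_load_explicit(&p->seq, memory_order_acquire);

		/*
		 * C11 has no load-load fence to stand in for the last
		 * smp_rmb(), so the data loads themselves are acquires.
		 * That keeps the seq re-check below ordered after them.
		 */
		*r1 = atomic_load_explicit(&p->x, memory_order_acquire);
		*r2 = atomic_load_explicit(&p->y, memory_order_acquire);
	} while ((s & 1) ||
		 atomic_load_explicit(&p->seq, memory_order_relaxed) != s);
}
```

Every protected field pays for an acquire here, where the kernel version needs only one rmb for the whole group.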

I have also seen *many* broken variants of it on the web. Some work on
x86 but are totally broken when you build them on LL/SC ARM64.