linux-kernel - Re: Do we need to correct barriering in circular-buffers.rst?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKwvOd=pZYiozmGv+DVpzJ1u9_0k4CXb3M1EAcu22DQF+bW0fA@mail.gmail.com>
Date:   Fri, 27 Sep 2019 13:43:18 -0700
From:   Nick Desaulniers <ndesaulniers@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Andrea Parri <parri.andrea@...il.com>,
        David Howells <dhowells@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Will Deacon <will@...nel.org>,
        "Paul E. McKenney" <paulmck@...ux.ibm.com>,
        Mark Rutland <mark.rutland@....com>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        jose.marchesi@...cle.com
Subject: Re: Do we need to correct barriering in circular-buffers.rst?

On Fri, Sep 27, 2019 at 5:49 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Fri, Sep 27, 2019 at 11:51:07AM +0200, Andrea Parri wrote:
>
> > For the record, the LKMM doesn't currently model "order" derived from
> > control dependencies to a _plain_ access (even if the plain access is
> > a write): in particular, the following is racy (as far as the current
> > LKMM is concerned):
> >
> > C rb
> >
> > { }
> >
> > P0(int *tail, int *data, int *head)
> > {
> >       if (READ_ONCE(*tail)) {
> >               *data = 1;
> >               smp_wmb();
> >               WRITE_ONCE(*head, 1);
> >       }
> > }
> >
> > P1(int *tail, int *data, int *head)
> > {
> >       int r0;
> >       int r1;
> >
> >       r0 = READ_ONCE(*head);
> >       smp_rmb();
> >       r1 = *data;
> >       smp_mb();
> >       WRITE_ONCE(*tail, 1);
> > }
> >
> > Replacing the plain "*data = 1" with "WRITE_ONCE(*data, 1)" (or doing
> > s/READ_ONCE(*tail)/smp_load_acquire(tail)) suffices to avoid the race.
> > Maybe I'm short of imagination this morning...  but I can't currently
> > see how the compiler could "break" the above scenario.
>
> The compiler; if sufficiently smart; is 'allowed' to change P0 into
> something terrible like:
>
>         *data = 1;
>         if (*tail) {
>                 smp_wmb();
>                 *head = 1;
>         } else
>                 *data = 0;

I don't think so.  This snippet has different side effects than P0.
P0 never assigned *data to zero, this snippet does.  P0 *may* assign
*data to 1.  This snippet will unconditionally assign to *data,
conditionally 1 or 0.  I think the REVERSE transform (from your
snippet to P0) would actually be legal, but IANALL (I am not a
language lawyer; haven't yet passed the BAR).

>
>
> (assuming it knows *data was 0 from a prior store or something)

Oh, in that case I'm less sure (I still don't think so, but I would
love to be proven wrong, preferably with a godbolt link).  I think the
best would be to share a godbolt.org link to a case that's clearly
broken, or cite the relevant part of the ISO C standard (which itself
leaves room for interpretation), otherwise the discussion is too
hypothetical.  Those two things are single-handedly the best way to
communicate with compiler folks.

>
> Using WRITE_ONCE() defeats this because volatile indicates external
> visibility.

Could data be declared as a pointer to volatile qualified int?

>
> > I also didn't spend much time thinking about it.  memory-barriers.txt
> > has a section "CONTROL DEPENDENCIES" dedicated to "alerting developers
> > using control dependencies for ordering".  That's quite a long section
> > (and probably still incomplete); the last paragraph summarizes:  ;-)
>
> Barring LTO the above works for perf because of inter-translation-unit
> function calls, which imply a compiler barrier.
>
> Now, when the compiler inlines, it looses that sync point (and thereby
> subtlely changes semantics from the non-inline variant). I suspect LTO
> does the same and can cause subtle breakage through this transformation.

Do you have a bug report or godbolt link for the above?  I trust that
you're familiar enough with the issue to be able to quickly reproduce
it?  These descriptions of problems are difficult for me to picture in
code or generated code, and when I try to read through
memory-barriers.txt my eyes start to glaze over (then something else
catches fire and I have to go put that out).  Having a concise test
case I think would better illustrate potential issues with LTO that
we'd then be able to focus on trying to fix/support.

We definitely have heavy hitting language lawyers and our LTO folks
are super sharp; I just don't have the necessary compiler experience
just yet to be as helpful in these discussions as we need but I'm
happy to bring them cases that don't work for the kernel and drive
their resolution.

>
> > (*) Compilers do not understand control dependencies.  It is therefore
> >     your job to ensure that they do not break your code.
>
> It is one the list of things I want to talk about when I finally get
> relevant GCC and LLVM people in the same room ;-)
>
> Ideally the compiler can be taught to recognise conditionals dependent
> on 'volatile' loads and disallow problematic transformations around
> them.
>
> I've added Nick (clang) and Jose (GCC) on Cc, hopefully they can help
> find the right people for us.
-- 
Thanks,
~Nick Desaulniers