linux-kernel - Re: [PATCH V2] MIPS: implement smp_cond_load

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20180710161423.GS3593@linux.vnet.ibm.com>
Date:   Tue, 10 Jul 2018 09:14:23 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     陈华才 <chenhc@...ote.com>,
        Paul Burton <paul.burton@...s.com>,
        Ralf Baechle <ralf@...ux-mips.org>,
        James Hogan <jhogan@...nel.org>,
        linux-mips <linux-mips@...ux-mips.org>,
        Fuxin Zhang <zhangfx@...ote.com>,
        wuzhangjin <wuzhangjin@...il.com>,
        stable <stable@...r.kernel.org>,
        Alan Stern <stern@...land.harvard.edu>,
        Andrea Parri <andrea.parri@...rulasolutions.com>,
        Will Deacon <will.deacon@....com>,
        Boqun Feng <boqun.feng@...il.com>,
        Nicholas Piggin <npiggin@...il.com>,
        David Howells <dhowells@...hat.com>,
        Jade Alglave <j.alglave@....ac.uk>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Akira Yokosawa <akiyks@...il.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH V2] MIPS: implement smp_cond_load_acquire() for Loongson-3

On Tue, Jul 10, 2018 at 02:17:27PM +0200, Peter Zijlstra wrote:
> 
> Please!! Learn to use email.
> 
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
> 
> Also, wrap non-quoted lines to 78 characters.
> 
> On Tue, Jul 10, 2018 at 07:45:22PM +0800, 陈华才 wrote:
> > I'm afraid that you have missing something......
> > 
> > Firstly, our previous conclusion (READ_ONCE need a barrier to avoid
> > 'reads prioritised over writes') is totally wrong. So define
> > cpu_relax() to smp_mb() like ARM11MPCore is incorrect, even if it can
> > 'solve' Loongson's problem.  Secondly, I think the real problem is
> > like this:
> 
> >  1, CPU0 set the lock to 0, then do something;
> >  2, While CPU0 is doing something, CPU1 set the flag to 1 with
> >     WRITE_ONCE(), and then wait the lock become to 1 with a READ_ONCE()
> >     loop;
> >  3, After CPU0 complete its work, it wait the flag become to 1, and if
> >     so then set the lock to 1;
> >  4, If the lock becomes to 1, CPU1 will leave the READ_ONCE() loop.

Are there specific loops in the kernel whose conditions are controlled
by READ_ONCE() that don't contain cpu_relax(), smp_mb(), etc.?  One
way to find them given your description of your hardware is to make
cpu_relax() be smp_mb() as Peter suggests, and then run tests to find
the problems.

Or have you already done this?

							Thanx, Paul

> > If without SFB, everything is OK. But with SFB in step 2, a
> > READ_ONCE() loop is right after WRITE_ONCE(), which makes the flag
> > cached in SFB (so be invisible by other CPUs) for ever, then both CPU0
> > and CPU1 wait for ever.
> 
> Sure.. we all got that far. And no, this isn't the _real_ problem. This
> is a manifestation of the problem.
> 
> The problem is that your SFB is broken (per the Linux requirements). We
> require that stores will become visible. That is, they must not
> indefinitely (for whatever reason) stay in the store buffer.
> 
> > I don't think this is a hardware bug, in design, SFB will flushed to
> > L1 cache in three cases:
> 
> > 1, data in SFB is full (be a complete cache line);
> > 2, there is a subsequent read access in the same cache line;
> > 3, a 'sync' instruction is executed.
> 
> And I think this _is_ a hardware bug. You just designed the bug instead
> of it being by accident.
> 
> > In this case, there is no other memory access (read or write) between
> > WRITE_ONCE() and READ_ONCE() loop. So Case 1 and Case 2 will not
> > happen, and the only way to make the flag be visible is wbflush
> > (wbflush is sync in Loongson's case).
> >
> > I think this problem is not only happens on Loongson, but will happen
> > on other CPUs which have write buffer (unless the write buffer has a
> > 4th case to be flushed).
> 
> It doesn't happen an _any_ other architecture except that dodgy
> ARM11MPCore part. Linux hard relies on stores to become available
> _eventually_.
> 
> Still, even with the rules above, the best work-around is still the very
> same cpu_relax() hack.