Message-Id: <8D582966-08B6-46F2-B12A-BC33F7EF0EB6@amacapital.net>
Date: Fri, 8 Sep 2017 18:39:28 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andy Lutomirski <luto@...nel.org>, Borislav Petkov <bp@...en8.de>,
Markus Trippelsdorf <markus@...ppelsdorf.de>,
Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
Tom Lendacky <thomas.lendacky@....com>
Subject: Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf
> On Sep 8, 2017, at 6:05 PM, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>
>> On Fri, Sep 8, 2017 at 5:00 PM, Andy Lutomirski <luto@...nel.org> wrote:
>>
>> I'm not convinced. The SDM says (Vol 3, 11.3, under WC):
>>
>> If the WC buffer is partially filled, the writes may be delayed until
>> the next occurrence of a serializing event; such as, an SFENCE or
>> MFENCE instruction, CPUID execution, a read or write to uncached
>> memory, an interrupt occurrence, or a LOCK instruction execution.
>>
>> Thanks, Intel, for defining "serializing event" differently here than
>> anywhere else in the whole manual.
>
> Yeah, it's really badly defined. Ok, maybe a locked instruction does
> actually wait for it.. It should be invisible to anything, regardless.
>
>> 1. The kernel wants to reclaim a page of normal memory, so it unmaps
>> it and flushes. Another CPU has an entry for that page in its WC
>> buffer. I don't think we care whether the flush causes the WC write
>> to really hit RAM because it's unobservable -- we just need to make
>> sure it is ordered, as seen by software, before the flush operation
>> completes. From the quote above, I think we're okay here.
>
> Agreed.
>
>> 2. The kernel is unmapping some IO memory (e.g. a GPU command buffer).
>> It wants a guarantee that, when flush_tlb_mm_range returns, all CPUs
>> are really done writing to it. Here I'm less convinced. The SDM
>> quote certainly suggests to me that we have a promise that the WC
>> write has *started* before flush_tlb_mm_range returns, but I'm not
>> sure I believe that it's guaranteed to have retired.
>
> If others have writable TLB entries, what keeps them from just
> continuing to write for a long time afterwards?
Whoever unmaps the resource by kicking out their drm fd? I admit I'm
just trying to think of the worst case.
>
>> I'd prefer to leave it as is except on the buggy AMD CPUs, though,
>> since the current code is nice and fast.
>
> So is there a patch to detect the 383 erratum and serialize for those?
> I may have missed that part.
>
The patch is in my head. It's imaginarily attached to this email.
> Linus