linux-kernel - Re: frequent lockups in 3.18rc4

Message-ID: <CALCETrUu56pLObcDxrNXO5uN_KV5ffDW3EkpLgZL2mh4-aofSQ@mail.gmail.com>
Date:	Thu, 20 Nov 2014 15:08:03 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	Tejun Heo <tj@...nel.org>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Dave Jones <davej@...hat.com>, Don Zickus <dzickus@...hat.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	"the arch/x86 maintainers" <x86@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: Re: frequent lockups in 3.18rc4

On Thu, Nov 20, 2014 at 3:05 PM, Tejun Heo <tj@...nel.org> wrote:
> Hello,
>
> On Thu, Nov 20, 2014 at 11:42:42PM +0100, Thomas Gleixner wrote:
>> On Thu, 20 Nov 2014, Tejun Heo wrote:
>> > On Thu, Nov 20, 2014 at 10:58:26PM +0100, Thomas Gleixner wrote:
>> > > It's completely undocumented behaviour, whether it has been that way
>> > > for ever or not. And I agree with Frederic, that it is insane. Actually
>> > > it's beyond insane, really.
>> >
>> > This is exactly the same for any address in the vmalloc space.
>>
>> I know, but I really was not aware of the fact that dynamically
>> allocated percpu stuff is vmalloc based and therefore exposed to the
>> same issues.
>>
>> The normal vmalloc space simply does not have the problems which are
>> generated by percpu allocations which have no documented access
>> restrictions.
>>
>> You created a special case and that special case is clever but not
>> very well thought out considering the use cases of percpu variables
>> and the completely undocumented limitations you introduced silently.
>>
>> Just admit it and don't try to educate me about trivial vmalloc
>> properties.
>
> Why are you always so overly dramatic?  How is this productive?  Sure,
> this could have been better but I missed it at the beginning and this
> is the first time I hear about this issue.  Shit happens and we fix
> it.
>
>> > That isn't enough tho.  What if the percpu allocated pointer gets
>> > passed to another CPU without task switching?  You'd at least need to
>> > send IPIs to all CPUs so that all the active PGDs get updated
>> > synchronously.
>>
>> You obviously did not even take the time to carefully read what I
>> wrote:
>>
>>    "Now after that increment the allocation side needs to wait for a
>>     scheduling cycle on all cpus (we have mechanisms for that)"
>>
>> That's exactly stating what you claim to be 'not enough'.
>
> Missed that.  Sorry.
>
>> > For the time being, we can make percpu accessors complain when
>> > called from nmi handlers so that the problematic ones can be easily
>> > identified.
>>
>> You should have done that in the very first place instead of letting
>> other people run into issues which you should have thought of from the
>> very beginning.
>
> Sure, it would have been better if I noticed that from the get-go, but
> I couldn't think of the NMI case at the time and neither did anybody who
> reviewed the code.  It'd be awesome if we could have avoided it but it
> didn't go that way, so let's fix it.  Can we please stay technical?
>
> So, for now, all we need is to add an NMI check in percpu accessors,
> right?
>

What's the issue with nmi?  Page faults are supposed to nest correctly
inside nmi, right?

--Andy