linux-kernel - Re: frequent lockups in 3.18rc4

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALCETrUX8O9u7+cU_32R+36t=jeB2dvj3_fDGD+uvhTGYZATuA@mail.gmail.com>
Date:	Wed, 19 Nov 2014 14:59:01 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Dave Jones <davej@...hat.com>, Don Zickus <dzickus@...hat.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	"the arch/x86 maintainers" <x86@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: Re: frequent lockups in 3.18rc4

On Wed, Nov 19, 2014 at 2:56 PM, Frederic Weisbecker <fweisbec@...il.com> wrote:
> On Wed, Nov 19, 2014 at 10:56:26PM +0100, Thomas Gleixner wrote:
>> On Wed, 19 Nov 2014, Frederic Weisbecker wrote:
>> > I got a report lately involving context tracking. Not sure if it's
>> > the same here but the issue was that context tracking uses per cpu data
>> > and per cpu allocation use vmalloc and vmalloc'ed area can fault due to
>> > lazy paging.
>>
>> This is complete nonsense. pcpu allocations are populated right
>> away. Otherwise no single line of kernel code which uses dynamically
>> allocated per cpu storage would be safe.
>
> Note this isn't faulting because part of the allocation is swapped. No
> it's all reserved in the physical memory, but it's a lazy allocation.
> Part of it isn't yet addressed in the P[UGM?]D. That's what vmalloc_fault() is for.
>
> So it's a non-blocking/sleeping fault which is why it's probably fine
> most of the time except on code that isn't fault-safe. And I suspect that
> most people assume that kernel data won't fault so probably some other
> places have similar issues.
>
> That's a long standing issue. We even had to convert the perf callchain
> allocation to ad-hoc kmalloc() based per cpu allocation to get over vmalloc
> faults. At that time, NMIs couldn't handle faults and many callchains were
> populated in NMIs. We had serious crashes because of per cpu memory faults.

Is there seriously more than 512GB of per-cpu virtual space or
whatever's needed to exceed a single pgd on x86_64?

And there are definitely placed that access per-cpu data in contexts
in which a non-IST fault is not allowed.  Maybe not dynamic per-cpu
data, though.

--Andy

>
> I think that lazy adressing is there for allocation performance reasons. But
> still having faultable per cpu memory is insame IMHO.
>



-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/