Date:	Wed, 26 Jun 2013 21:53:37 -0300
From:	Marcelo Tosatti <mtosatti@...hat.com>
To:	Prarit Bhargava <prarit@...hat.com>
Cc:	Chegu Vinod <chegu_vinod@...com>, rusty@...tcorp.com.au,
	LKML <linux-kernel@...r.kernel.org>,
	Gleb Natapov <gleb@...hat.com>,
	Paolo Bonzini <pbonzini@...hat.com>, KVM <kvm@...r.kernel.org>
Subject: Re: kvm_intel: Could not allocate 42 bytes percpu data

On Mon, Jun 24, 2013 at 06:52:44PM -0400, Prarit Bhargava wrote:
> 
> 
> On 06/24/2013 03:01 PM, Chegu Vinod wrote:
> > 
> > Hello,
> > 
> > Lots (~700+) of the following messages are showing up in the dmesg of a
> > 3.10-rc1-based kernel (the host OS is running on a large socket-count box with HT on).
> > 
> > [   82.270682] PERCPU: allocation failed, size=42 align=16, alloc from reserved
> > chunk failed
> > [   82.272633] kvm_intel: Could not allocate 42 bytes percpu data
> 
> On 3.10?  Geez.  I thought we had fixed this.  I'll grab a big machine and see
> if I can debug.
> 
> Rusty -- any ideas off the top of your head?

As far as my limited understanding goes, the reserved space set up by
arch code for percpu allocations is limited and subject to exhaustion.

It would be best if the allocator could handle the allocation, but
otherwise, switching vmx.c to dynamic allocations for the percpu
regions is an option (see 013f6a5d3dd9e4).

It should be a similar exercise to convert these two larger data structures:

static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu);
static DEFINE_PER_CPU(struct desc_ptr, host_gdt);
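
For illustration, a minimal sketch of what such a conversion could look like,
following the pattern of the commit above. The __percpu pointer and the
vmx_percpu_init()/vmx_percpu_exit() helpers are made-up names for the sketch,
not the actual vmx.c change:

/* Before: statically reserved percpu space, carved out of the limited
 * reserved chunk when the module is loaded. */
static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu);

/* After (sketch): a dynamically allocated percpu region, obtained from
 * the regular percpu allocator at module init time. */
static struct list_head __percpu *loaded_vmcss_on_cpu;

static int __init vmx_percpu_init(void)
{
	int cpu;

	loaded_vmcss_on_cpu = alloc_percpu(struct list_head);
	if (!loaded_vmcss_on_cpu)
		return -ENOMEM;

	for_each_possible_cpu(cpu)
		INIT_LIST_HEAD(per_cpu_ptr(loaded_vmcss_on_cpu, cpu));

	return 0;
}

static void vmx_percpu_exit(void)
{
	free_percpu(loaded_vmcss_on_cpu);
}

Accesses then change from &per_cpu(loaded_vmcss_on_cpu, cpu) to
per_cpu_ptr(loaded_vmcss_on_cpu, cpu), and the allocation no longer
competes for the reserved chunk that is being exhausted here.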


> > 
> > ... also call traces like the following...
> > 
> > [  101.852136]  ffffc901ad5aa090 ffff88084675dd08 ffffffff81633743 ffff88084675ddc8
> > [  101.860889]  ffffffff81145053 ffffffff81f3fa78 ffff88084809dd40 ffff8907d1cfd2e8
> > [  101.869466]  ffff8907d1cfd280 ffff88087fffdb08 ffff88084675c010 ffff88084675dfd8
> > [  101.878190] Call Trace:
> > [  101.880953]  [<ffffffff81633743>] dump_stack+0x19/0x1e
> > [  101.886679]  [<ffffffff81145053>] pcpu_alloc+0x9a3/0xa40
> > [  101.892754]  [<ffffffff81145103>] __alloc_reserved_percpu+0x13/0x20
> > [  101.899733]  [<ffffffff810b2d7f>] load_module+0x35f/0x1a70
> > [  101.905835]  [<ffffffff8163ad6e>] ? do_page_fault+0xe/0x10
> > [  101.911953]  [<ffffffff810b467b>] SyS_init_module+0xfb/0x140
> > [  101.918287]  [<ffffffff8163f542>] system_call_fastpath+0x16/0x1b
> > [  101.924981] kvm_intel: Could not allocate 42 bytes percpu data
> > 
> > 
> > Wondering if anyone else has seen this with the recent [3.10]-based kernels,
> > especially on larger boxes?
> > 
> > There was a similar issue reported earlier (where modules were being loaded
> > per cpu without checking whether an instance was already loaded or being loaded).
> > That issue seems to have been addressed in the recent past (e.g.
> > https://lkml.org/lkml/2013/1/24/659 along with a couple of follow-on cleanups).
> > Is the above yet another variant of the original issue, or perhaps some race
> > condition that got exposed when there are a lot more threads?
> 
> Hmm ... not sure but yeah, that's the likely culprit.
> 
> P.

