Message-ID: <alpine.DEB.2.11.1603172146420.3978@nanos>
Date: Thu, 17 Mar 2016 22:01:38 +0100 (CET)
From: Thomas Gleixner <tglx@...utronix.de>
To: Josh Boyer <jwboyer@...oraproject.org>
cc: "Richard W.M. Jones" <rjones@...hat.com>, x86 <x86@...nel.org>,
"Linux-Kernel@...r. Kernel. Org" <linux-kernel@...r.kernel.org>
Subject: Re: Oops from calibrate_delay_is_known on qemu machine with Linux
v4.5-1523-g271ecc5253e2
Josh,
On Thu, 17 Mar 2016, Josh Boyer wrote:
> We've had a report [1] of the mainline kernel crashing on a single-cpu
> QEMU machine (not kvm) in Fedora. It looks as if the emulated machine
> is failing to provide a TSC and the calibrate_delay_is_known function
> is passing NULL to cpumask_any_but for the mask parameter. At least
> that's all I've been able to discern thus far.
>
> I was wondering if you had any insight into this issue, given your
> recent commit to change calibrate_delay_is_known to use
> topology_core_cpumask. The backtrace is below.
> at (null)
> [ 0.010000] IP: [<ffffffff814698b5>] _find_next_bit.part.0+0x15/0x70
> [ 0.010000] PGD 0
>
> [ 0.010000] RSP: 0000:ffffffff81e03e40 EFLAGS: 00000246
> [ 0.010000] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ 0.010000] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
> [ 0.010000] RBP: ffffffff81e03e50 R08: ffffffffffffffff R09: 0000000000000000
> [ 0.010000] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [ 0.010000] R13: ffffffff82248960 R14: ffffffff822562e0 R15: 0000000000000000
> [ 0.010000] FS: 0000000000000000(0000) GS:ffff88001ee00000(0000) knlGS:0000000000000000
> [ 0.010000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.010000] CR2: 0000000000000000 CR3: 0000000001e06000 CR4: 00000000000006b0
> [ 0.010000] Stack:
> [ 0.010000] ffffffff81e03e50 ffffffff81469928 ffffffff81e03e70 ffffffff81453d56
> [ 0.010000] 0000000000000000 ffff88001f3fa780 ffffffff81e03e80 ffffffff81040495
> [ 0.010000] ffffffff81e03f40 ffffffff8100285a ffffffff810eefb3 ffffffff00000000
> [ 0.010000] Call Trace:
> [ 0.010000] [<ffffffff81469928>] ? find_next_bit+0x18/0x20
> [ 0.010000] [<ffffffff81453d56>] cpumask_any_but+0x26/0x50
Yuck. That requires that topology_core_cpumask(cpu) is NULL.
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
...
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
So that can only result in a NULL pointer if you have CONFIG_CPUMASK_OFFSTACK
enabled and the allocation fails, which is not checked !?@!
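To illustrate the failure mode, here is a minimal user-space sketch, not kernel
code. It assumes only that the mask is a plain pointer, as cpumask_var_t is with
CONFIG_CPUMASK_OFFSTACK=y. All model_* names and MODEL_NR_CPUS are made up for
the illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4	/* made-up value for this sketch */

/* Stand-in for struct cpumask; one word is enough for four CPUs. */
struct model_cpumask {
	unsigned long bits;
};

/* With OFFSTACK the "var" type is a pointer that must be allocated. */
typedef struct model_cpumask *model_cpumask_var_t;

/* Stand-in for the per-cpu cpu_core_map; starts out as all NULL. */
static model_cpumask_var_t model_core_map[MODEL_NR_CPUS];

/*
 * Allocate the mask for one CPU and report failure to the caller.
 * simulate_failure forces the out-of-memory path for the demonstration.
 * Returns 0 on success, -1 if the mask could not be allocated.
 */
static int model_alloc_core_map(int cpu, int simulate_failure)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	free(model_core_map[cpu]);
	model_core_map[cpu] = simulate_failure ?
		NULL : calloc(1, sizeof(struct model_cpumask));
	return model_core_map[cpu] ? 0 : -1;
}

/*
 * Return the first CPU set in mask other than cpu, or MODEL_NR_CPUS if
 * there is none. The NULL check is the guard this sketch adds; without
 * it, a failed allocation turns into a NULL dereference in the bit scan.
 */
static int model_any_but(const struct model_cpumask *mask, int cpu)
{
	int i;

	if (!mask)
		return MODEL_NR_CPUS;
	for (i = 0; i < MODEL_NR_CPUS; i++)
		if (i != cpu && (mask->bits & (1UL << i)))
			return i;
	return MODEL_NR_CPUS;
}
```

In this model an unchecked allocation failure leaves model_core_map[cpu] NULL,
and the scan would fault on the first access unless either the allocation
result or the pointer is checked.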
I tried to reproduce with Richard's script, but so far no dice. Can you please
provide your kernel config?
Thanks,
tglx