Message-ID: <1335799690.2351.100.camel@groeck-laptop>
Date: Mon, 30 Apr 2012 08:28:10 -0700
From: Guenter Roeck <guenter.roeck@...csson.com>
To: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
CC: Fenghua Yu <fenghua.yu@...el.com>, Andi Kleen <ak@...ux.intel.com>,
Jean Delvare <khali@...ux-fr.org>,
"lm-sensors@...sensors.org" <lm-sensors@...sensors.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] hwmon: coretemp: fix oops on cpu unplug
On Mon, 2012-04-30 at 09:18 -0400, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
>
> coretemp tries to access core_data array beyond bounds on cpu unplug if
> core id of the cpu is more than NUM_REAL_CORES-1.
>
> BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
> IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
> PGD 673e5a067 PUD 66e9b3067 PMD 0
> Oops: 0000 [#1] SMP
> CPU 79
> Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]
>
> Pid: 3315, comm: set-cpus Tainted: G W 3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
> RIP: 0010:[<ffffffffa00159af>] [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
> RSP: 0018:ffff880472fb3d48 EFLAGS: 00010246
> RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
> RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
> RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
> R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
> R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
> FS: 00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
> Stack:
> ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
> 0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
> ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
> Call Trace:
> [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
> [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
> [<ffffffff81059d30>] __cpu_notify+0x20/0x40
> [<ffffffff815fa251>] _cpu_down+0x81/0x270
> [<ffffffff815fa477>] cpu_down+0x37/0x50
> [<ffffffff815fd6a3>] store_online+0x63/0xc0
> [<ffffffff813c7078>] dev_attr_store+0x18/0x30
> [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
> [<ffffffff81180443>] vfs_write+0xb3/0x180
> [<ffffffff8118076a>] sys_write+0x4a/0x90
> [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
> Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
> RIP [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
> RSP <ffff880472fb3d48>
> CR2: 000000000000013c
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> ---
> drivers/hwmon/coretemp.c | 4 ++++
> 1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 0d3141f..54a70fe 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -709,6 +709,10 @@ static void __cpuinit put_core_offline(unsigned int cpu)
>
> 	indx = TO_ATTR_NO(cpu);
>
> +	/* The core id is too big, just return */
> +	if (indx > MAX_CORE_DATA - 1)
> +		return;
> +
> 	if (pdata->core_data[indx] && pdata->core_data[indx]->cpu == cpu)
> 		coretemp_remove_core(pdata, &pdev->dev, indx);
>
Hi,

Good catch. A couple of problems, though.

First, how many cores are we talking about? We should probably
increase NUM_REAL_CORES as well. Long term, we should get rid of the
dependency on a fixed NUM_REAL_CORES to prevent this problem from
happening again, but that is a different issue.

Second, we'll need the same check in get_core_online(). Otherwise the
platform device can be created for the new core (if the core is
re-enabled) but will never be deleted.
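
To illustrate the second point, here is a rough userspace model of a
symmetric bounds check (untested against the driver, not a patch). The
constant values, struct model_pdata, attr_no_valid() and the
model_core_*() functions are all made up for illustration; only
TO_ATTR_NO, MAX_CORE_DATA and NUM_REAL_CORES are names from the thread,
and their definitions here are assumptions.

```c
/*
 * Userspace model only, not coretemp.c. The constant values and the
 * data layout below are assumptions for illustration.
 */
#include <assert.h>
#include <stddef.h>

#define BASE_SYSFS_ATTR_NO	2	/* assumed offset from core id to index */
#define NUM_REAL_CORES		16	/* assumed limit */
#define MAX_CORE_DATA		(NUM_REAL_CORES + BASE_SYSFS_ATTR_NO)
#define TO_ATTR_NO(core_id)	((core_id) + BASE_SYSFS_ATTR_NO)

/* Hypothetical stand-in for the per-package core_data[] array. */
struct model_pdata {
	int present[MAX_CORE_DATA];
};

/* Hypothetical helper: a single bounds check shared by both paths. */
static int attr_no_valid(unsigned int indx)
{
	return indx < MAX_CORE_DATA;
}

/* Models the online path: 0 on success, -1 if the core id is too big. */
static int model_core_online(struct model_pdata *p, unsigned int core_id)
{
	unsigned int indx = TO_ATTR_NO(core_id);

	assert(p != NULL);
	if (!attr_no_valid(indx))
		return -1;	/* refuse to create what we could not remove */
	p->present[indx] = 1;
	return 0;
}

/* Models the offline path with the same check, so the two stay in sync. */
static int model_core_offline(struct model_pdata *p, unsigned int core_id)
{
	unsigned int indx = TO_ATTR_NO(core_id);

	assert(p != NULL);
	if (!attr_no_valid(indx))
		return -1;	/* never index past the array */
	p->present[indx] = 0;
	return 0;
}
```

With these assumed values, a core id of 17 maps to index 19, which both
paths reject, while a core id of 3 maps to index 5 and is handled
normally. Sharing one helper keeps the add and remove paths from
drifting apart.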
Thanks,
Guenter