[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b8ec393f-5482-4b00-f9a4-91c50cb3b857@acm.org>
Date: Thu, 9 Nov 2017 12:54:06 -0600
From: Corey Minyard <minyard@....org>
To: Andrew Banman <abanman@....com>
Cc: Dimitri Sivanich <sivanich@....com>,
"Anderson, Russ" <russ.anderson@....com>,
Mike Travis <mike.travis@....com>,
openipmi-developer@...ts.sourceforge.net,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drivers/char/ipmi_si: prevent null deref during module
exit
On 11/09/2017 12:38 PM, Corey Minyard wrote:
> On 11/09/2017 12:02 PM, Andrew Banman wrote:
>> On 11/8/17 2:00 PM, Corey Minyard wrote:
>>> On 11/08/2017 11:11 AM, Andrew Banman wrote:
>>>> On 11/8/17 11:06 AM, Andrew Banman wrote:
>>>>> If there are uninitialized SMIs in the smi_infos list, i.e. with no
>>>>> handlers set, then disable_si_irq() in cleanup_smi_one() will hit
>>>>> a null
>>>>> pointer dereference when the former attempts to start the check
>>>>> enables
>>>>> transaction. Thus, we panic during module exit.
>>>> I think this points to a broader problem of holding uninitialized
>>>> smi_info
>>>> structs in smi_infos list. There are many places where handlers and
>>>> other struct
>>>> members are assumed. Maybe a better design would be to remove SMIs
>>>> from the list
>>>> if we have no intention of initializing them?
>>>>
>>> This begs the question: How did you produce this? From what I can
>>> tell,
>>> there is no way you can get to this code if you don't have a working
>>> and
>>> initialized smi_info structure, and that's not the only place this
>>> would
>>> have to be fixed if it wasn't. So it's not what you assume, I don't
>>> think
>>> it's an uninitialized smi_info structure on the list.
>> My kernel is missing 0944d889a2, wherein the dmi handling moved to a
>> platform
>> device. Without this commit the driver discovers both ACPI and SMBIOS
>> records
>> but only initializes the former; this is visible in the tracebacks
>> below.
>>
>> Is there a case for pushing a similar patch to stable releases
>> predating 0944d889a2
>> to prevent this crash?
>
> It looks like you are using a Redhat 3.10 kernel. Is that right?
Actually, never mind, I misread something. The following change should
fix your issue:
index b89078b..d6d9258 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -1805,8 +1805,10 @@ static struct smi_info *smi_info_alloc(void)
{
struct smi_info *info = kzalloc(sizeof(*info), GFP_KERNEL);
- if (info)
+ if (info) {
spin_lock_init(&info->si_lock);
+ info->interrupt_disabled = true;
+ }
return info;
}
@@ -3603,7 +3605,6 @@ static int try_smi_init(struct smi_info *new_smi)
for (i = 0; i < SI_NUM_STATS; i++)
atomic_set(&new_smi->stats[i], 0);
- new_smi->interrupt_disabled = true;
atomic_set(&new_smi->need_watch, 0);
rv = try_enable_event_buffer(new_smi);
>
> -corey
>
>> Thank you for your time,
>>
>> Andrew
>>
>>> As usual with these sorts of things, please send tracebacks and
>>> reproduction
>>> procedures.
>> # dmesg | grep -i ipmi
>> [ 14.883554] ipmi message handler version 39.2
>> [ 14.905543] ipmi device interface
>> [ 14.931149] ipmi_si IPI0001:00: ipmi_si: probing via ACPI
>> [ 14.937254] ipmi_si IPI0001:00: [io 0x0ce4-0x0ce6] regsize 1
>> spacing 1 irq 6
>> [ 14.946767] ipmi_si: Adding ACPI-specified bt state machine
>> [ 14.954621] IPMI System Interface driver.
>> [ 14.975257] ipmi_si: probing via SMBIOS
>> [ 14.980446] ipmi_si: SMBIOS: mem 0xce4 regsize 1 spacing 1 irq 6
>> [ 14.987163] ipmi_si: Adding SMBIOS-specified bt state machine
>> [ 14.996706] ipmi_si: probing via SPMI
>> [ 15.002351] ipmi_si: SPMI: io 0xce4 regsize 1 spacing 1 irq 6
>> [ 15.010320] ipmi_si: Adding SPMI-specified bt state machine
>> duplicate interface
>> [ 15.020056] ipmi_si: Trying ACPI-specified bt state machine at i/o
>> address 0xce4, slave address 0x0, irq 6
>> [ 15.066306] ipmi_si IPI0001:00: Using irq 6
>> [ 15.077855] IPMI BT: req2rsp=2 secs retries=1
>> [ 15.130045] ipmi_si IPI0001:00: Found new BMC (man_id: 0x000000,
>> prod_id: 0x0101, dev_id: 0x20)
>> [ 15.140472] ipmi_si IPI0001:00: IPMI bt interface initialized
>> # rmmod ipmi_si
>> [ 85.492578] BUG: unable to handle kernel NULL pointer dereference
>> at 0000000000000010
>> [ 85.501346] IP: [<ffffffffc027840e>] start_check_enables+0x3e/0x80
>> [ipmi_si]
>> [ 85.509233] PGD 7fb3875067 PUD 7fbc222067 PMD 0
>> [ 85.514421] Oops: 0000 [#1] SMP
>> [ 85.518056] Modules linked in: xt_CHECKSUM iptable_mangle
>> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
>> nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT
>> nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables
>> ip6table_filter ip6_tables iptable_filter dm_mirror dm_region_hash
>> dm_log dm_mod iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp
>> vfat intel_rapl fat iosf_mbi kvm_intel kvm irqbypass crc32_pclmul
>> ghash_clmulni_intel aesni_intel lrw i40e gf128mul glue_helper
>> ablk_helper cryptd ptp pcspkr pps_core joydev sg i2c_i801 shpchp
>> lpc_ich ipmi_si(-) wmi ipmi_devintf ipmi_msghandler nfit libnvdimm
>> acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd numatools(OE) grace
>> hwperf(OE) binfmt_misc sunrpc ip_tables sr_mod cdrom xfs libcrc32c
>> sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit
>> drm_kms_helper syscopyarea sysfillrect crct10dif_pclmul
>> crct10dif_common sysimgblt crc32c_intel fb_sys_fops ahci ttm libahci
>> uas drm libata usb_storage i2c_core
>> [ 85.616799] CPU: 241 PID: 5743 Comm: rmmod Tainted: G
>> OE ------------ 3.10.0-693.el7.x86_64 #1
>> [ 85.627674] Hardware name: HPE Superdome Flex/Superdome Flex, BIOS
>> IP147.006.000.137.000.1711022001 11/02/2017
>> [ 85.638843] task: ffff887fb8ac6eb0 ti: ffff887ffd30c000 task.ti:
>> ffff887ffd30c000
>> [ 85.647200] RIP: 0010:[<ffffffffc027840e>] [<ffffffffc027840e>]
>> start_check_enables+0x3e/0x80 [ipmi_si]
>> [ 85.657794] RSP: 0018:ffff887ffd30fea8 EFLAGS: 00010246
>> [ 85.663723] RAX: 0000000000000000 RBX: ffff88016930b000 RCX:
>> 0000000000000006
>> [ 85.671688] RDX: 0000000000000002 RSI: ffff887ffd30feae RDI:
>> 0000000000000000
>> [ 85.679655] RBP: ffff887ffd30fec0 R08: ffff88016930b1a0 R09:
>> 00000001820001fb
>> [ 85.687623] R10: 00000000beb72001 R11: ffffea013efadc80 R12:
>> ffffffffc02822c0
>> [ 85.695589] R13: 0000000000000800 R14: 0000000001f222b0 R15:
>> 0000000001f22010
>> [ 85.703555] FS: 00007fb9a5c10740(0000) GS:ffff887fbea40000(0000)
>> knlGS:0000000000000000
>> [ 85.712587] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 85.719002] CR2: 0000000000000010 CR3: 0000007ffd2a5000 CR4:
>> 00000000003407e0
>> [ 85.726968] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [ 85.734936] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>> 0000000000000400
>> [ 85.742904] Stack:
>> [ 85.745150] 2f18887ffd30fec0 000000007706aaa0 ffff88016930b000
>> ffff887ffd30fed8
>> [ 85.753456] ffffffffc027ae19 ffffffffc0281ff0 ffff887ffd30fef0
>> ffffffffc027b9cb
>> [ 85.761762] fffffffffffffff5 ffff887ffd30ff78 ffffffff810fe55b
>> 0000000000000000
>> [ 85.770063] Call Trace:
>> [ 85.772797] [<ffffffffc027ae19>] cleanup_one_si+0x149/0x180
>> [ipmi_si]
>> [ 85.780090] [<ffffffffc027b9cb>] cleanup_ipmi_si+0x6b/0xc0 [ipmi_si]
>> [ 85.787293] [<ffffffff810fe55b>] SyS_delete_module+0x19b/0x300
>> [ 85.793913] [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
>> [ 85.800621] Code: ee 18 c6 45 ef 2f 65 48 8b 04 25 28 00 00 00 48
>> 89 45 f0 31 c0 40 84 f6 75 33 48 8b 47 18 ba 02 00 00 00 48 8b 7f 10
>> 48 8d 75 ee <ff> 50 10 48 8b 45 f0 65 48 33 04 25 28 00 00 00 c7 43
>> 38 05 00
>> [ 85.822638] RIP [<ffffffffc027840e>]
>> start_check_enables+0x3e/0x80 [ipmi_si]
>> [ 85.830611] RSP <ffff887ffd30fea8>
>> [ 85.834504] CR2: 0000000000000010
>>
>>
>> And with debug printks printing out some contents of the struct at
>> the start of
>> module exit:
>>
>> # rmmod ipmi_si; dmesg -c
>> [ 1083.527776] smi_info ffff880fbb38f600
>> [ 1083.527779] intf_num 0
>> [ 1083.527779] intf ffff880fbb800000
>> [ 1083.527780] *si_sm ffff880fbb144400
>> [ 1083.527780] *handlers ffffffffa051bac0
>> [ 1083.527780] si_type 2
>>
>> [ 1083.527782] smi_info ffff880fbb38f200
>> [ 1083.527783] intf_num 0
>> [ 1083.527783] intf (null)
>> [ 1083.527784] *si_sm (null)
>> [ 1083.527785] *handlers (null)
>> [ 1083.527786] si_type 2
>>
>> @@ -3961,6 +3985,15 @@ static void cleanup_ipmi_si(void)
>> if (!initialized)
>> return;
>> + list_for_each_entry_safe(e, tmp_e, &smi_infos, link) {
>> + pr_crit("%20s %p\n", "smi_info", e);
>> + pr_crit("%20s %d\n", "intf_num", e->intf_num);
>> + pr_crit("%20s %p\n", "intf", e->intf);
>> + pr_crit("%20s %p\n", "*si_sm", e->si_sm);
>> + pr_crit("%20s %p\n", "*handlers", e->handlers);
>> + pr_crit("%20s %d\n\n", "si_type", e->si_type);
>> + }
>> +
>> #ifdef CONFIG_PCI
>> if (pci_registered)
>> pci_unregister_driver(&ipmi_pci_driver);
>>> If you are removing the module and this happens, there may be a race
>>> conditions, but this is the wrong fix. More likely that the
>>> structure gets
>>> cleaned up and this function is called afterwards.
>>>
>>> -corey
>>>
>>>> Andrew
>>>>
>>>>> Avoid panicking when there are uninitialized SMIs by checking for
>>>>> a handler
>>>>> pointer before starting the check enables transaction.
>>>>>
>>>>> Signed-off-by: Andrew Banman <abanman@....com>
>>>>> ---
>>>>> drivers/char/ipmi/ipmi_si_intf.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/char/ipmi/ipmi_si_intf.c
>>>>> b/drivers/char/ipmi/ipmi_si_intf.c
>>>>> index cb5719e..6c0b1b3 100644
>>>>> --- a/drivers/char/ipmi/ipmi_si_intf.c
>>>>> +++ b/drivers/char/ipmi/ipmi_si_intf.c
>>>>> @@ -442,7 +442,7 @@ static void start_check_enables(struct smi_info
>>>>> *smi_info, bool start_timer)
>>>>>
>>>>> if (start_timer)
>>>>> start_new_msg(smi_info, msg, 2);
>>>>> - else
>>>>> + else if (smi_info->handlers)
>>>>> smi_info->handlers->start_transaction(smi_info->si_sm, msg, 2);
>>>>> smi_info->si_state = SI_CHECKING_ENABLES;
>>>>> }
>
>
>
Powered by blists - more mailing lists