[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bfd99f3e-68db-d5bc-678b-b998ac447e72@caviumnetworks.com>
Date: Tue, 28 Aug 2018 20:02:40 +0530
From: George Cherian <gcherian@...iumnetworks.com>
To: minyard@....org, George Cherian <george.cherian@...ium.com>,
linux-kernel@...r.kernel.org,
openipmi-developer@...ts.sourceforge.net
Cc: arnd@...db.de, gregkh@...uxfoundation.org
Subject: Re: [PATCH 1/2] ipmi_ssif: Unregister i2c device only if created by
ssif
Hi Corey,
On 08/28/2018 04:59 AM, Corey Minyard wrote:
>
> On 08/27/2018 01:07 AM, George Cherian wrote:
>>
>> Hi Corey,
>>
>> On 08/24/2018 06:37 PM, Corey Minyard wrote:
>>>
>>> On 08/24/2018 06:10 AM, George Cherian wrote:
>>>> In ssif_probe error path the i2c client is left hanging, so that
>>>> ssif_platform_remove will remove the client. But it is quite
>>>> possible that ssif would never call an i2c_new_device.
>>>> This condition would lead to kernel crash as below.
>>>> To fix this leave only the client ssif registered hanging in error
>>>> path. All other non-registered clients are set to NULL.
>>>
>>> I'm having a hard time seeing how this could happen.
>>>
>>> The i2c_new_device() call is only done in the case of dmi_ipmi_probe
>>> (called from
>>> ssif_platform_probe) or a hard-coded entry. How does
>>> ssif_platform_remove get
>>> called on a device that was not registered with ssif_platform_probe?
>>>
>>
>> Initially I also had the same doubt but then,
>> ssif_adapter_hanlder is called for each i2c_dev only after initialized
>> is true. So we end up not calling i2c_new_device for devices probed
>> during the module_init time.
>>
>
> I spent some time looking at this, and I don't think that's what is
> happening.
>
> I think that i2c_del_driver() in cleanup_ipmi_ssif() is causing
> i2c_unregister_device() to be called on all the devices, and
> platform_driver_unregister() causes it to be called on the
> devices that came in through the platform method. It's
> a double-free.
>
> Try reversing the order of i2c_del_driver() and
> platform_driver_unregister()
> in cleanup_ipmi_ssif() to test this.
>
Reversing the call order didn't help, I am still getting the trace.
You are partly correct on the double free scenario. I dont see double
free in normal operation. I see a double free only in probe failure case.
I have added prints in i2c_unregister_device() to print the client.
pr_err("client = %px\n", client);
In normal case, I get 2 calls to i2c_unregister_device()
Call 1: i2c-core: client = ffff800ada315400 => called from
i2c_del_driver()
This in turn calls ssif_remove, where we set addr_info->client to NULL.
Call 2: i2c-core: client = 0000000000000000 => called from
ssif_platform_remove()
This is fine since i2c_unregister_device is NULL aware.
This works fine without crashing .
Now in the probe failing case, I get 2 calls to i2c_unregister_device()
Call 1: i2c-core: client = ffff800ad9897400 => called from
i2c_del_driver()
This never calls ssif_remove, since the probe failed.
Call 2: i2c-core: client = ffff800ad9897400 => called from
ssif_platform_remove()
This is a case of double free.
Do you think the proposed patch itself is the solution or
Is it that we should really leave addr_info->client hanging in probe
error path at all?
NB: For easy simulation of the ssif_probe failing case I just replaced
the
ssif_info->thread = kthread_run(....) with
ssif_info->thread = ERR_PTR(-4); so that the probe takes the goto out path.
-George
> -corey
>
>> ssif_platform_remove() get called during removal of ipmi_ssif.
>> In case during ssif_probe() if there is a failure before
>> ipmi_smi_register then we leave the addr_info->client hanging.
>>
>> In case of normal functioning without error, we set addr_info->client
>> to NULL after ipmi_unregiter_smi in ssif_remove.
>>
>>> Small style comment inline...
>> I will make the changess and sent out a v2!!
>>
>> Thanks,
>> -George
>>>
>>>> CPU: 107 PID: 30266 Comm: rmmod Not tainted 4.18.0+ #80
>>>> Hardware name: Cavium Inc. Saber/Saber, BIOS Cavium reference
>>>> firmware version 7.0 08/04/2018
>>>> pstate: 60400009 (nZCv daif +PAN -UAO)
>>>> pc : kernfs_find_ns+0x28/0x120
>>>> lr : kernfs_find_and_get_ns+0x40/0x60
>>>> sp : ffff00002310fb50
>>>> x29: ffff00002310fb50 x28: ffff800a8240f800
>>>> x27: 0000000000000000 x26: 0000000000000000
>>>> x25: 0000000056000000 x24: ffff000009073000
>>>> x23: ffff000008998b38 x22: 0000000000000000
>>>> x21: ffff800ed86de820 x20: 0000000000000000
>>>> x19: ffff00000913a1d8 x18: 0000000000000000
>>>> x17: 0000000000000000 x16: 0000000000000000
>>>> x15: 0000000000000000 x14: 5300737265766972
>>>> x13: 643d4d4554535953 x12: 0000000000000030
>>>> x11: 0000000000000030 x10: 0101010101010101
>>>> x9 : ffff800ea06cc3f9 x8 : 0000000000000000
>>>> x7 : 0000000000000141 x6 : ffff000009073000
>>>> x5 : ffff800adb706b00 x4 : 0000000000000000
>>>> x3 : 00000000ffffffff x2 : 0000000000000000
>>>> x1 : ffff000008998b38 x0 : ffff000008356760
>>>> Process rmmod (pid: 30266, stack limit = 0x00000000e218418d)
>>>> Call trace:
>>>> kernfs_find_ns+0x28/0x120
>>>> kernfs_find_and_get_ns+0x40/0x60
>>>> sysfs_unmerge_group+0x2c/0x6c
>>>> dpm_sysfs_remove+0x34/0x70
>>>> device_del+0x58/0x30c
>>>> device_unregister+0x30/0x7c
>>>> i2c_unregister_device+0x84/0x90 [i2c_core]
>>>> ssif_platform_remove+0x38/0x98 [ipmi_ssif]
>>>> platform_drv_remove+0x2c/0x6c
>>>> device_release_driver_internal+0x168/0x1f8
>>>> driver_detach+0x50/0xbc
>>>> bus_remove_driver+0x74/0xe8
>>>> driver_unregister+0x34/0x5c
>>>> platform_driver_unregister+0x20/0x2c
>>>> cleanup_ipmi_ssif+0x50/0xd82c [ipmi_ssif]
>>>> __arm64_sys_delete_module+0x1b4/0x220
>>>> el0_svc_handler+0x104/0x160
>>>> el0_svc+0x8/0xc
>>>> Code: aa1e03e0 aa0203f6 aa0103f7 d503201f (7940e280)
>>>> ---[ end trace 09f0e34cce8e2d8c ]---
>>>> Kernel panic - not syncing: Fatal exception
>>>> SMP: stopping secondary CPUs
>>>> Kernel Offset: disabled
>>>> CPU features: 0x23800c38
>>>>
>>>> Signed-off-by: George Cherian <george.cherian@...ium.com>
>>>> ---
>>>> drivers/char/ipmi/ipmi_ssif.c | 8 +++++++-
>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/char/ipmi/ipmi_ssif.c
>>>> b/drivers/char/ipmi/ipmi_ssif.c
>>>> index 18e4650..ccdf6b1 100644
>>>> --- a/drivers/char/ipmi/ipmi_ssif.c
>>>> +++ b/drivers/char/ipmi/ipmi_ssif.c
>>>> @@ -181,6 +181,7 @@ struct ssif_addr_info {
>>>> struct device *dev;
>>>> struct i2c_client *client;
>>>>
>>>> + bool client_registered;
>>>> struct mutex clients_mutex;
>>>> struct list_head clients;
>>>>
>>>> @@ -1658,6 +1659,8 @@ static int ssif_probe(struct i2c_client
>>>> *client, const struct i2c_device_id *id)
>>>> * the client like it should.
>>>> */
>>>> dev_err(&client->dev, "Unable to start IPMI SSIF:
>>>> %d\n", rv);
>>>> + if (!addr_info->client_registered)
>>>> + addr_info->client = NULL;
>>>> kfree(ssif_info);
>>>> }
>>>> kfree(resp);
>>>> @@ -1672,11 +1675,14 @@ static int ssif_probe(struct i2c_client
>>>> *client, const struct i2c_device_id *id)
>>>> static int ssif_adapter_handler(struct device *adev, void *opaque)
>>>> {
>>>> struct ssif_addr_info *addr_info = opaque;
>>>> + struct i2c_client *client;
>>>>
>>>> if (adev->type != &i2c_adapter_type)
>>>> return 0;
>>>>
>>>> - i2c_new_device(to_i2c_adapter(adev), &addr_info->binfo);
>>>> + client = i2c_new_device(to_i2c_adapter(adev), &addr_info->binfo);
>>>> + if (client)
>>>> + addr_info->client_registered = true;
>>>>
>>>
>>> How about..
>>> if (i2c_new_device(to_i2c_adapter(adev), &addr_info->binfo))
>>> addr_info->client_registered = true;
>>>
>>> No need for the client variable.
>>>
>>> -corey
>>>
>>>> if (!addr_info->adapter_name)
>>>> return 1; /* Only try the first I2C adapter by
>>>> default. */
>>>
>>>
>
Powered by blists - more mailing lists