[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cfef4358-aa4c-6d21-1838-5ae2695fab87@huawei.com>
Date: Thu, 31 Oct 2019 18:35:33 +0000
From: John Garry <john.garry@...wei.com>
To: Jens Axboe <axboe@...nel.dk>
CC: <linux-ide@...r.kernel.org>, <linuxarm@...wei.com>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] libata: Ensure ata_port probe has completed before detach
On 16/10/2019 20:09, Jens Axboe wrote:
> On 10/16/19 4:19 AM, John Garry wrote:
>> With CONFIG_DEBUG_TEST_DRIVER_REMOVE set, we may find the following WARN:
>>
>> [ 23.452574] ------------[ cut here ]------------
>> [ 23.457190] WARNING: CPU: 59 PID: 1 at drivers/ata/libata-core.c:6676 ata_host_detach+0x15c/0x168
>> [ 23.466047] Modules linked in:
>> [ 23.469092] CPU: 59 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc1-00010-g5b83fd27752b-dirty #296
>> [ 23.477776] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
>> [ 23.486286] pstate: a0c00009 (NzCv daif +PAN +UAO)
>> [ 23.491065] pc : ata_host_detach+0x15c/0x168
>> [ 23.495322] lr : ata_host_detach+0x88/0x168
>> [ 23.499491] sp : ffff800011cabb50
>> [ 23.502792] x29: ffff800011cabb50 x28: 0000000000000007
>> [ 23.508091] x27: ffff80001137f068 x26: ffff8000112c0c28
>> [ 23.513390] x25: 0000000000003848 x24: ffff0023ea185300
>> [ 23.518689] x23: 0000000000000001 x22: 00000000000014c0
>> [ 23.523987] x21: 0000000000013740 x20: ffff0023bdc20000
>> [ 23.529286] x19: 0000000000000000 x18: 0000000000000004
>> [ 23.534584] x17: 0000000000000001 x16: 00000000000000f0
>> [ 23.539883] x15: ffff0023eac13790 x14: ffff0023eb76c408
>> [ 23.545181] x13: 0000000000000000 x12: ffff0023eac13790
>> [ 23.550480] x11: ffff0023eb76c228 x10: 0000000000000000
>> [ 23.555779] x9 : ffff0023eac13798 x8 : 0000000040000000
>> [ 23.561077] x7 : 0000000000000002 x6 : 0000000000000001
>> [ 23.566376] x5 : 0000000000000002 x4 : 0000000000000000
>> [ 23.571674] x3 : ffff0023bf08a0bc x2 : 0000000000000000
>> [ 23.576972] x1 : 3099674201f72700 x0 : 0000000000400284
>> [ 23.582272] Call trace:
>> [ 23.584706] ata_host_detach+0x15c/0x168
>> [ 23.588616] ata_pci_remove_one+0x10/0x18
>> [ 23.592615] ahci_remove_one+0x20/0x40
>> [ 23.596356] pci_device_remove+0x3c/0xe0
>> [ 23.600267] really_probe+0xdc/0x3e0
>> [ 23.603830] driver_probe_device+0x58/0x100
>> [ 23.608000] device_driver_attach+0x6c/0x90
>> [ 23.612169] __driver_attach+0x84/0xc8
>> [ 23.615908] bus_for_each_dev+0x74/0xc8
>> [ 23.619730] driver_attach+0x20/0x28
>> [ 23.623292] bus_add_driver+0x148/0x1f0
>> [ 23.627115] driver_register+0x60/0x110
>> [ 23.630938] __pci_register_driver+0x40/0x48
>> [ 23.635199] ahci_pci_driver_init+0x20/0x28
>> [ 23.639372] do_one_initcall+0x5c/0x1b0
>> [ 23.643199] kernel_init_freeable+0x1a4/0x24c
>> [ 23.647546] kernel_init+0x10/0x108
>> [ 23.651023] ret_from_fork+0x10/0x18
>> [ 23.654590] ---[ end trace 634a14b675b71c13 ]---
>>
>> With KASAN also enabled, we may also get many use-after-free reports.
>>
>> The issue is that when CONFIG_DEBUG_TEST_DRIVER_REMOVE is set, we may
>> attempt to detach the ata_port before it has been probed.
>>
>> This is because the ata_ports are async probed, meaning that there is no
>> guarantee that the ata_port has probed prior to detach. When the ata_port
>> does probe in this scenario, we get all sorts of issues as the detach may
>> have already happened.
>>
>> Fix by ensuring synchronisation with async_synchronize_full(). We could
>> alternatively use the cookie returned from the ata_port probe
>> async_schedule() call, but that means managing the cookie, so more
>> complicated.
>>
>> Signed-off-by: John Garry <john.garry@...wei.com>
>> ---
>> Note: This has only been boot tested and manual driver remove/add.
>> My system has no disk attached to the ahci host.
>>
>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
>> index 28c492be0a57..74c9b3032d46 100644
>> --- a/drivers/ata/libata-core.c
>> +++ b/drivers/ata/libata-core.c
>> @@ -6708,6 +6708,9 @@ void ata_host_detach(struct ata_host *host)
>> {
>> int i;
>>
>> + /* Ensure ata_port probe has completed */
>> + async_synchronize_full();
>> +
>> for (i = 0; i < host->n_ports; i++)
>> ata_port_detach(host->ports[i]);
>>
>>
>
> Nice debugging, and the fix looks appropriate to me. I don't think
> there's any point in trying to individually synchronize cookies.
> I'll let this simmer on the list for a day or two to let other folks
> take a look at it, before queuing it up.
>
Hi Jens,
FWIW, I did also now test this on qemu with an emulated disk and it was ok.
Anyway, I don't mind if prefer to queue this early for 5.6 so it can sit
on next for longer.
Cheers,
John
Powered by blists - more mailing lists