Message-ID: <b3c80659-8879-6e7c-e732-5fb690b7bc97@huaweicloud.com>
Date: Wed, 31 Jan 2024 09:35:22 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Yu Kuai <yukuai1@...weicloud.com>, Mikulas Patocka <mpatocka@...hat.com>
Cc: heinzm@...hat.com, xni@...hat.com, agk@...hat.com, snitzer@...nel.org,
dm-devel@...ts.linux.dev, song@...nel.org, jbrassow@....redhat.com,
neilb@...e.de, shli@...com, akpm@...l.org, linux-kernel@...r.kernel.org,
linux-raid@...r.kernel.org, yi.zhang@...wei.com, yangerkun@...wei.com,
"yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH RFC v4 13/14] dm: wait for IO completion before removing
dm device
Hi,
在 2024/01/30 21:05, Yu Kuai 写道:
> Hi,
>
> 在 2024/01/30 19:46, Mikulas Patocka 写道:
>>
>>
>> On Tue, 30 Jan 2024, Yu Kuai wrote:
>>
>>> From: Yu Kuai <yukuai3@...wei.com>
>>>
>>> __dm_destroy() guarantees that the number of device openers is zero,
>>> and only then calls 'presuspend' and 'postsuspend' for the targets.
>>> For request-based dm, 'md->holders' is grabbed for each rq and
>>> __dm_destroy() waits for 'md->holders' to drop to zero. However, for
>>> bio-based devices, __dm_destroy() doesn't wait for all bios to be done.
>>>
>>> Fix this problem by calling dm_wait_for_completion() to wait for all
>>> inflight IO to be done, as dm_suspend() does.
>>
>> If the number of openers is zero, it is guaranteed that there are no bios
>> in flight. Therefore, we don't have to wait for them.
>>
>> If there are bios in flight, it is a bug in the code that issues the
>> bios.
>> You can put WARN_ON(dm_in_flight_bios(md)) there.
>
> I added this patch because, while testing, there is a problem that is
> hard to reproduce, as I mentioned in the other thread. I'll add BUG_ON()
> and see if I can still reproduce the problem without triggering it.
>
> Thanks,
> Kuai
>
> [12504.959682] BUG bio-296 (Not tainted): Object already free
> [12504.960239]
> -----------------------------------------------------------------------------
>
> [12504.960239]
> [12504.961209] Allocated in mempool_alloc+0xe8/0x270 age=30 cpu=1 pid=203288
> [12504.961905] kmem_cache_alloc+0x36a/0x3b0
> [12504.962324] mempool_alloc+0xe8/0x270
> [12504.962712] bio_alloc_bioset+0x3b5/0x920
> [12504.963129] bio_alloc_clone+0x3e/0x160
> [12504.963533] alloc_io+0x3d/0x1f0
> [12504.963876] dm_submit_bio+0x12f/0xa30
> [12504.964267] __submit_bio+0x9c/0xe0
> [12504.964639] submit_bio_noacct_nocheck+0x25a/0x570
> [12504.965136] submit_bio_wait+0xc2/0x160
> [12504.965535] blkdev_issue_zeroout+0x19b/0x2e0
> [12504.965991] ext4_init_inode_table+0x246/0x560
> [12504.966462] ext4_lazyinit_thread+0x750/0xbe0
> [12504.966922] kthread+0x1b4/0x1f0
After adding the BUG_ON(), I can still reproduce this issue. It really
looks like a bug, and I don't think it is related to dm-raid. Perhaps
you guys can take a look?
Thanks,
Kuai
>>
>> Mikulas
>>
>>> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
>>> ---
>>> drivers/md/dm.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
>>> index 8dcabf84d866..2c0eae67d0f1 100644
>>> --- a/drivers/md/dm.c
>>> +++ b/drivers/md/dm.c
>>> @@ -58,6 +58,7 @@ static DEFINE_IDR(_minor_idr);
>>> static DEFINE_SPINLOCK(_minor_lock);
>>> static void do_deferred_remove(struct work_struct *w);
>>> +static int dm_wait_for_completion(struct mapped_device *md, unsigned int task_state);
>>> static DECLARE_WORK(deferred_remove_work, do_deferred_remove);
>>> @@ -2495,6 +2496,8 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
>>> if (!dm_suspended_md(md)) {
>>> dm_table_presuspend_targets(map);
>>> set_bit(DMF_SUSPENDED, &md->flags);
>>> + if (wait)
>>> + dm_wait_for_completion(md, TASK_UNINTERRUPTIBLE);
>>> set_bit(DMF_POST_SUSPENDING, &md->flags);
>>> dm_table_postsuspend_targets(map);
>>> }
>>> --
>>> 2.39.2
>>>
>>
>> .
>>
>
> .
>