linux-kernel - Re: [PATCH v5 09/14] dm-raid: really frozen sync

Message-ID: <3c062731-3b74-2b3e-94c8-ffdf940df014@huaweicloud.com>
Date: Sun, 18 Feb 2024 14:34:00 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Xiao Ni <xni@...hat.com>, Yu Kuai <yukuai1@...weicloud.com>
Cc: mpatocka@...hat.com, heinzm@...hat.com, blazej.kucman@...ux.intel.com,
 agk@...hat.com, snitzer@...nel.org, dm-devel@...ts.linux.dev,
 song@...nel.org, jbrassow@....redhat.com, neilb@...e.de, shli@...com,
 akpm@...l.org, linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
 yi.zhang@...wei.com, yangerkun@...wei.com, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH v5 09/14] dm-raid: really frozen sync_thread during
 suspend

Hi,

On 2024/02/18 12:53, Xiao Ni wrote:
> Hi Kuai
> 
> On Thu, Feb 1, 2024 at 5:30 PM Yu Kuai <yukuai1@...weicloud.com> wrote:
>>
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> 1) The flag MD_RECOVERY_FROZEN doesn't mean that sync thread is frozen,
>>     it only prevent new sync_thread to start, and it can't stop the
>>     running sync thread;
> 
> Agree with this
> 
>> 2) The flag MD_RECOVERY_FROZEN doesn't mean that writes are stopped, use
>>     it as condition for md_stop_writes() in raid_postsuspend() doesn't
>>     look correct.
> 
> I don't agree with it. __md_stop_writes stops sync thread, so it needs
> to check this flag. And It looks like the name __md_stop_writes is not
> right. Does it really stop write io? mddev_suspend should be the
> function that stop write request. From my understanding,
> raid_postsuspend does two jobs. One is stopping sync thread. Two is
> suspending array.

MD_RECOVERY_FROZEN is not just used in __md_stop_writes(), so I don't
think it's correct to check it here. For example, if MD_RECOVERY_FROZEN
is set by raid_message(), then __md_stop_writes() will be skipped.
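
To make the skipped call concrete, here is a minimal userspace sketch of
the old and new raid_postsuspend() conditions. It is not kernel code:
struct model_mddev and every model_* function are hypothetical stand-ins,
and the sketch assumes the "frozen" message does nothing except set the
flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model, not kernel code. */
struct model_mddev {
	bool recovery_frozen;   /* stands in for MD_RECOVERY_FROZEN */
	bool stop_writes_done;  /* set once the md_stop_writes() stand-in ran */
};

/* Stand-in for the "frozen" message: assumed to only set the flag. */
static void model_message_frozen(struct model_mddev *m)
{
	m->recovery_frozen = true;
}

/* Stand-in for md_stop_writes(): sets the flag and records that it ran. */
static void model_stop_writes(struct model_mddev *m)
{
	m->recovery_frozen = true;
	m->stop_writes_done = true;
}

/* Old condition: the call is skipped whenever the flag is already set. */
static void model_postsuspend_old(struct model_mddev *m)
{
	if (!m->recovery_frozen)
		model_stop_writes(m);
}

/* New condition from this patch: the call is unconditional. */
static void model_postsuspend_new(struct model_mddev *m)
{
	model_stop_writes(m);
}

/* Usage: freeze via message first, then run each postsuspend variant. */
static void model_demo(void)
{
	struct model_mddev a = { false, false };
	struct model_mddev b = { false, false };

	model_message_frozen(&a);
	model_postsuspend_old(&a);
	assert(!a.stop_writes_done);  /* skipped because the flag was set */

	model_message_frozen(&b);
	model_postsuspend_new(&b);
	assert(b.stop_writes_done);   /* always runs */
}
```

In this model the flag says nothing about who set it, so the old check
cannot tell "already stopped" apart from "frozen by a message".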

> 
>> 3) raid_message can set/clear the flag MD_RECOVERY_FROZEN at anytime,
>>     and if MD_RECOVERY_FROZEN is cleared while the array is suspended,
>>     new sync_thread can start unexpected.
> 
> md_action_store doesn't check this either. If the array is suspended
> and MD_RECOVERY_FROZEN is cleared, before patch01, sync thread can't
> happen. So it looks like patch01 breaks the logic.

The difference is that md/raid doesn't need to freeze the sync_thread
while suspending the array for now. And I don't understand at all why
the sync thread can't happen before patch01.
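
For point 3, the guard this patch adds to raid_message() can be sketched
in userspace as follows. Again this is only a model: struct
model_raid_set and model_raid_message() are hypothetical names, and the
two booleans stand in for RT_FLAG_RS_SUSPENDED and MD_RECOVERY_FROZEN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <strings.h>

/* Hypothetical userspace model, not kernel code. */
struct model_raid_set {
	bool suspended;        /* stands in for RT_FLAG_RS_SUSPENDED */
	bool recovery_frozen;  /* stands in for MD_RECOVERY_FROZEN */
};

/*
 * Models the patched message path: while suspended, the message is
 * rejected with -EBUSY and the frozen flag is left untouched.
 */
static int model_raid_message(struct model_raid_set *rs, const char *action)
{
	if (rs->suspended)
		return -EBUSY;

	if (!strcasecmp(action, "frozen"))
		rs->recovery_frozen = true;
	else
		rs->recovery_frozen = false;

	return 0;
}

/* Usage: a message sent during suspend cannot clear the flag. */
static void model_message_demo(void)
{
	struct model_raid_set rs = { true, true };

	assert(model_raid_message(&rs, "idle") == -EBUSY);
	assert(rs.recovery_frozen);

	rs.suspended = false;
	assert(model_raid_message(&rs, "idle") == 0);
	assert(!rs.recovery_frozen);
}
```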

Thanks,
Kuai

> 
> Regards
> Xiao
> 
> 
>>
>> Fix above problems by using the new helper to suspend the array during
>> suspend, also disallow raid_message() to change sync_thread status
>> during suspend.
>>
>> Note that after commit f52f5c71f3d4 ("md: fix stopping sync thread"), the
>> test shell/lvconvert-raid-reshape.sh start to hang in stop_sync_thread(),
>> and with previous fixes, the test won't hang there anymore, however, the
>> test will still fail and complain that ext4 is corrupted. And with this
>> patch, the test won't hang due to stop_sync_thread() or fail due to ext4
>> is corrupted anymore. However, there is still a deadlock related to
>> dm-raid456 that will be fixed in following patches.
>>
>> Reported-by: Mikulas Patocka <mpatocka@...hat.com>
>> Closes: https://lore.kernel.org/all/e5e8afe2-e9a8-49a2-5ab0-958d4065c55e@redhat.com/
>> Fixes: 1af2048a3e87 ("dm raid: fix deadlock caused by premature md_stop_writes()")
>> Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target")
>> Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
>> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
>> ---
>>   drivers/md/dm-raid.c | 38 +++++++++++++++++++++++++++++---------
>>   1 file changed, 29 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
>> index eb009d6bb03a..5ce3c6020b1b 100644
>> --- a/drivers/md/dm-raid.c
>> +++ b/drivers/md/dm-raid.c
>> @@ -3240,11 +3240,12 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
>>          rs->md.ro = 1;
>>          rs->md.in_sync = 1;
>>
>> -       /* Keep array frozen until resume. */
>> -       set_bit(MD_RECOVERY_FROZEN, &rs->md.recovery);
>> -
>>          /* Has to be held on running the array */
>>          mddev_suspend_and_lock_nointr(&rs->md);
>> +
>> +       /* Keep array frozen until resume. */
>> +       md_frozen_sync_thread(&rs->md);
>> +
>>          r = md_run(&rs->md);
>>          rs->md.in_sync = 0; /* Assume already marked dirty */
>>          if (r) {
>> @@ -3722,6 +3723,9 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv,
>>          if (!mddev->pers || !mddev->pers->sync_request)
>>                  return -EINVAL;
>>
>> +       if (test_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags))
>> +               return -EBUSY;
>> +
>>          if (!strcasecmp(argv[0], "frozen"))
>>                  set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>>          else
>> @@ -3791,15 +3795,31 @@ static void raid_io_hints(struct dm_target *ti, struct queue_limits *limits)
>>          blk_limits_io_opt(limits, chunk_size_bytes * mddev_data_stripes(rs));
>>   }
>>
>> +static void raid_presuspend(struct dm_target *ti)
>> +{
>> +       struct raid_set *rs = ti->private;
>> +
>> +       mddev_lock_nointr(&rs->md);
>> +       md_frozen_sync_thread(&rs->md);
>> +       mddev_unlock(&rs->md);
>> +}
>> +
>> +static void raid_presuspend_undo(struct dm_target *ti)
>> +{
>> +       struct raid_set *rs = ti->private;
>> +
>> +       mddev_lock_nointr(&rs->md);
>> +       md_unfrozen_sync_thread(&rs->md);
>> +       mddev_unlock(&rs->md);
>> +}
>> +
>>   static void raid_postsuspend(struct dm_target *ti)
>>   {
>>          struct raid_set *rs = ti->private;
>>
>>          if (!test_and_set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) {
>>                  /* Writes have to be stopped before suspending to avoid deadlocks. */
>> -               if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery))
>> -                       md_stop_writes(&rs->md);
>> -
>> +               md_stop_writes(&rs->md);
>>                  mddev_suspend(&rs->md, false);
>>          }
>>   }
>> @@ -4012,8 +4032,6 @@ static int raid_preresume(struct dm_target *ti)
>>          }
>>
>>          /* Check for any resize/reshape on @rs and adjust/initiate */
>> -       /* Be prepared for mddev_resume() in raid_resume() */
>> -       set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>>          if (mddev->recovery_cp && mddev->recovery_cp < MaxSector) {
>>                  set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
>>                  mddev->resync_min = mddev->recovery_cp;
>> @@ -4056,9 +4074,9 @@ static void raid_resume(struct dm_target *ti)
>>                          rs_set_capacity(rs);
>>
>>                  mddev_lock_nointr(mddev);
>> -               clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>>                  mddev->ro = 0;
>>                  mddev->in_sync = 0;
>> +               md_unfrozen_sync_thread(mddev);
>>                  mddev_unlock_and_resume(mddev);
>>          }
>>   }
>> @@ -4074,6 +4092,8 @@ static struct target_type raid_target = {
>>          .message = raid_message,
>>          .iterate_devices = raid_iterate_devices,
>>          .io_hints = raid_io_hints,
>> +       .presuspend = raid_presuspend,
>> +       .presuspend_undo = raid_presuspend_undo,
>>          .postsuspend = raid_postsuspend,
>>          .preresume = raid_preresume,
>>          .resume = raid_resume,
>> --
>> 2.39.2
>>