Message-Id: <13706e9e-541a-4c09-b104-2d7272d0a2fa@fnnas.com>
Date: Thu, 25 Dec 2025 15:32:17 +0800
From: "Yu Kuai" <yukuai@...as.com>
To: <linan666@...weicloud.com>, <song@...nel.org>
Cc: <linux-raid@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<yangerkun@...wei.com>, <yi.zhang@...wei.com>, <yukuai@...as.com>
Subject: Re: [PATCH] md/raid5: Fix a deadlock of reshape and suspend
Hi,
On 2025/11/24 16:45, linan666@...weicloud.com wrote:
> From: Li Nan <linan122@...wei.com>
>
> Commit 868bba54a3bc ("md/raid5: fix a deadlock in the case that reshape is
> interrupted") fixed a raid deadlock of reshape, but a similar issue is hit
> by mdadm test 25raid456-reshape-deadlock.
>
> INFO: task (udev-worker):63822 blocked for more than 122 seconds.
> Not tainted 6.18.0-rc2-g0555b5424915-dirty #153
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> __schedule
> schedule
> schedule_timeout
> wait_woken
> raid5_make_request
> md_handle_request
> md_submit_bio
> [...]
> blkdev_read_iter
> vfs_read
> ksys_read
> __x64_sys_read
>
> It is triggered by:
> 1) normal IO waits for reshape to progress
> 2) user sets ACTION_FROZEN via ioctl
> 3) reshape is interrupted and cannot restart
> 4) users try to suspend array while active IO waits reshape
>
> Following Kuai's previous fix, such IOs should fail in
> make_stripe_request(). Thus, set a timeout for wait_woken() to fix
> the deadlock, and blocked IO will fail in the next cycle.
>
> Signed-off-by: Li Nan <linan122@...wei.com>
> ---
> drivers/md/raid5.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index cdbc7eba5c54..957e712d2be9 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -6185,7 +6185,7 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
> }
>
> wait_woken(&wait, TASK_UNINTERRUPTIBLE,
> - MAX_SCHEDULE_TIMEOUT);
> + msecs_to_jiffies(10000));
Instead of waking up every 10s unconditionally, can you fix this by waking up
the waiter synchronously when the array is frozen or suspended and reshape
can't continue?
> continue;
> }
>
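To illustrate the difference, here is a minimal user-space sketch, not kernel
code and not a proposed patch. It uses pthreads as a stand-in for the kernel
wait queue; struct array_state, wait_for_reshape(), freeze_array() and
run_demo() are all hypothetical names made up for this example. The waiter
sleeps with no timeout, and the path that freezes the array sets a flag and
wakes it explicitly, so the blocked IO fails right away instead of on the next
10s tick.

```c
/* User-space analogy only: a pthread condvar stands in for the wait queue. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct array_state {                /* hypothetical, for illustration */
	pthread_mutex_t lock;
	pthread_cond_t  wait;       /* stand-in for the reshape wait queue */
	bool reshape_progressed;    /* blocked IO may now proceed */
	bool frozen;                /* reshape cannot continue */
	int  io_result;             /* filled in by the IO thread */
};

/* Returns 0 if the IO may proceed, -1 if it must fail because reshape stopped. */
static int wait_for_reshape(struct array_state *s)
{
	int ret;

	pthread_mutex_lock(&s->lock);
	/* No timeout: sleep until one of the two conditions is signalled. */
	while (!s->reshape_progressed && !s->frozen)
		pthread_cond_wait(&s->wait, &s->lock);
	ret = s->reshape_progressed ? 0 : -1;
	pthread_mutex_unlock(&s->lock);
	return ret;
}

/* The freeze path sets the flag, then wakes every waiter synchronously. */
static void freeze_array(struct array_state *s)
{
	pthread_mutex_lock(&s->lock);
	s->frozen = true;
	pthread_cond_broadcast(&s->wait);
	pthread_mutex_unlock(&s->lock);
}

static void *io_thread(void *arg)
{
	struct array_state *s = arg;

	s->io_result = wait_for_reshape(s);
	return NULL;
}

/* One IO waits for reshape, then the array is frozen; returns the IO result. */
static int run_demo(void)
{
	struct array_state s = {
		.lock = PTHREAD_MUTEX_INITIALIZER,
		.wait = PTHREAD_COND_INITIALIZER,
	};
	pthread_t t;
	int err;

	err = pthread_create(&t, NULL, io_thread, &s);
	assert(err == 0);
	freeze_array(&s);
	pthread_join(t, NULL);
	return s.io_result;
}
```

Here run_demo() returns -1 whether the freeze happens before or after the IO
thread goes to sleep, because the flag is checked under the same lock that the
freeze path holds while setting it. In raid5 the analogous step would be a
wake_up() on the relevant wait queue from the freeze/suspend path; which queue
and which call site are left open here.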
--
Thanks,
Kuai