Message-ID: <ZkmRKPfPeX3c138f@kernel.org>
Date: Sun, 19 May 2024 01:42:00 -0400
From: Mike Snitzer <snitzer@...nel.org>
To: Theodore Ts'o <tytso@....edu>
Cc: dm-devel@...ts.linux.dev, fstests@...r.kernel.org,
linux-ext4@...r.kernel.org, regressions@...ts.linux.dev,
Christoph Hellwig <hch@....de>, linux-block@...r.kernel.org
Subject: Re: dm: use queue_limits_set
On Sun, May 19, 2024 at 01:05:40AM -0400, Mike Snitzer wrote:
> Hi Ted,
>
> On Fri, May 17, 2024 at 10:26:46PM -0400, Theodore Ts'o wrote:
> > #regzbot introduced: 1c0e720228ad
> >
> > While doing final regression testing before sending a pull request for
> > the ext4 tree, I found a regression which was triggered by generic/347
> > and generic/405 on multiple fstests configurations, including
> > both ext4/4k and xfs/4k.
> >
> > It bisects cleanly to commit 1c0e720228ad ("dm: use
> > queue_limits_set"), and the resulting WARNING is attached below. This
> > stack trace can be seen for both generic/347 and generic/405. And if
> > I revert this commit on top of linux-next, the failure goes away, so
> > it pretty clearly points to 1c0e720228ad as the root cause.
> >
> > For now, I'll add generic/347 and generic/405 to my global exclude
> > file, but maybe we should consider reverting the commit if it can't be
> > fixed quickly?
>
> Commit 1c0e720228ad is a red herring: it merely switches DM over to
> using queue_limits_set(), which I now see clearly disregards DM's
> desire to disable discards (in blk_validate_limits).
>
> It looks like the combo of commit d690cb8ae14bd ("block: add an API to
> atomically update queue limits") and 4f563a64732da ("block: add a
> max_user_discard_sectors queue limit") needs fixing.
>
> Here is one potential fix based on the code inspection I've done so
> far; please see if it resolves your fstests failures (but I haven't
> actually looked at those fstests yet _and_ I still need to review
> commits d690cb8ae14bd and 4f563a64732da further -- will do on Monday,
> sorry for the trouble):
I took an earlier look; this fix is needed (max_user_discard_sectors
makes discard limits stacking suck more than it already did -- imho
4f563a64732da is worthy of a revert. Short of that, dm-cache-target.c
and possibly other DM targets will need fixes too -- I'll go over it
all on Monday):
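
To see why zeroing max_discard_sectors alone no longer sticks, here is
a minimal standalone sketch of the clamp that blk_validate_limits() now
applies (paraphrased from my reading of those two commits; the struct
and helper below are simplified stand-ins, not verbatim kernel code):

/* sketch.c: illustration only, not kernel code */
#include <stdio.h>
#include <limits.h>

/* Simplified stand-in for the discard-related queue_limits fields. */
struct queue_limits {
	unsigned int max_discard_sectors;
	unsigned int max_hw_discard_sectors;
	unsigned int max_user_discard_sectors;
};

/*
 * Paraphrase of the relevant validation logic: a zero user limit is
 * treated as "no user cap", and max_discard_sectors is recomputed
 * from the hw and user limits -- so whatever a driver stuffed into
 * max_discard_sectors directly is simply overwritten.
 */
static void validate_discard_limits(struct queue_limits *lim)
{
	if (!lim->max_user_discard_sectors)
		lim->max_user_discard_sectors = UINT_MAX;
	lim->max_discard_sectors =
		lim->max_hw_discard_sectors < lim->max_user_discard_sectors ?
		lim->max_hw_discard_sectors : lim->max_user_discard_sectors;
}

int main(void)
{
	struct queue_limits lim = {
		.max_discard_sectors = 0,	/* the old way to disable */
		.max_hw_discard_sectors = 1024,
	};

	validate_discard_limits(&lim);
	/* Prints 1024: the attempted disable was silently undone. */
	printf("old way: max_discard_sectors = %u\n", lim.max_discard_sectors);

	lim.max_hw_discard_sectors = 0;	/* the new way: zero the inputs */
	lim.max_user_discard_sectors = 0;
	validate_discard_limits(&lim);
	/* Prints 0: the disable now survives revalidation. */
	printf("new way: max_discard_sectors = %u\n", lim.max_discard_sectors);

	return 0;
}

Accordingly, the fix below expresses the disable in terms of the limits
that the validation actually derives max_discard_sectors from: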
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 4793ad2aa1f7..c196f39579af 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -4099,8 +4099,10 @@ static void pool_io_hints(struct dm_target *ti, struct queue_limits *limits)
 
 	if (pt->adjusted_pf.discard_enabled) {
 		disable_discard_passdown_if_not_supported(pt);
-		if (!pt->adjusted_pf.discard_passdown)
-			limits->max_discard_sectors = 0;
+		if (!pt->adjusted_pf.discard_passdown) {
+			limits->max_hw_discard_sectors = 0;
+			limits->max_user_discard_sectors = 0;
+		}
 		/*
 		 * The pool uses the same discard limits as the underlying data
 		 * device. DM core has already set this up.
@@ -4497,7 +4499,8 @@ static void thin_io_hints(struct dm_target *ti, struct queue_limits *limits)
 	struct pool *pool = tc->pool;
 
 	if (pool->pf.discard_enabled) {
 		limits->discard_granularity = pool->sectors_per_block << SECTOR_SHIFT;
-		limits->max_discard_sectors = pool->sectors_per_block * BIO_PRISON_MAX_RANGE;
+		limits->max_hw_discard_sectors = limits->max_user_discard_sectors =
+			pool->sectors_per_block * BIO_PRISON_MAX_RANGE;
 	}
 }
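
Both hunks follow the same idea: set max_hw_discard_sectors and
max_user_discard_sectors, the inputs blk_validate_limits() derives
max_discard_sectors from, rather than writing the derived field
directly and having queue_limits_set() quietly undo it.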