linux-ext4 - Re: dm: use queue_limits

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZkuFuqo3dNw8bOA2@kernel.org>
Date: Mon, 20 May 2024 13:17:46 -0400
From: Mike Snitzer <snitzer@...nel.org>
To: Christoph Hellwig <hch@....de>
Cc: Theodore Ts'o <tytso@....edu>, dm-devel@...ts.linux.dev,
	fstests@...r.kernel.org, linux-ext4@...r.kernel.org,
	regressions@...ts.linux.dev, linux-block@...r.kernel.org
Subject: Re: dm: use queue_limits_set

[replying for completeness to explain what I think is happening for
the issue Ted reported]

On Mon, May 20, 2024 at 11:54:53AM -0400, Mike Snitzer wrote:
> On Mon, May 20, 2024 at 05:44:25PM +0200, Christoph Hellwig wrote:
> > On Mon, May 20, 2024 at 11:39:14AM -0400, Mike Snitzer wrote:
> > > That's fair.  My criticism was more about having to fix up DM targets
> > > to cope with the new normal of max_discard_sectors being set as a
> > > function of max_hw_discard_sectors and max_user_discard_sectors.
> > > 
> > > With stacked devices in particular it is _very_ hard for the user to
> > > know their exerting control over a max discard limit is correct.
> > 
> > The user forcing a limit is always very sketchy, which is why I'm
> > not a fan of it.
> > 
> > > Yeah, but my concern is that if a user sets a value that is too low
> > > it'll break targets like DM thinp (which Ted reported).  So forcibly
> > > setting both to indirectly set the required max_discard_sectors seems
> > > necessary.

Could also be that a user sets the max discard too large (e.g. larger
than thinp's BIO_PRISON_MAX_RANGE).

> > Dm-think requiring a minimum discard size is a rather odd requirement.
> > Is this just a debug asswert, or is there a real technical reason
> > for it?  If so we can introduce a now to force a minimum size or
> > disable user setting the value entirely. 
> 
> thinp's discard implementation is constrained by the dm-bio-prison's
> constraints.  One of the requirements of dm-bio-prison is that a
> discard not exceed BIO_PRISON_MAX_RANGE.
> 
> My previous reply is a reasonible way to ensure best effort to respect
> a users request but that takes into account the driver provided
> discard_granularity.  It'll force suboptimal (too small) discards be
> issued but at least they'll cover a full thinp block.

Given below, this isn't at the heart of the issue Ted reported.  So
the change to ensure max_discard_sectors is a factor of
discard_granularity, while worthwhile, isn't critical to fixing the
reported issue.

> > > diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> > > index 4793ad2aa1f7..c196f39579af 100644
> > > --- a/drivers/md/dm-thin.c
> > > +++ b/drivers/md/dm-thin.c
> > > @@ -4497,7 +4499,8 @@ static void thin_io_hints(struct dm_target *ti, struct queue_limits *limits)
> > >  
> > >  	if (pool->pf.discard_enabled) {
> > >  		limits->discard_granularity = pool->sectors_per_block << SECTOR_SHIFT;
> > > -		limits->max_discard_sectors = pool->sectors_per_block * BIO_PRISON_MAX_RANGE;
> > > +		limits->max_hw_discard_sectors = limits->max_user_discard_sectors =
> > > +			pool->sectors_per_block * BIO_PRISON_MAX_RANGE;
> > >  	}
> > 
> > Drivers really have no business setting max_user_discard_sector,
> > the whole point of the field is to separate device/driver capabilities
> > from user policy.  So if dm-think really has no way of handling
> > smaller discards, we need to ensure they can't be set.
> 
> It can handle smaller so long as they respect discard_granularity.
> 
> > I'm also kinda curious what actually sets a user limit in Ted's case
> > as that feels weird.
> 
> I agree, not sure... maybe the fstests is using the knob?

Doubt there was anything in fstests setting max discard user limit
(max_user_discard_sectors) in Ted's case. blk_set_stacking_limits()
sets max_user_discard_sectors to UINT_MAX, so given the use of
min(lim->max_hw_discard_sectors, lim->max_user_discard_sectors) I
suspect blk_stack_limits() stacks up max_discard_sectors to match the
underlying storage's max_hw_discard_sectors.

And max_hw_discard_sectors exceeds BIO_PRISON_MAX_RANGE, resulting in
dm_cell_key_has_valid_range() triggering on:
WARN_ON_ONCE(key->block_end - key->block_begin > BIO_PRISON_MAX_RANGE)

Mike