Message-ID: <Zocc+6nIQzfUTPpd@dread.disaster.area>
Date: Fri, 5 Jul 2024 08:06:51 +1000
From: Dave Chinner <david@...morbit.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Ryan Roberts <ryan.roberts@....com>,
"Pankaj Raghav (Samsung)" <kernel@...kajraghav.com>,
chandan.babu@...cle.com, djwong@...nel.org, brauner@...nel.org,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
yang@...amperecomputing.com, linux-mm@...ck.org,
john.g.garry@...cle.com, linux-fsdevel@...r.kernel.org,
hare@...e.de, p.raghav@...sung.com, mcgrof@...nel.org,
gost.dev@...sung.com, cl@...amperecomputing.com,
linux-xfs@...r.kernel.org, hch@....de, Zi Yan <zi.yan@...t.com>
Subject: Re: [PATCH v8 01/10] fs: Allow fine-grained control of folio sizes

On Thu, Jul 04, 2024 at 04:20:13PM +0100, Matthew Wilcox wrote:
> On Thu, Jul 04, 2024 at 01:23:20PM +0100, Ryan Roberts wrote:
> > > - AS_LARGE_FOLIO_SUPPORT = 6,
> >
> > nit: this removed enum is still referenced in a comment further down the file.
>
> Thanks. Pankaj, let me know if you want me to send you a patch or if
> you'll do it directly.
>
> > > + /* Bits 16-25 are used for FOLIO_ORDER */
> > > + AS_FOLIO_ORDER_BITS = 5,
> > > + AS_FOLIO_ORDER_MIN = 16,
> > > + AS_FOLIO_ORDER_MAX = AS_FOLIO_ORDER_MIN + AS_FOLIO_ORDER_BITS,
> >
> > nit: These 3 new enums seem a bit odd.
>
> Yes, this is "too many helpful suggestions" syndrome. It made a lot
> more sense originally.
>
> https://lore.kernel.org/linux-fsdevel/ZlUQcEaP3FDXpCge@dread.disaster.area/
>
> > > +static inline void mapping_set_folio_order_range(struct address_space *mapping,
> > > + unsigned int min,
> > > + unsigned int max)
> > > +{
> > > + if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
> > > + return;
> > > +
> > > + if (min > MAX_PAGECACHE_ORDER)
> > > + min = MAX_PAGECACHE_ORDER;
> > > + if (max > MAX_PAGECACHE_ORDER)
> > > + max = MAX_PAGECACHE_ORDER;
> > > + if (max < min)
> > > + max = min;
> >
> > It seems strange to silently clamp these? Presumably for the bs>ps usecase,
> > whatever values are passed in are a hard requirement? So wouldn't want them to
> > be silently reduced. (Especially given the recent change to reduce the size of
> > MAX_PAGECACHE_ORDER to less than PMD size in some cases).
>
> Hm, yes. We should probably make this return an errno. Including
> returning an errno for !IS_ENABLED() and min > 0.

What are callers supposed to do with an error? In the case of
setting up a newly allocated inode in XFS, the error would be
returned in the middle of a transaction and so this failure would
result in a filesystem shutdown.

Regardless, the filesystem should never be passing min >
MAX_PAGECACHE_ORDER any time soon for bs > ps configurations. Block
sizes go up to 64kB, which is a lot smaller than the largest folio
size MAX_PAGECACHE_ORDER allows. IOWs, seeing min >
MAX_PAGECACHE_ORDER is indicative of a severe bug; it should be
considered a fatal developer mistake and the kernel terminated
immediately. Such mistakes should -never, ever- happen on production
systems. IOWs, this is a situation where we should assert or BUG()
and kill the kernel immediately, or at minimum WARN_ON_ONCE(),
truncate the value and hope things don't get immediately worse.
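
To make the arithmetic and the "warn once and truncate" fallback
concrete, here is a rough userspace sketch. It is not the patch and
not kernel code: the sketch_* names are made up for illustration, the
4kB page size and the limit of 9 are assumptions standing in for
PAGE_SHIFT and MAX_PAGECACHE_ORDER, and fprintf() stands in for a
kernel warning.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumptions for illustration only; real values are build-dependent. */
#define SKETCH_PAGE_SHIFT          12u /* assumes 4kB pages */
#define SKETCH_MAX_PAGECACHE_ORDER 9u  /* hypothetical stand-in limit */

/* Smallest order whose folio size (page size << order) covers one block. */
static unsigned int sketch_block_order(unsigned long block_size)
{
	unsigned int order = 0;

	while ((1UL << (SKETCH_PAGE_SHIFT + order)) < block_size)
		order++;
	return order;
}

/*
 * Clamp the range as the quoted helper does, but complain (once) when
 * min is out of range, since that indicates a caller bug.
 * Returns true if min had to be truncated.
 */
static bool sketch_clamp_order_range(unsigned int *min, unsigned int *max)
{
	static bool warned;
	bool min_truncated = false;

	if (*min > SKETCH_MAX_PAGECACHE_ORDER) {
		if (!warned) {
			fprintf(stderr, "min order %u exceeds limit %u\n",
				*min, SKETCH_MAX_PAGECACHE_ORDER);
			warned = true;
		}
		*min = SKETCH_MAX_PAGECACHE_ORDER;
		min_truncated = true;
	}
	if (*max > SKETCH_MAX_PAGECACHE_ORDER)
		*max = SKETCH_MAX_PAGECACHE_ORDER;
	if (*max < *min)
		*max = *min;
	return min_truncated;
}
```

Under those assumptions a 64kB block size maps to order 4, well below
the assumed limit, so the warning path is only reachable through a
caller bug.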

If we kill the kernel because min is out of range, the system will
fail on the first inode instantiation on that filesystem.
Filesystem developers should notice that sort of failure pretty
quickly and realise they've done something that isn't currently
supported...

-Dave.
--
Dave Chinner
david@...morbit.com