Message-ID: <aP-TCvzsUS32X9-d@shell.ilvokhin.com>
Date: Mon, 27 Oct 2025 15:43:06 +0000
From: Dmitry Ilvokhin <d@...okhin.com>
To: Yafang Shao <laoar.shao@...il.com>
Cc: Michal Hocko <mhocko@...e.com>,
Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>,
Hugh Dickins <hughd@...gle.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
Kiryl Shutsemau <kas@...nel.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH] mm: shmem/tmpfs hugepage defaults config choice
On Sun, Oct 26, 2025 at 08:12:27PM +0800, Yafang Shao wrote:
> On Fri, Oct 24, 2025 at 7:23 PM Dmitry Ilvokhin <d@...okhin.com> wrote:
> >
> > On Fri, Oct 24, 2025 at 09:38:53AM +0200, Michal Hocko wrote:
> > > On Thu 23-10-25 18:12:02, Dmitry Ilvokhin wrote:
> > > > Allow overriding the defaults for shmem and tmpfs at config time. This is
> > > > consistent with how transparent hugepages can be configured.
> > > >
> > > > Same results can be achieved with the existing
> > > > 'transparent_hugepage_shmem' and 'transparent_hugepage_tmpfs' settings
> > > > in the kernel command line, but it is more convenient to define basic
> > > > settings at config time instead of changing kernel command line later.
> > >
> > > Being consistent is usually nice but you are not telling us _who_ is
> > > going to benefit from this. Increasing the config space is not really
> > > free. So please focus on why we need it rather than on the consistency
> > > argument.
> >
> > Thanks for the feedback, Michal, that totally makes sense to me. I should
> > have expanded on this point in the initial commit message.
> >
> > The primary motivation for adding a config option is to enable policy
> > enforcement at build time. In large-scale production environments
> > (Meta's for example), the kernel configuration is often maintained
> > centrally close to the kernel code itself and owned by the kernel
> > engineers, while boot parameters are managed independently (e.g. by
> > provisioning systems). In such setups, the kernel build defines the
> > supported and expected behavior in a single place, but there is no
> > reliable or uniform control over the kernel command line options.
> >
> > A build-time default allows kernel integrators to enforce a predictable
> > hugepage policy for shmem/tmpfs on a base layer, ensuring reproducible
> > behavior and avoiding configuration drift caused by possible boot-time
> > differences.
>
> I'd like to better understand your kernel deployment strategy. Are you
> maintaining separate kernel images for different environments in your
> fleet? We've found that this approach can introduce significant
> maintenance complexity in the build system.

Thanks for the feedback, Yafang. To clarify, our goal isn't to maintain
separate kernel images for different environments, as we also prefer to
standardize on a single kernel binary wherever possible.

What we'd like to achieve with this change is a consistent baseline
policy for shmem/tmpfs at the lowest possible layer. In particular, we'd
like shmem/tmpfs hugepage usage to be opt-out rather than opt-in. That
is, the kernel would default to using hugepages for shmem/tmpfs (likely
madvise or within_size, rather than always) unless explicitly disabled.
This ensures the desired behavior out of the box, while still allowing
overrides through boot parameters if needed for specific environments.

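
As an illustration only (not part of the patch), the policy that ends up
in effect after the build-time default and any boot-parameter override
can be checked on a running system. The sketch below assumes the usual
sysfs convention where the active choice is shown in square brackets in
/sys/kernel/mm/transparent_hugepage/shmem_enabled; the helper names
active_policy and read_shmem_policy are hypothetical.

```python
import re
from pathlib import Path

# sysfs file exposing the shmem hugepage policy on kernels built with THP
# support. The helper names below are hypothetical, for illustration only.
SHMEM_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/shmem_enabled")


def active_policy(text: str) -> str:
    """Return the bracketed (currently selected) value from a sysfs choice line."""
    match = re.search(r"\[([^\]]+)\]", text)
    if match is None:
        raise ValueError(f"no active value marked in: {text!r}")
    return match.group(1)


def read_shmem_policy(path: Path = SHMEM_ENABLED) -> str:
    """Read the sysfs file and return the policy currently in effect."""
    return active_policy(path.read_text())


if __name__ == "__main__":
    # Sample line in the sysfs format, so the demo runs without THP support.
    sample = "always within_size advise [never] deny force"
    print(active_policy(sample))  # prints "never"
```

A provisioning or monitoring job could compare the value returned by
read_shmem_policy() against the expected baseline to detect drift.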
>
> In our practice, we standardize on a single kernel image across all
> environments and handle variations through dynamic boot parameters.
> This approach is quite straightforward to implement. If you're
> concerned about uncontrolled environments, you could set default
> values like shmem_enabled and tmpfs_enabled to 'never', then
> explicitly enable them only in approved environments.
>
> >
> > In short, the primary benefit is mostly operational: it provides a way to
> > codify preferred policy in the kernel configuration, which is versioned,
> > reviewed, and tested as part of the kernel build process, rather than
> > depending on potentially variable boot parameters.
> >
> > I hope the possible operational benefits outweigh the downsides of
> > increasing the config space. Please let me know if this argument sounds
> > reasonable to you; I'll rephrase the commit message for v2 to include
> > this reasoning.
> >
>
> --
> Regards
> Yafang