Message-ID: <aE6tFoBkF3tl9aeH@gmail.com>
Date: Sun, 15 Jun 2025 13:23:02 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, "H . Peter Anvin" <hpa@...or.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Borislav Petkov <bp@...en8.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Vitaly Kuznetsov <vkuznets@...hat.com>,
	Jürgen Groß <jgross@...e.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ard Biesheuvel <ardb@...nel.org>, Arnd Bergmann <arnd@...db.de>,
	David Woodhouse <dwmw@...zon.co.uk>,
	Masahiro Yamada <yamada.masahiro@...ionext.com>,
	Michal Marek <michal.lkml@...kovi.net>,
	Rik van Riel <riel@...riel.com>
Subject: Re: [PATCH 09/13] x86/kconfig/64: Enable popular MM options in the
 defconfig


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Thu, 15 May 2025 at 06:28, Ingo Molnar <mingo@...nel.org> wrote:
> >
> > Since the x86 defconfig aims to be a distro kernel work-alike with
> > fewer drivers and a shorter build time, enable the following
> > MM options that are typically enabled on major Linux distributions:
> 
> Ingo, PLEASE STOP.
> 
> This whole "enable random crap that distros enable" is completely 
> pointless.
> 
> If you want a distro config, then USE the distro config, for 
> chrissake!
> 
> The defconfig should be some sane configuration for NORMAL PEOPLE.
> 
> Not for cloud providers - get rid of the stupid cloud virt stuff.
> 
> Not for distros - get rid of the silly "distros enable this".
> 
> For NORMAL people. People who don't know what they should do without 
> a default config. People who just have a random machine that they 
> want to run Linux on and need an initial config for.
>
> This whole "enable random things just because a distro has bad taste 
> and enables them" is BROKEN.

Okay, first off, no arguments from me about the way forward - I've just 
nuked the following commits:

  011f3ac16949 ("x86/kconfig/64: Enable the KVM host in the defconfig")
  e76fe3432a2e ("x86/kconfig/64: Enable more virtualization guest options in the defconfig: Enable Xen, Xen_PVH, Jailhouse, ACRN, Intel TDX and Hyper-V")
  1093fbcf57ad ("x86/kconfig/64: Enable BPF support in the defconfig")
  4e96a8b1eb76 ("x86/kconfig/64: Enable popular MM options in the defconfig")
  53bc35f2d937 ("x86/kconfig/64: Enable popular kernel debugging options in the defconfig")
  c0fa33249920 ("x86/kconfig/64: Enable popular scheduler, cgroups and namespaces options in the defconfig")
  475cf81e4fda ("x86/kconfig/64: Enable popular generic kernel options in the defconfig")
  9e3d5f041005 ("x86/kconfig/32: Synchronize the x86-32 defconfig to the x86-64 defconfig")
  c86ec5635d07 ("x86/kconfig: Remove the CONFIG_DRM_I915=y driver from the defconfig")
  7ce421edd9fc ("x86/kconfig/defconfig: Enable CONFIG_DRM_FBDEV_EMULATION=y")

Which is almost all of this series. These commits are not coming back. 
Clearly my approach of using the lowest common denominator of distro 
kernel configs is not appreciated and I have no desire whatsoever to 
fight such pushback.

As background on what I was trying to do with this series:

1) Why more hypervisor guest driver enablement? Firstly, a significant 
   percentage of x86 patch contributions come from CPU vendors, cloud 
   vendors and Linux distributions, so I thought it useful to make it 
   easier for all of them to test their changes in their own 
   environments out of the box - and for them to be better aware of any 
   interactions between their environments. Yes, they can each 
   individually enable their own options, but that's not what end users 
   end up using. I didn't think this (or frankly *any*) aspect of the 
   series was particularly controversial, as we already enable support 
   for obscure machine variants in the x86 defconfig:

        CONFIG_CPU_SUP_HYGON=y
        CONFIG_CPU_SUP_CENTAUR=y 
        CONFIG_CPU_SUP_ZHAOXIN=y

   And had a bunch of virtualization guest options enabled in the 
   defconfig as well (before this series):

	starship:~/tip> make defconfig; grep -E 'KVM|VIRT|GUEST|HYPER' .config | grep =y

	CONFIG_HYPERVISOR_GUEST=y
	CONFIG_PARAVIRT=y
	CONFIG_KVM_GUEST=y
	CONFIG_PARAVIRT_CLOCK=y
	CONFIG_VIRTUALIZATION=y
	CONFIG_NET_9P_VIRTIO=y
	CONFIG_VIRTIO_BLK=y
	CONFIG_SCSI_VIRTIO=y
	CONFIG_VIRTIO_NET=y
	CONFIG_VIRTIO_CONSOLE=y
	CONFIG_PTP_1588_CLOCK_KVM=y
	CONFIG_DRM_VIRTIO_GPU=y
	CONFIG_DRM_VIRTIO_GPU_KMS=y
	CONFIG_DMA_VIRTUAL_CHANNELS=y
	CONFIG_VIRTIO_ANCHOR=y
	CONFIG_VIRTIO=y
	CONFIG_VIRTIO_PCI_LIB=y
	CONFIG_VIRTIO_PCI_LIB_LEGACY=y
	CONFIG_VIRTIO_MENU=y
	CONFIG_VIRTIO_PCI=y
	CONFIG_VIRTIO_PCI_ADMIN_LEGACY=y
	CONFIG_VIRTIO_PCI_LEGACY=y
	CONFIG_VIRTIO_INPUT=y
	CONFIG_VIRTIO_DMA_SHARED_BUFFER=y

   Why not make the defconfig work out of the box for the testing 
   environments of a broader group of our actual contributors, as long 
   as the build cost isn't overly high?

   Secondly, even outside of cloud vendors, many kernel developers use 
   some sort of simple virtual environment to test their patches, but 
   our defconfig often doesn't boot & work, while distro kernels mostly 
   work but take a lot of time to build.

   defconfigs are useful if they work, as the difference between a 
   ~30-second defconfig build and a ~4-minute distro config build is 
   enormous for test-iteration speed:

     defconfig:                      34.67 seconds time elapsed
     distro config+localmodconfig:   58.07 seconds time elapsed
     allmodconfig+localmodconfig:    90.36 seconds time elapsed
     distro config:                 227.86 seconds time elapsed
     allmodconfig:                  317.60 seconds time elapsed

   And that's on my very fast desktop.

   Even 'make distro-config+localmodconfig', where a tester manually 
   uses a distro config and disables all modules not loaded at the 
   moment, is about 1.7x slower to build in practice. Full distro 
   kernels are about 6.6x slower to build, allmodconfig kernels about 
   9x slower to build - no surprises there.
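
The slowdown factors follow from dividing each measured time by the defconfig time. A minimal sketch of that arithmetic, using only the five timings quoted above; the `slowdowns` helper name and the one-word labels are illustrative, not part of any kernel tooling:

```shell
#!/bin/sh
# Print each build time relative to the first (baseline) line.
# Input lines have the form: "<label> <seconds>"
slowdowns() {
    awk '
        NR == 1 { base = $2 }                       # first line is the baseline
        { printf "%-30s %8.2fs %6.2fx\n", $1, $2, $2 / base }
    '
}

# The measurements quoted in the text above.
slowdowns <<'EOF'
defconfig 34.67
distro-config+localmodconfig 58.07
allmodconfig+localmodconfig 90.36
distro-config 227.86
allmodconfig 317.60
EOF
```

The last column is the slowdown relative to the defconfig build; the prose rounds these factors.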

   In fact on a typical modern desktop that our developers and testers 
   are using, I'd estimate build times to be more along the lines of:

     defconfig:                      ~60 seconds time elapsed
     distro config+localmodconfig:  ~120 seconds time elapsed
     allmodconfig+localmodconfig:   ~180 seconds time elapsed
     distro config:                 ~440 seconds time elapsed
     allmodconfig:                  ~640 seconds time elapsed

   Third, and building upon the previous point, bisecting a bug that 
   triggers in a distro kernel is a *very* time-consuming process in 
   part due to the very long build times, and very few testers end up 
   being able to (or willing to) do that when they report bugs.

   So, at least on x86, over the years the defconfig has morphed into a 
   kind of lowest common denominator config that is fast to build but 
   which is still mostly relevant to our users. (and obviously it 
   shouldn't enable anything crazy, and any crazy in this series is my 
   fault alone.) The defconfig kernel's code generation quality gets 
   checked, and it gets tested first and can be used for longer 
   bisections.
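
To put rough numbers on the bisection cost: a bisection over N commits needs roughly log2(N) build-and-test rounds. The sketch below multiplies that step count by a per-build time. The function name and the 1024-commit range are hypothetical, and boot and test time are ignored; only the two per-build times come from the measurements above.

```shell
#!/bin/sh
# Estimate the build-only cost of a bisection.
#   $1 = number of commits in the suspect range (hypothetical input)
#   $2 = seconds per kernel build
bisect_build_seconds() {
    awk -v n="$1" -v t="$2" 'BEGIN {
        steps = 0
        while (n > 1) { n = n / 2; steps++ }   # halve the range: ceil(log2(n)) rounds
        printf "%d steps, %.0f seconds\n", steps, steps * t
    }'
}

bisect_build_seconds 1024 34.67     # defconfig build time
bisect_build_seconds 1024 227.86    # distro config build time
```

Under these assumptions the build time alone is about 6 minutes with the defconfig versus about 38 minutes with a full distro config, before any time spent booting and testing each step.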

   After this series, the following build method is actually expected 
   to boot and work on a wide(r) range of x86 systems, physical and 
   virtual systems included, and result in a kernel close to what 
   distro kernels are doing in the field:

     $ make defconfig localyesconfig

   ... and which is still very fast to build:

      34.11 seconds time elapsed

   So this series attempted to broaden the x86 defconfig to more of 
   what our developers and users are using in practice, while staying 
   within the bounds of what our Kconfig space allows and recommends. 
   (And any deviation from that principle is my fault.)


2) The other motivation for this series was that the reality is that 
   99.9% of Linux users use a distro kernel, and our defconfig became 
   rather detached from that reality.

   I've noticed that we Linux kernel developers are in a kind of 
   isolated microcosm with homebrewed configs that have random kernel 
   options enabled/disabled, with the occasional strong opinions about 
   some of those options, and we are often totally unaware of the 
   actual runtime overhead and code generation realities in the distro 
   kernels that 99.9% of our users use every single day... This is a 
   suboptimal social dynamic and indicates a broken development 
   feedback loop IMHO.

   Let's take CONFIG_KSM as an example, which PeterZ says sucks 
   security-wise. Yet it's enabled in literally *every* single Linux 
   distribution out there that I managed to check:

     .config.distro.debian.x86_32:        CONFIG_KSM=y
     .config.distro.opensuse.default:     CONFIG_KSM=y
     .config.distro.fedora.generic:       CONFIG_KSM=y
     .config.distro.rhel.generic:         CONFIG_KSM=y
     .config.distro.ubuntu:               CONFIG_KSM=y

   If CONFIG_KSM, a feature that was merged upstream 15+ years ago, is 
   indeed unsafe and/or stupid, why is it still in the upstream kernel 
   to begin with? We are effectively denying reality by pretending that 
   it doesn't exist, while 99.9% of our users are using it...
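
A check like the per-distro listing above can be reproduced with a small helper. This is a sketch only: `option_state` is a hypothetical name, and it assumes the usual .config conventions, where an option appears either as `CONFIG_FOO=...` or as `# CONFIG_FOO is not set`.

```shell
#!/bin/sh
# Report how one Kconfig option is set in each given config file.
# Usage: option_state CONFIG_KSM <config-file>...
option_state() {
    opt=$1
    shift
    for f in "$@"; do
        # Match "CONFIG_X=..." or "# CONFIG_X is not set", but not CONFIG_X_SUFFIX.
        line=$(grep -E "^(# )?${opt}( is not set|=)" "$f" | head -n 1)
        printf '%s: %s\n' "$f" "${line:-$opt not present}"
    done
}

# Example invocation, assuming distro configs saved locally under these names:
#   option_state CONFIG_KSM .config.distro.*
```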

   The Kconfig help text for CONFIG_KSM appears to be unhelpful and 
   misleading:

     config KSM
        bool "Enable KSM for page merging"
        depends on MMU
        select XXHASH
        help
          Enable Kernel Samepage Merging: KSM periodically scans those areas
          of an application's address space that an app has advised may be
          mergeable.  When it finds pages of identical content, it replaces
          the many instances by a single page with that content, so
          saving memory until one or another app needs to modify the content.
          Recommended for use with KVM, or with other duplicative applications.
          See Documentation/mm/ksm.rst for more information: KSM is inactive
          until a program has madvised that an area is MADV_MERGEABLE, and
          root has set /sys/kernel/mm/ksm/run to 1 (if CONFIG_SYSFS is set).

   It doesn't say that it's unsafe. In fact the upstream kernel's 
   official help text says that this feature is:

     "Recommended for use with KVM, or with other duplicative applications."

   ... which by its plain reading makes it sound useful to testers and 
   distros, with no tradeoffs mentioned whatsoever. Why wouldn't 
   testers and distros enable it?
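
As the help text notes, building KSM in is not the same as KSM being active: it also needs root to write 1 to /sys/kernel/mm/ksm/run and an application to madvise() a region as MADV_MERGEABLE. The sketch below reports the state of that sysfs knob on a running system. The `ksm_state` name is hypothetical, and the meaning of the values 0, 1 and 2 is taken from the kernel's KSM documentation rather than from this thread.

```shell
#!/bin/sh
# Describe the KSM run state from the sysfs knob named in the help text.
#   $1 optionally overrides the path (handy for testing without sysfs).
ksm_state() {
    knob=${1:-/sys/kernel/mm/ksm/run}
    if [ ! -r "$knob" ]; then
        echo "KSM not available (cannot read $knob)"
        return 0
    fi
    case $(cat "$knob") in
        0) echo "KSM built in, ksmd stopped" ;;
        1) echo "KSM running: MADV_MERGEABLE areas are being scanned" ;;
        2) echo "KSM stopped, merged pages unmerged" ;;
        *) echo "unexpected value in $knob" ;;
    esac
}

ksm_state
```

On a kernel with CONFIG_KSM=y where nobody has enabled the knob, this would be expected to report the stopped state.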

   That a 'stupid' or 'broken' kernel option is default-disabled in 
   practice has almost no relevance and doesn't help in filtering good 
   kernel options from bad ones: almost all new kernel options, even 
   useful ones we'd like distros to enable, are default-disabled, and 
   stay so even if all distributions end up enabling them.

   I.e. the development feedback loop is somewhat broken in these cases, 
   because by offering a .config feature the upstream kernel tacitly 
   acknowledges distros enabling options that key upstream developers 
   consider 'stupid' or 'broken'.

TL;DR: in this specific example I'm advocating for one of four 
outcomes:

  - Fix CONFIG_KSM if it can be fixed,
  - ... or remove CONFIG_KSM if it's unsafe and cannot be fixed,
  - ... or explain why it's safe and can be enabled,
  - ... or at least *document* that it's unsafe/stupid, so that distros 
        don't end up enabling it ...

Because the status quo of kernel developers often ignoring what 99.9% 
of our users are running, and summarily declaring that distro kernels 
enable "stupid" configs while our Kconfig help text describes nothing 
of the sort, kinda sucks and is a double standard at best, isn't it?

Thanks,

	Ingo
