Message-Id: <2a1024f7-af7f-474f-8b1c-aa5e0d4bd17a@app.fastmail.com>
Date: Wed, 24 Jul 2024 10:50:54 +0200
From: "Arnd Bergmann" <arnd@...nel.org>
To: "Juergen Gross" <jgross@...e.com>,
 "Lorenzo Stoakes" <lorenzo.stoakes@...cle.com>,
 "Andrew Morton" <akpm@...ux-foundation.org>,
 "David Laight" <david.laight@...lab.com>
Cc: "Matthew Wilcox" <willy@...radead.org>,
 "Linus Torvalds" <torvalds@...ux-foundation.org>,
 "Jason A . Donenfeld" <Jason@...c4.com>,
 "Christoph Hellwig" <hch@...radead.org>,
 "Andy Shevchenko" <andriy.shevchenko@...ux.intel.com>,
 pedro.falcato@...il.com, "Mateusz Guzik" <mjguzik@...il.com>,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: Build performance regressions originating from min()/max() macros

On Wed, Jul 24, 2024, at 10:14, Jürgen Groß wrote:
> On 23.07.24 23:59, Lorenzo Stoakes wrote:
>> 
>> And resulted in the generation of 47 MB (!) of pre-processor output.
>> 
>> It seems a lot of code now relies on the relaxed conditions of the newly
>> changed min/max() macros, so the question is - what can we do to address
>> these regressions?
>
> I can send a patch to simplify the problematic construct, but OTOH this
> will avoid only one particularly bad example.

It's probably a good idea to change the xen/setup.c file anyway,
as I haven't found any other file that had a regression this bad,
and it only needs a single temporary variable for a 1000x speedup.
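
To illustrate why one temporary helps so much, here is a minimal
sketch. It does not reproduce the actual xen/setup.c code: my_min(),
my_max(), limit_nested() and limit_temp() are hypothetical names, and
the macros are simplified stand-ins that only mimic how the real ones
expand each argument more than once.

```c
/* Simplified stand-ins (assumption: not the kernel's definitions).
 * Each argument appears twice in the expansion, so an argument that
 * is itself a macro call gets pasted in twice. */
#define my_min(x, y) ((x) < (y) ? (x) : (y))
#define my_max(x, y) ((x) > (y) ? (x) : (y))

/* Nested form: every level of nesting multiplies the size of the
 * preprocessed expression. */
static unsigned long limit_nested(unsigned long a, unsigned long b,
                                  unsigned long c, unsigned long cap)
{
        return my_min(cap, my_max(a, my_max(b, c)));
}

/* Same result with one temporary: the outer macro now only sees a
 * plain identifier, so its expansion stays small. */
static unsigned long limit_temp(unsigned long a, unsigned long b,
                                unsigned long c, unsigned long cap)
{
        unsigned long biggest = my_max(a, my_max(b, c));

        return my_min(cap, biggest);
}
```

With the real macros the per-argument expansion count is much higher
than two, which is why the effect of nesting is so much more dramatic
there than in this sketch.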

For the overall kernel, I see at best a 2.3% speedup (20 seconds of
CPU time) from replacing the current min()/max() macros with a version
that drops both the constant expression output feature and the
assertion, measured on an x86 defconfig build, which has xen
disabled. On a defconfig+xen kernel, that difference increases
to 4.4%, or 37 seconds.

Removing only the constexpr side requires a handful of fixups
for x86 allmodconfig to replace min()/max() with something else in

drivers/edac/sb_edac.c
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
drivers/gpu/drm/drm_color_mgmt.c
drivers/input/touchscreen/cyttsp4_core.c
drivers/md/dm-integrity.c
drivers/net/can/usb/etas_es58x/es58x_devlink.c
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
fs/btrfs/tree-checker.c
lib/vsprintf.c
net/ipv4/proc.c
net/ipv6/proc.c

Removing only the constexpr side gives about half the speed
difference. The other half comes from removing the assertion,
but that is not a good idea unless we can replace it with an
equivalent assertion that works on the unique_x/unique_y variables
instead of expanding the arguments.
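
As a rough sketch of what an assertion on the temporaries could look
like (an assumption for illustration, not a proposed patch):
checked_min(), checked_min_impl(), UNIQ() and IS_SIGNED() are
hypothetical names, the check is a plain signedness comparison that is
stricter than what the current macros accept, and it only handles
integer types. It relies on the GNU C extensions __auto_type,
__typeof__ and statement expressions.

```c
/* Build a unique identifier from a prefix and __COUNTER__. */
#define CONCAT_(a, b) a##b
#define CONCAT(a, b) CONCAT_(a, b)
#define UNIQ(prefix) CONCAT(prefix, __COUNTER__)

/* True if integer type t is signed. */
#define IS_SIGNED(t) ((t)-1 < (t)1)

/* x and y each appear exactly once. The assertion only looks at the
 * types of the temporaries ux/uy, so the arguments are never pasted
 * into the check. */
#define checked_min_impl(x, y, ux, uy) ({                       \
        __auto_type ux = (x);                                   \
        __auto_type uy = (y);                                   \
        _Static_assert(IS_SIGNED(__typeof__(ux)) ==             \
                       IS_SIGNED(__typeof__(uy)),               \
                       "checked_min(): signedness mismatch");   \
        ux < uy ? ux : uy; })

#define checked_min(x, y) \
        checked_min_impl(x, y, UNIQ(ux_), UNIQ(uy_))
```

Because each argument is expanded once, nesting such as
checked_min(checked_min(a, b), c) grows linearly rather than
multiplicatively. The cost is that the result is no longer a constant
expression, which is the constexpr side discussed above.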

     Arnd
