Message-ID: <c22a0f1c-883d-5122-ef88-0d7c57ab4e66@pengutronix.de>
Date:   Tue, 15 Mar 2022 17:23:00 +0100
From:   Ahmad Fatoum <a.fatoum@...gutronix.de>
To:     squashfs-devel@...ts.sourceforge.net,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Cc:     Pengutronix Kernel Team <kernel@...gutronix.de>
Subject: Possible performance regression with CONFIG_SQUASHFS_DECOMP_SINGLE

Hello,

This is an issue we hit with v5.15 and have since successfully worked around.
I am reporting it here as a pointer in case someone else runs into it, and as
a heads-up that there seems to be an underlying performance regression. Here
it goes:

We have an i.MX8MM (4x Cortex-A53) system with squashfs on eMMC as a root file
system. The system originally ran NXP's "imx_5.4.24_2.1.0" which has about
5000 patches on top of upstream v5.4.24 including PREEMPT_RT.

The system was updated to mainline Linux + PREEMPT_RT, and boot time suffered
considerably, growing from 40s with the vendor kernel to 1m20s with the
mainline-based kernel.

The slowdown on mainline was reproducible for all scheduling models (with
or without PREEMPT_RT) except for PREEMPT_NONE, which was back at 40s.

The services most impacted by the slowdown were C++ applications with many
shared libraries dynamically loaded from the rootfs.

Looking through the original kernel configuration we found that it has
CONFIG_SQUASHFS_DECOMP_SINGLE=y and CONFIG_SQUASHFS_FILE_CACHE=y.

Once changed to CONFIG_SQUASHFS_FILE_DIRECT=y and
CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU=y, we were back below 40s, as desired.
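For reference, the relevant config delta boils down to (option names as in
upstream Kconfig; the rest of our config is unchanged):

```
# before (inherited from the vendor config)
CONFIG_SQUASHFS_FILE_CACHE=y
CONFIG_SQUASHFS_DECOMP_SINGLE=y

# after (the fix)
CONFIG_SQUASHFS_FILE_DIRECT=y
CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU=y
```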

That's clearly the preferred configuration, and it resolves our problem.
It doesn't solve the underlying issue, though:

  - CONFIG_PREEMPT_VOLUNTARY performs much worse than CONFIG_PREEMPT_NONE
    for some workloads when CONFIG_SQUASHFS_DECOMP_SINGLE=y

  - And this might not have been the case with v5.4. Unfortunately, we can't
    bisect, because mainline lacked sufficient i.MX8MM support back then to
    boot the system. The earliest mainline-based kernel we reproduced this
    on was v5.11.

TL;DR: Check if CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU=y in your configuration
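A quick way to check is grepping the squashfs options out of your kernel
config. The snippet below runs against an inline sample config for
illustration; point the grep at your build's .config instead, or at
`zcat /proc/config.gz` if your kernel has CONFIG_IKCONFIG_PROC=y:

```shell
# Sample config standing in for a real .config / /proc/config.gz
cat > /tmp/sample.config <<'EOF'
CONFIG_SQUASHFS=y
CONFIG_SQUASHFS_FILE_DIRECT=y
CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU=y
CONFIG_SQUASHFS_XZ=y
EOF

# Show which file-access and decompressor options are set
grep -E '^CONFIG_SQUASHFS_(DECOMP|FILE)' /tmp/sample.config
```

If the output shows CONFIG_SQUASHFS_DECOMP_SINGLE=y rather than
CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU=y, you may be affected.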

Cheers,
Ahmad

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
