Message-ID: <20250311110034.53959031@erd003.prtnl>
Date: Tue, 11 Mar 2025 11:00:34 +0100
From: David Jander <david@...tonic.nl>
To: Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
Cc: Kent Gibson <warthog618@...il.com>, Linus Walleij
 <linus.walleij@...aro.org>, linux-gpio@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: regression: gpiolib: switch the line state notifier to atomic
 unexpected impact on performance


Dear Bartosz,

After updating the kernel from 6.11 to 6.14, a user-space application that
uses GPIOs heavily became extremely slow, to the point that I will need to
modify it substantially to make it usable again.
I traced the problem down to the following patch that went into 6.13:

fcc8b637c542 gpiolib: switch the line state notifier to atomic

What happens here is that gpio_chrdev_release() now calls
atomic_notifier_chain_unregister(), which uses RCU and therefore calls
synchronize_rcu(). synchronize_rcu() waits for an RCU grace period to elapse
before returning and, according to the documentation, can cause a delay of up
to several milliseconds. In fact it seems to take 8-10ms on my system (an
STM32MP153C single-core Cortex-A7).

As a result, each close() on a /dev/gpiochipX now takes ~10ms. If I git-revert
this commit, close() takes less than 1ms.
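
A minimal user-space sketch for measuring the close() latency described
above. It assumes only POSIX open(), close() and clock_gettime(); the helper
names elapsed_ms() and time_close_ms() are hypothetical and are not part of
libgpiod or the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Elapsed milliseconds between two CLOCK_MONOTONIC timestamps. */
double elapsed_ms(const struct timespec *start, const struct timespec *end)
{
	return (end->tv_sec - start->tv_sec) * 1e3 +
	       (end->tv_nsec - start->tv_nsec) / 1e6;
}

/*
 * Open `path` read-only and time only the close() call.
 * Returns 0 and stores the duration in *close_ms, or -1 on error.
 * Hypothetical helper for illustration, not a libgpiod function.
 */
int time_close_ms(const char *path, double *close_ms)
{
	struct timespec t0, t1;
	int fd, ret;

	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	ret = close(fd);	/* ends up in gpio_chrdev_release() for a gpiochip */
	clock_gettime(CLOCK_MONOTONIC, &t1);
	if (ret < 0)
		return -1;

	*close_ms = elapsed_ms(&t0, &t1);
	return 0;
}
```

Calling time_close_ms("/dev/gpiochip0", &ms) on an affected kernel and on a
kernel with the commit reverted should make the difference visible without
involving libgpiod at all.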

10ms doesn't sound like much, but it is more than 10x the time it took before,
and unfortunately libgpiod calls close() very often in some places, especially
in find_line() if your board has many gpiochips (mine has 16 chardevs).

The effect can easily be reproduced with the gpiofind tool:

Running on kernel 6.12:

$ time gpiofind LPOUT0
gpiochip7 9
real    0m 0.02s
user    0m 0.00s
sys     0m 0.01s

Running on kernel 6.13:

$ time gpiofind LPOUT0
gpiochip7 9
real    0m 0.19s
user    0m 0.00s
sys     0m 0.01s

That is almost a 10x increase in execution time of the whole program!!

On kernel 6.13, after git revert -n fcc8b637c542 time is back to what it was
on 6.12.
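
As a rough sanity check, the measured run time is consistent with one extra
RCU grace period per closed chardev. The sketch below assumes that a single
gpiofind run opens and closes all 16 chardevs and that the 6.12 run time
(~20ms) is the baseline. Both are assumptions, and estimate_scan_ms() is a
hypothetical helper.

```c
/*
 * Rough model: total run time = baseline + one extra delay per chardev
 * closed. All values are in milliseconds. Hypothetical helper for
 * illustration only.
 */
double estimate_scan_ms(unsigned int n_chips, double extra_close_ms,
			double baseline_ms)
{
	return baseline_ms + (double)n_chips * extra_close_ms;
}
```

With 16 chardevs, a 20ms baseline and 8-10ms of extra delay per close(), this
gives roughly 148-180ms, which is in line with the 0.19s measured on 6.13.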

Unfortunately I can't come up with an easy solution to this problem, which is
why I don't have a patch to propose. Sorry for that.

I still think it is a bit alarming this change has such a huge impact. IMHO it
really shouldn't. What can be done about this? Is it maybe possible to defer
unregistering and freeing to a kthread and return from the release function
earlier?

Best regards,

-- 
David Jander
