Message-ID: <YV1EO9dsVSwWW7ua@dhcp22.suse.cz>
Date:   Wed, 6 Oct 2021 08:37:47 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Antoine Tenart <atenart@...nel.org>
Cc:     davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
        gregkh@...uxfoundation.org, ebiederm@...ssion.com,
        stephen@...workplumber.org, herbert@...dor.apana.org.au,
        juri.lelli@...hat.com, netdev@...r.kernel.org,
        Jiri Bohac <jbohac@...e.cz>
Subject: Re: [RFC PATCH net-next 0/9] Userspace spinning on net-sysfs access

On Tue 28-09-21 14:54:51, Antoine Tenart wrote:
> Hello,

Hi,
thanks for posting this. Coincidentally, we came across a similar
problem just recently.

> What made those syscalls to spin is the following construction (which is
> found a lot in net sysfs and sysctl code):
> 
>   if (!rtnl_trylock())
>           return restart_syscall();

One of our customers is using Prometheus (https://github.com/prometheus/prometheus)
for monitoring, and they have noticed that running several instances of
node-exporter can lead to high CPU utilization. After some
investigation it turned out that most instances are busy looping on
one of the sysfs files while one instance is processing the sysfs speed
file for the mlx driver, which performs quite a convoluted set of
operations (sending commands to the lower layers via workqueues) to talk
to the device to get the information.

The problem becomes more visible with more instances of node-exporter
running in parallel. This triggers monitoring alarms on the machine in
question because the high CPU utilization is not expected.
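
To illustrate why the waiting instances burn CPU, here is a rough
single-threaded model of the trylock-and-restart construction quoted
above. It is only a sketch under stated assumptions: model_trylock(),
model_unlock() and count_restarts() are hypothetical names, not kernel
functions, and the "ticks" merely stand in for the time the slow speed
query keeps the lock held.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the RTNL lock: a plain flag, no real threads. */
static bool lock_held;

/* Non-blocking acquire: returns false immediately if the lock is taken. */
static bool model_trylock(void)
{
	if (lock_held)
		return false;
	lock_held = true;
	return true;
}

static void model_unlock(void)
{
	lock_held = false;
}

/*
 * Count how often a reader restarts while a slow holder keeps the lock
 * for hold_ticks iterations. Each failed trylock models one
 * "return restart_syscall();" round trip; the reader never sleeps,
 * so every iteration is pure CPU time.
 */
static unsigned long count_restarts(unsigned long hold_ticks)
{
	unsigned long restarts = 0;
	unsigned long tick = 0;

	lock_held = true;		/* the slow holder takes the lock first */
	for (;;) {
		if (tick == hold_ticks)
			model_unlock();	/* the holder finally finishes */
		if (model_trylock())
			break;		/* the reader gets the lock at last */
		restarts++;		/* lock busy: restart and try again */
		tick++;
	}
	model_unlock();
	return restarts;
}
```

In this model the number of restarts grows linearly with the time the
lock is held, and every additional waiting instance would repeat the
same loop on its own CPU.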

I would appreciate it if you CCed me on the next versions of this patchset.

Thanks for working on this!
-- 
Michal Hocko
SUSE Labs
