Message-ID: <aJDIZNyoehnzIqS2@jlelli-thinkpadt14gen4.remote.csb>
Date: Mon, 4 Aug 2025 15:49:08 +0100
From: Juri Lelli <juri.lelli@...hat.com>
To: David Haufe <dhaufe@...plextrading.com>
Cc: linux-kernel@...r.kernel.org
Subject: Re: Kernel 6.14.11 dl_server_timer(...) causing IPI/Function Call
 Interrupts on isolcpu/nohz_full cores, performance regression

On 01/08/25 10:28, David Haufe wrote:
> I am sorry, but we cannot get this branch to boot on our hardware.
> Looking through the code of the branch, it will not address the issue.
> I believe the issue is more fundamental. In
> fair.c->enqueue_task_fair(), dl_server_start() is called when the
> single fair/SCHED_OTHER task is added to the isolcpu/nohz_full core.
> The check here is simply checking if there is 1 or more process and
> kicks off the dl_server_start() and the housekeeping timer in
> start_dl_timer(). Once this timer is running, it will invoke
> dl_server_timer() continuously. This timer calls __enqueue_dl_entity()
> and then inc_dl_tasks(). inc_dl_tasks() increments
> dl_rq->dl_nr_running++ and invokes add_nr_running(). This code will
> eventually call the sched_can_stop_tick() function but
> rq->dl.dl_nr_running now != 0, so this function will always return
> false. Something needs to be done to prevent this timer from running
> in the first place, or maybe have some checks around single
> "fair/SCHED_OTHER/etc" process running on an isolcpu/nohz_full core
> which prevents the need for the deadline code to run for the core.
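
The chain described in the quoted text can be illustrated with a toy model. This is only a sketch under stated assumptions: `struct toy_rq` and the `toy_*` helpers are hypothetical names, not kernel code, and the real `sched_can_stop_tick()` checks more conditions than this. It shows only the effect reported above: once the dl_server entity is counted in `dl_nr_running`, the tick-stop check fails even with a single fair task.

```c
#include <stdbool.h>

/*
 * Toy model, NOT the kernel's struct rq. The field names echo the
 * counters mentioned in the thread; everything else is hypothetical.
 */
struct toy_rq {
	unsigned int dl_nr_running;	/* deadline entities counted on this CPU */
	unsigned int cfs_nr_running;	/* fair (SCHED_OTHER) tasks on this CPU */
};

/* Models dl_server_timer() -> __enqueue_dl_entity() -> inc_dl_tasks(). */
static void toy_dl_server_enqueue(struct toy_rq *rq)
{
	rq->dl_nr_running++;
}

/* Models the dl_server entity being dequeued again. */
static void toy_dl_server_dequeue(struct toy_rq *rq)
{
	if (rq->dl_nr_running)
		rq->dl_nr_running--;
}

/*
 * Simplified stand-in for the tick-stop decision: any counted deadline
 * entity keeps the tick, otherwise at most one fair task is allowed.
 */
static bool toy_can_stop_tick(const struct toy_rq *rq)
{
	if (rq->dl_nr_running != 0)
		return false;
	return rq->cfs_nr_running <= 1;
}
```

In this model a CPU with one fair task may stop its tick until `toy_dl_server_enqueue()` runs, and may stop it again only after the matching dequeue.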

The fix commit I mentioned should at least make entering nohz_full work
again even when the dl_server is active (but deferred). We still have
one dl_server_timer firing each second (after a recent additional fix by
Peter), though. At least this is what I am seeing on my end.

I will try to see if we can remove that periodic timer once nohz_full
mode is entered.

Thanks,
Juri

