Message-ID: <48ee3f26-7dbc-4c59-b98d-f9aeed980a43@redhat.com>
Date: Sat, 1 Nov 2025 08:43:51 +0000 (UTC)
From: Gabriele Monaco <gmonaco@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Juri Lelli <juri.lelli@...hat.com>, linux-kernel@...r.kernel.org,
	Ingo Molnar <mingo@...hat.com>, Clark Williams <williams@...hat.com>,
	arighi@...dia.com
Subject: Re: [RFC PATCH] sched/deadline: Avoid dl_server boosting with
 expired deadline

2025-11-01T00:08:37Z Peter Zijlstra <peterz@...radead.org>:

> On Fri, Oct 31, 2025 at 04:41:22PM +0100, Gabriele Monaco wrote:
>> On Fri, 2025-10-31 at 16:20 +0100, Peter Zijlstra wrote:
>>> On Fri, Oct 31, 2025 at 02:24:17PM +0100, Gabriele Monaco wrote:
>>>>
>>>> Different scenario if I have the CPU busy with other tasks (e.g. RT
>>>> policies), there I can see the server stopping and starting again.
>>>> After I do this I seem to get a different behaviour (even some boosting
>>>> after idle), I'm trying to understand what's going on.
>>>>
>>
>> After running some heavy RT workload (stress-ng --cpu 10 --sched rr) I do see
>> the server stopping and starting as the models would expect, but somehow it's
>> always boosting as soon as it's started.
>>
>> Apparently dl_defer_running is always 1 in that scenario. Perhaps running idle
>> counts as running something too, so it never defers. But I can't really see how
>> this happens..
>
> The transition [4], will retain dl_defer_running, such that a timely
> re-start of the dl_server can immediately run again.

Alright, I worded it poorly. As far as I understand, what you mentioned is the desired behaviour when handling starvation: we don't defer, and we start the next period boosting.
What I was observing instead was the server staying running indefinitely.
I ran a test with 5s of RR stress-ng followed by 30s of a mostly idle DL workload on a clean VM. I expected boosting only during the first 5 seconds, but I also see it afterwards, when there was clearly no starvation (the system was idle, which is probably a bit hard to see from the trace I shared).

Thanks for the updated patch, I'll try that and see how it goes.

Gabriele

