Message-ID: <92690eb9158c1019dc0945f8298800cad17cae05.camel@codethink.co.uk>
Date: Mon, 19 May 2025 15:32:27 +0200
From: Marcel Ziswiler <marcel.ziswiler@...ethink.co.uk>
To: luca abeni <luca.abeni@...tannapisa.it>
Cc: Juri Lelli <juri.lelli@...hat.com>, linux-kernel@...r.kernel.org, Ingo
 Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, Vineeth
 Pillai	 <vineeth@...byteword.org>
Subject: Re: SCHED_DEADLINE tasks missing their deadline with
 SCHED_FLAG_RECLAIM jobs in the mix (using GRUB)

Hi Luca

Thanks, and sorry for my late reply. I was traveling the Cretan wilderness without access to any work-related
infrastructure.

On Wed, 2025-05-07 at 22:25 +0200, luca abeni wrote:
> Hi Marcel,
> 
> just a quick question to better understand your setup (and check where
> the issue comes from):
> in the email below, you say that tasks are statically assigned to
> cores; how did you do this? Did you use isolated cpusets,

Yes, we use the cpuset controller from the cgroup-v2 API in the Linux kernel to partition CPUs and
memory nodes. In detail, we use the AllowedCPUs and
AllowedMemoryNodes settings in systemd's slice configurations.
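
To double-check that a slice really ends up with the intended CPUs, one could read the cgroup-v2 file
cpuset.cpus.effective for that slice and expand the CPU list. The following is only a minimal sketch:
the helper names are made up for illustration, and the cgroup directory passed to effective_cpus() is
a hypothetical path that depends on the slice names in use.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> set[int]:
    """Expand a kernel CPU list such as "0-2,5" into {0, 1, 2, 5}."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means no CPUs are assigned
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def effective_cpus(cgroup_dir: str) -> set[int]:
    """Read cpuset.cpus.effective below cgroup_dir (hypothetical path)."""
    return parse_cpu_list(Path(cgroup_dir, "cpuset.cpus.effective").read_text())
```

For example, parse_cpu_list("0-2,5") yields the set {0, 1, 2, 5}.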

> or did you
> set the tasks affinities after disabling the SCHED_DEADLINE admission
> control (echo -1 > /proc/sys/kernel/sched_rt_runtime_us)?

No.

> Or am I misunderstanding your setup?

No, I don't think so.

> Also, are you using HRTICK_DL?

No, not that I am aware of, and definitely not on the ROCK5Bs. Our amd64 configuration currently does not even
enable SCHED_DEBUG, so I am not sure how to easily check the specific HRTICK feature set in that case.
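
Where the scheduler features file is exposed through debugfs, checking HRTICK_DL could look roughly like
the sketch below. It assumes the usual format in which a disabled feature is listed with a NO_ prefix.
The file location is an assumption (typically /sys/kernel/debug/sched/features, or
/sys/kernel/debug/sched_features on older kernels), and the function name and sample string are made up
for illustration.

```python
def parse_sched_features(text: str) -> dict[str, bool]:
    """Map each scheduler feature name to True (enabled) or False (disabled)."""
    features: dict[str, bool] = {}
    for token in text.split():
        if token.startswith("NO_"):
            features[token[3:]] = False  # "NO_HRTICK_DL" means HRTICK_DL is off
        else:
            features[token] = True
    return features


# Illustrative sample only, not output captured from a real system.
sample = "PLACE_LAG NO_HRTICK NO_HRTICK_DL WAKEUP_PREEMPTION\n"
hrtick_dl_enabled = parse_sched_features(sample).get("HRTICK_DL", False)
```

On a real system the text would come from reading the features file as root instead of the sample string.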

> 			Thanks,
> 				Luca

Thank you very much!

Cheers

Marcel

> On Sat, 03 May 2025 13:14:53 +0200
> Marcel Ziswiler <marcel.ziswiler@...ethink.co.uk> wrote:
> [...]
> > We currently use three cores as follows:
> > 
> > #### core x
> > 
> > | sched_deadline = sched_period | sched_runtime | CP max run time 90% of sched_runtime | utilisation | reclaim |
> > | -- | -- | -- | -- | -- |
> > |  5 ms  | 0.15 ms | 0.135 ms |  3.00% | no |
> > | 10 ms  | 1.8 ms  | 1.62 ms  | 18.00% | no |
> > | 10 ms  | 2.1 ms  | 1.89 ms  | 21.00% | no |
> > | 14 ms  | 2.3 ms  | 2.07 ms  | 16.43% | no |
> > | 50 ms  | 8.0 ms  | 7.20 ms  | 16.00% | no |
> > | 10 ms  | 0.5 ms  | **1      |  5.00% | no |
> > 
> > Total utilisation of core x is 79.43% (less than 100%)
> > 
> > **1 - this shall be a rogue process. This process will
> >  a) run for the maximum allowed workload value
> >  b) not collect execution data
> > 
> > This last rogue process is the one which causes massive issues to the
> > rest of the scheduling if we set it to do reclaim.
> > 
> > #### core y
> > 
> > | sched_deadline = sched_period | sched_runtime | CP max run time 90% of sched_runtime | utilisation | reclaim |
> > | -- | -- | -- | -- | -- |
> > |  5 ms  | 0.5 ms | 0.45 ms | 10.00% | no |
> > | 10 ms  | 1.9 ms | 1.71 ms | 19.00% | no |
> > | 12 ms  | 1.8 ms | 1.62 ms | 15.00% | no |
> > | 50 ms  | 5.5 ms | 4.95 ms | 11.00% | no |
> > | 50 ms  | 9.0 ms | 8.10 ms | 18.00% | no |
> > 
> > Total utilisation of core y is 73.00% (less than 100%)
> > 
> > #### core z
> > 
> > The third core is special as it will run 50 jobs with the same
> > configuration as such:
> > 
> > | sched_deadline = sched_period | sched_runtime | CP max run time 90% of sched_runtime | utilisation |
> > | -- | -- | -- | -- |
> > |  50 ms  | 0.8 ms | 0.72 ms | 1.60% |
> > 
> > jobs 1-50 should run with reclaim OFF
> > 
> > Total utilisation of core z is 1.6% * 50 = 80.00% (less than 100%)
> > 
> > Please let me know if you need any further details which may help
> > figuring out what exactly is going on.
> > 
> > > Adding Luca in Cc so he can also take a look.
> > > 
> > > Thanks,  
> > 
> > Thank you!
> > 
> > > Juri  
> > 
> > Cheers
> > 
> > Marcel
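
The per-core totals quoted above can be re-derived from the period and runtime columns. The following is
a small sketch that assumes each task's utilisation is simply sched_runtime / sched_period (with
sched_deadline equal to sched_period, as stated in the table headers). The variable and function names
are chosen for illustration only.

```python
from fractions import Fraction

# (period in ms, runtime in ms) pairs copied from the quoted tables.
CORE_X = [(5, "0.15"), (10, "1.8"), (10, "2.1"), (14, "2.3"), (50, "8.0"), (10, "0.5")]
CORE_Y = [(5, "0.5"), (10, "1.9"), (12, "1.8"), (50, "5.5"), (50, "9.0")]
CORE_Z = [(50, "0.8")] * 50  # 50 identical jobs


def utilisation(tasks):
    """Sum runtime / period over all tasks, using exact rational arithmetic."""
    return sum(Fraction(runtime) / period for period, runtime in tasks)


for name, tasks in (("x", CORE_X), ("y", CORE_Y), ("z", CORE_Z)):
    print(f"core {name}: {float(utilisation(tasks)) * 100:.2f}%")
```

With these inputs the three sums stay below 100% on every core, matching the totals given in the quoted message.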
