linux-kernel - Re: schbench v1.0


Message-ID: <20230420185606.GA1148774@hirez.programming.kicks-ass.net>
Date:   Thu, 20 Apr 2023 20:56:06 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Chris Mason <clm@...a.com>
Cc:     David Vernet <void@...ifault.com>, linux-kernel@...r.kernel.org,
        kernel-team@...com, Ingo Molnar <mingo@...nel.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        gautham.shenoy@....com
Subject: Re: schbench v1.0

On Thu, Apr 20, 2023 at 05:05:37PM +0200, Peter Zijlstra wrote:

> EEVDF base_slice = 3000[us] (default)
> 
> schbench -m2 -F128 -n10	-r90	OTHER	BATCH
> Wakeup  (usec): 99.0th:		3820	6968
> Request (usec): 99.0th:		30496	24608
> RPS    (count): 50.0th:		3836	5496
> 
> EEVDF base_slice = 6440[us] (per the calibrate run)
> 
> schbench -m2 -F128 -n10	-r90	OTHER	BATCH
> Wakeup  (usec): 99.0th:		9136	6232
> Request (usec): 99.0th:		21984	12944
> RPS    (count): 50.0th:		4968	6184
> 
> 
> With base_slice >= request and BATCH (disables wakeup preemption), the
> EEVDF thing should turn into FIFO-queue, which is close to ideal for
> your workload.
> 
> For giggles:
> 
> echo 6440000 > /debug/sched/base_slice_ns
> echo NO_PLACE_LAG > /debug/sched/features
> chrt -b 0 ./schbench -m2 -F128 -n10 -r90

FWIW a similar slice length can be achieved by using latency-nice-5:

  latency-nice-4 gives 3000*1024/526 ~ 5840[us], while
  latency-nice-5 gives 3000*1024/423 ~ 7262[us].
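
The arithmetic above can be reproduced with a small sketch. It assumes the
slice scales as base_slice * 1024 / weight, with the weights 526 and 423 taken
from the two lines above; the names `slice_us` and `NICE_0_WEIGHT` are made up
for illustration and are not kernel identifiers.

```python
# Sketch only: reproduces the slice arithmetic quoted in this message.
# Assumption: slice = base_slice * 1024 / weight, where 1024 is the
# reference weight and 526 / 423 are the weights given in the text.
NICE_0_WEIGHT = 1024


def slice_us(base_slice_us, weight):
    """Return the scaled slice in microseconds for a given weight."""
    return base_slice_us * NICE_0_WEIGHT / weight


if __name__ == "__main__":
    for label, weight in (("latency-nice-4", 526), ("latency-nice-5", 423)):
        print(f"{label}: ~{slice_us(3000, weight):.0f} us")
```

With the default base_slice of 3000[us], the calibrated 6440[us] falls between
the two results, which is why neither value matches it exactly.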

Which of course raises the question of whether, instead of latency-nice, we
should expose sched_attr::slice (with some suitable bounds).

The immediate problem of course being that while latency-nice is nice
(harhar, teh pun) and vague, sched_attr::slice is fairly well defined.
OTOH as per this example, it might be easier for software to request a
specific slice length (based on prior runs etc.) than it is to guess at
a nice value.
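
As a rough illustration of "based on prior runs", userspace could derive the
slice it asks for from measured request times and clamp it to whatever bounds
end up being allowed. This is only a sketch: `pick_slice_us` and the bounds
`lo_us`/`hi_us` are hypothetical, and nothing here talks to the kernel.

```python
# Sketch only: choose a slice that covers the longest observed request
# (the "base_slice >= request" condition), clamped to hypothetical bounds.
def pick_slice_us(request_times_us, lo_us=100, hi_us=100_000):
    """Return a slice length in microseconds, clamped to [lo_us, hi_us]."""
    if not request_times_us:
        raise ValueError("need at least one measured request time")
    longest = max(request_times_us)
    return max(lo_us, min(hi_us, longest))


if __name__ == "__main__":
    # Made-up sample measurements from an earlier run, in microseconds.
    print(pick_slice_us([5900, 6100, 6440]))
```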

The direct correlation between smaller slice and latency might not be
immediately obvious either, nor might it be a given for any given
scheduling policy.

Also, cgroups :/