Message-ID: <bd95a0f0-5589-2d9e-8fb0-a66322e556e4@scylladb.com>
Date:   Wed, 30 Mar 2022 14:01:21 +0300
From:   Avi Kivity <avi@...lladb.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Asias He <asias@...lladb.com>, linux-kernel@...r.kernel.org
Subject: sched_min_granuality_ns exile into debugfs

Hi Peter,


In 8a99b683 ("sched: Move SCHED_DEBUG sysctl to debugfs"), you moved
sched_min_granularity_ns to debugfs, citing that it is debug-only (true)
and undocumented (it is documented in sched-design-CFS.rst, under
the old name).


This breaks my application, Scylla[1]. We use sched_min_granularity_ns
to reduce the chances that a high networking backlog will starve the
application thread. Scylla is a thread-per-core design, so the scheduler
cannot find another core for the application: they are all busy (and
besides, the application threads are pinned).


In addition to sched_min_granularity_ns, we also tune a few other
sysctls:


# Prevent auto-scaling from doing anything to our tunables
kernel.sched_tunable_scaling = 0

# Preempt sooner
kernel.sched_min_granularity_ns = 500000

# Don't delay unrelated workloads
kernel.sched_wakeup_granularity_ns = 450000

# Schedule all tasks in this period
kernel.sched_latency_ns = 1000000

# autogroup seems to prevent sched_latency_ns from being respected
kernel.sched_autogroup_enabled = 0

# Disable numa balancing
kernel.numa_balancing = 0
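
To illustrate how the latency and minimum-granularity values above interact, here is a rough sketch of the CFS scheduling-period calculation. It is modelled on my reading of __sched_period() in kernel/sched/fair.c and is a simplification, not the kernel's code; the function name cfs_period_ns is made up for this example, and the defaults are simply the values from the list above.

```python
def cfs_period_ns(nr_running, latency_ns=1_000_000, min_granularity_ns=500_000):
    """Approximate length of one CFS scheduling period, in nanoseconds.

    Assumption: with up to ceil(latency / min_granularity) runnable tasks the
    period equals sched_latency_ns; with more tasks it is stretched so that
    each task still gets at least sched_min_granularity_ns.
    """
    # Number of tasks that fit in one latency period (rounded up).
    nr_latency = -(-latency_ns // min_granularity_ns)
    if nr_running > nr_latency:
        return nr_running * min_granularity_ns
    return latency_ns


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, cfs_period_ns(n))
```

With these settings, an application thread competing with one other runnable task on its core would be scheduled within a 1 ms period; a lower minimum granularity keeps that period from stretching as quickly when more tasks become runnable.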


While we can adapt to the move, I would much prefer it if the old location
were restored. I think it even makes sense to make this a non-debug tunable;
it helps the application to be more responsive without using the realtime
class, which is its own can of worms (and will likely result in reduced
throughput).
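
For reference, adapting to the move could look roughly like the sketch below: try the old sysctl location first, then fall back to debugfs. The debugfs file names and the /sys/kernel/debug/sched directory are assumptions about where the commit placed these tunables (and debugfs must be mounted and writable); tunable_path and set_tunable are hypothetical helpers, not part of Scylla or the kernel.

```python
from pathlib import Path

# Assumed mapping from the old sysctl names to the debugfs file names.
DEBUGFS_NAMES = {
    "sched_min_granularity_ns": "min_granularity_ns",
    "sched_wakeup_granularity_ns": "wakeup_granularity_ns",
    "sched_latency_ns": "latency_ns",
    "sched_tunable_scaling": "tunable_scaling",
}


def tunable_path(name, proc_root="/proc/sys/kernel",
                 debugfs_root="/sys/kernel/debug/sched"):
    """Return the file that holds the tunable, or None if neither exists."""
    old = Path(proc_root) / name
    if old.exists():
        return old  # older kernels: still a sysctl
    new = Path(debugfs_root) / DEBUGFS_NAMES[name]
    if new.exists():
        return new  # newer kernels: moved to debugfs
    return None


def set_tunable(name, value, **roots):
    """Write the value to whichever location exists; return True on success."""
    path = tunable_path(name, **roots)
    if path is None:
        return False
    path.write_text(f"{value}\n")
    return True


if __name__ == "__main__":
    # Needs root; prints False if the tunable is found in neither location.
    print(set_tunable("sched_min_granularity_ns", 500000))
```

This works, but it means every tool that sets these values has to know about both locations, which is part of why I would prefer the sysctl to come back.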


[1] https://github.com/scylladb/scylla
