Date:	Sat, 5 Sep 2009 22:40:37 +0200
From:	Fabio Checconi <fchecconi@...il.com>
To:	Anirban Sinha <ASinha@...gmasystems.com>
Cc:	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: question on sched-rt group allocation cap: sched_rt_runtime_us

> From: Anirban Sinha <ASinha@...gmasystems.com>
> Date: Fri, Sep 04, 2009 05:55:15PM -0700
>
> Hi Ingo and rest:
> 
> I have been playing around with the sched_rt_runtime_us cap that can be
> used to limit the amount of CPU time allocated towards scheduling rt
> group threads. I am using 2.6.26 with CONFIG_GROUP_SCHED disabled (we
> use only the root user in our embedded setup). I have no other CPU
> intensive workloads (RT or otherwise) running on my system. I have
> changed no other scheduling parameters from /proc. 
> 
> I have written a small test program that:
> 
> (a) forks two threads, one SCHED_FIFO and one SCHED_OTHER (this thread
> is reniced to -20) and ties both of them to a specific core.
> (b) runs both the threads in a tight loop (same number of iterations for
> both threads) until the SCHED_FIFO thread terminates.
> (c) calculates the number of completed iterations of the regular
> SCHED_OTHER thread against the fixed number of iterations of the
> SCHED_FIFO thread. It then calculates a percentage based on that.
> 
> I am running the above workload against varying sched_rt_runtime_us
> values (200 ms to 700 ms) keeping the sched_rt_period_us constant at
> 1000 ms. I have also experimented a little bit by decreasing the value
> of sched_rt_period_us (thus increasing the sched granularity) with no
> apparent change in behavior. 
> 
> My observations are listed in tabular form: 
> 
> Ratio                     Completed iterations of regular thread /
> sched_rt_runtime_us /     iterations of RT thread (in %)
> sched_rt_period_us
>
> 0.2                       100 % (regular thread completed all its iterations)
> 0.3                       73 %
> 0.4                       45 %
> 0.5                       17 %
> 0.6                       0 % (SCHED_OTHER thread completely throttled; never ran)
> 0.7                       0 %
> 
> This result kind of baffles me. Even when we cap the RT group to a
> fraction of 0.6 of overall CPU time, the remaining 0.4 \should\ still be
> available for running regular threads. So my SCHED_OTHER thread \should\
> make some progress as opposed to being completely throttled. Similarly,
> with any fraction less than 0.5, the SCHED_OTHER thread should complete
> before the SCHED_FIFO one.
> 
> I do not have an easy way to verify my results on the latest kernel
> (2.6.31). Were there any regressions in the scheduling subsystem in
> 2.6.26? Can this behavior be explained? Do we need to tweak any other
> /proc parameters?
> 

You say you pin the threads to a single core: how many cores does your
system have?
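
For reference, a minimal sketch of how the two inputs to that question
could be collected on the box. The helper names (rt_bandwidth_fraction,
read_rt_sysctls, report) are made up for illustration; the two /proc
files are the sysctls discussed in this thread, and a negative runtime
is treated as "throttling disabled".

```python
import os


def rt_bandwidth_fraction(runtime_us, period_us):
    """Per-CPU share of each period that RT tasks are allowed to use."""
    if period_us <= 0:
        raise ValueError("period must be positive")
    if runtime_us < 0:
        # sched_rt_runtime_us = -1 turns RT throttling off entirely.
        return 1.0
    return min(runtime_us / period_us, 1.0)


def read_rt_sysctls(base="/proc/sys/kernel"):
    """Return (runtime_us, period_us) as read from the sched_rt_* files."""
    values = []
    for name in ("sched_rt_runtime_us", "sched_rt_period_us"):
        with open(os.path.join(base, name)) as f:
            values.append(int(f.read().strip()))
    return tuple(values)


def report():
    """Print the logical CPU count and the configured per-CPU RT share."""
    runtime_us, period_us = read_rt_sysctls()
    print("logical CPUs:", os.cpu_count())
    print("per-CPU RT share:", rt_bandwidth_fraction(runtime_us, period_us))
```

Calling report() on the target system prints both numbers; note that the
share it prints is the configured per-CPU value, before any runtime
migration between cores.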

I don't know whether 2.6.26 had a bug here (from a quick look, the relevant
code seems similar to what we have now), but results like these can be
a consequence of the runtime migration logic moving rt bandwidth from a
second core to the one executing the two tasks.

If this is the case, the behavior is expected: the scheduler tries to
reduce the number of migrations by concentrating the bandwidth of rt
tasks on a single core.  With your workload this works poorly, because
runtime migration has freed the other core(s) from rt bandwidth, making
them available to SCHED_OTHER tasks, but your SCHED_OTHER thread is
pinned and cannot make use of them.
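
To make the effect concrete, here is a deliberately simplified model,
not kernel code. It assumes that the rt runtime of every other core is
unused and ends up migrated entirely to the pinned core (capped at one
full period), that both threads do the same work per iteration, and that
the SCHED_OTHER thread gets all the time the rt thread does not use.
The function names are invented for this sketch.

```python
def pinned_core_rt_share(runtime_us, period_us, n_cpus):
    """Upper bound on the rt share of the core running the rt task.

    ASSUMPTION: all other cores have idle rt runtime and all of it is
    eventually borrowed by this core, limited to one full period.
    """
    return min(n_cpus * runtime_us, period_us) / period_us


def other_progress(rt_share):
    """Percent of its iterations the pinned SCHED_OTHER thread completes
    by the time the rt thread (same iteration count) has finished."""
    if rt_share <= 0:
        return 100.0
    # The rt thread needs 1/rt_share time units per unit of work; the
    # other thread runs only in the remaining (1 - rt_share) of that time.
    return min(100.0, 100.0 * (1.0 - rt_share) / rt_share)


def table(period_us=1000000, n_cpus=2):
    """Model output for the runtime values used in the original test."""
    rows = []
    for runtime_us in range(200000, 800000, 100000):
        share = pinned_core_rt_share(runtime_us, period_us, n_cpus)
        rows.append((runtime_us / period_us, share, other_progress(share)))
    return rows
```

Under these assumptions, with two cores any ratio of 0.5 or more lets
the pinned core accumulate a full period of rt runtime, which leaves
nothing for the pinned SCHED_OTHER thread. The model is only meant to
show the direction of the effect; it is not expected to reproduce the
measured percentages exactly.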