Date:   Tue, 14 Jan 2020 17:33:50 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Steven Rostedt' <rostedt@...dmis.org>
CC:     'Vincent Guittot' <vincent.guittot@...aro.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: RE: sched/fair: scheduler not running high priority process on idle
 cpu

From: Steven Rostedt
> Sent: 14 January 2020 16:59
> 
> On Tue, 14 Jan 2020 16:50:43 +0000
> David Laight <David.Laight@...LAB.COM> wrote:
> 
> > I've a test that uses four RT priority processes to process audio data every 10ms.
> > One process wakes up the other three, they all 'beaver away' clearing a queue of
> > jobs and the last one to finish sleeps until the next tick.
> > Usually this takes about 0.5ms, but sometimes takes over 3ms.
> >
> > AFAICT the processes are normally woken on the same cpu they last ran on.
> > There seems to be a problem when the selected cpu is running a (low priority)
> > process that is looping in kernel [1].
> > I'd expect my process to be picked up by one of the idle cpus, but this
> > doesn't happen.
> > Instead the process sits in state 'waiting' until the active process sleeps
> > (or calls cond_resched()).
> >
> > Is this really the expected behaviour?????
> 
> It is with CONFIG_PREEMPT_VOLUNTARY. I think you want to recompile your
> kernel with CONFIG_PREEMPT. The idea is that the RT task will continue
> to run on the CPU it last ran on, and would push off the lower priority
> task to the idle CPU. But CONFIG_PREEMPT_VOLUNTARY means that this
> will have to wait for the running task to not be in kernel context or
> hit a cond_resched() which is the "voluntary" scheduling point.
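
To illustrate the distinction drawn above, here is a minimal sketch that
reports which preemption model a kernel was built with, given the text of
its build config (for example the contents of /boot/config-<version>, where
the distribution ships one). The function and enum names are hypothetical.
It only looks at the three options named here and does not account for
later kernels that can select the model at boot time.

```c
#include <stddef.h>
#include <string.h>

enum preempt_model {
    PREEMPT_MODEL_UNKNOWN,
    PREEMPT_MODEL_NONE,
    PREEMPT_MODEL_VOLUNTARY,
    PREEMPT_MODEL_FULL
};

/* Return nonzero if some line of `text` is exactly "<opt>=y". */
static int config_has(const char *text, const char *opt)
{
    size_t optlen = strlen(opt);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, opt, optlen) == 0 &&
            strncmp(line + optlen, "=y", 2) == 0 &&
            (line[optlen + 2] == '\n' || line[optlen + 2] == '\0'))
            return 1;
        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return 0;
}

/* Classify the preemption model from kernel build-config text. */
enum preempt_model preempt_model_from_config(const char *text)
{
    /* Whole-line matching keeps CONFIG_PREEMPT from matching
     * CONFIG_PREEMPT_VOLUNTARY or CONFIG_PREEMPT_COUNT. */
    if (config_has(text, "CONFIG_PREEMPT"))
        return PREEMPT_MODEL_FULL;
    if (config_has(text, "CONFIG_PREEMPT_VOLUNTARY"))
        return PREEMPT_MODEL_VOLUNTARY;
    if (config_has(text, "CONFIG_PREEMPT_NONE"))
        return PREEMPT_MODEL_NONE;
    return PREEMPT_MODEL_UNKNOWN;
}
```

A result of PREEMPT_MODEL_VOLUNTARY corresponds to the case described
above, where the lower priority task is only displaced once it leaves
kernel context or reaches a cond_resched().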

I have added a cond_resched() to the offending loop, but a closer look suggests
that the code is also called with a lock held in another (less common) path, so
that change can't be committed as it stands and, for the same reason,
CONFIG_PREEMPT won't help.

Indeed requiring CONFIG_PREEMPT doesn't help when customers are running
the application, nor (probably) on AWS since I doubt it is ever the default.

Does the same apply to non-RT tasks?
I can select almost any priority, but RT ones are otherwise a lot better.

I've also seen RT processes delayed by the network stack 'bh' that runs
in a softint from the hardware interrupt.
That can take a while (clearing up tx and refilling rx) and I don't think we
have any control over the cpu it runs on?
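
One knob that may be relevant, assuming the softint runs on the CPU that
took the hardware interrupt (as described above): the CPUs allowed to
service an interrupt can be restricted by writing a hex bitmask to
/proc/irq/<N>/smp_affinity. The sketch below only builds that mask string.
The function name is hypothetical, it handles CPUs 0..31 only, and the IRQ
numbers for the NIC queues would still have to be looked up in
/proc/interrupts.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a hex CPU bitmask of the kind accepted by
 * /proc/irq/<N>/smp_affinity.
 * Assumption: only CPUs 0..31 are supported, to keep the sketch short.
 * Returns 0 on success, -1 on an empty list, a CPU number out of
 * range, or a buffer that is too small.
 */
int format_irq_affinity_mask(const int *cpus, size_t ncpus,
                             char *buf, size_t buflen)
{
    unsigned int mask = 0;
    size_t i;
    int n;

    if (ncpus == 0)
        return -1;

    for (i = 0; i < ncpus; i++) {
        if (cpus[i] < 0 || cpus[i] > 31)
            return -1;
        mask |= 1U << cpus[i];   /* one bit per allowed CPU */
    }

    n = snprintf(buf, buflen, "%x", mask);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}
```

For CPUs 2 and 3 this yields the string "c". Writing it to the
smp_affinity file needs root, and a running irqbalance daemon may later
change the setting.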

The cost of ftrace function call entry/exit (about 200 clocks) makes it
rather unsuitable for any performance measurements unless only
a very few functions are traced - which rather requires you know
what the code is doing :-(
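
As a rough worked example of why the tracing cost matters, the sketch
below turns the "about 200 clocks" figure into a share of a time budget.
The function names, the CPU clock rate and the number of traced calls are
illustrative assumptions, not measurements from this system.

```c
#include <stdint.h>

/* Per-call tracing cost quoted in the text (approximate). */
#define FTRACE_CLOCKS_PER_CALL 200

/* Total clocks spent in tracing for a given number of traced calls. */
uint64_t ftrace_overhead_clocks(uint64_t traced_calls)
{
    return traced_calls * FTRACE_CLOCKS_PER_CALL;
}

/*
 * Tracing overhead as a percentage of a time budget.
 * budget_us: length of the measured interval in microseconds.
 * cpu_mhz:   assumed CPU clock rate in MHz (clocks per microsecond).
 */
double ftrace_overhead_percent(uint64_t traced_calls,
                               double budget_us, double cpu_mhz)
{
    double budget_clocks = budget_us * cpu_mhz;

    return 100.0 * (double)ftrace_overhead_clocks(traced_calls)
           / budget_clocks;
}
```

With an assumed 2000 MHz clock, the usual 0.5ms run is 1,000,000 clocks.
Tracing 1000 function calls in that window would cost 200,000 clocks,
i.e. 20% of the interval being measured.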

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
