Message-ID: <212fabd759b0486aa8df588477acf6d0@AcuMS.aculab.com>
Date:   Tue, 14 Jan 2020 16:50:43 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Vincent Guittot' <vincent.guittot@...aro.org>,
        Peter Zijlstra <peterz@...radead.org>
CC:     Viresh Kumar <viresh.kumar@...aro.org>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: sched/fair: scheduler not running high priority process on idle cpu

I've a test that uses four RT priority processes to process audio data every 10ms.
One process wakes up the other three, they all 'beaver away' clearing a queue of
jobs and the last one to finish sleeps until the next tick.
Usually this takes about 0.5ms, but sometimes takes over 3ms.

AFAICT the processes are normally woken on the same cpu they last ran on.
There seems to be a problem when the selected cpu is running a (low priority)
process that is looping in kernel [1].
I'd expect my process to be picked up by one of the idle cpus, but this
doesn't happen.
Instead the process sits in state 'waiting' until the active process sleeps
(or calls cond_resched()).

Is this really the expected behaviour?

This is a 5.4.0-rc7 kernel.
I could try the current 5.5-rc one if any recent changes might affect things.

Additionally (probably because cv_wait() is implemented with 'ticket locks')
none of the other processes waiting for the cv are woken either.

[1] Xorg seems to periodically ask a kernel workqueue to run
drm_clflush_sg() to flush the display buffer cache.
For a 2560x1440 display this is 3600 4k pages and the flush loop
takes ~3.3ms.
However there are probably other places where a process can run in
kernel for significant lengths of time.
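
As a rough sanity check on the figures in [1], the arithmetic can be
sketched as below. The 4 bytes per pixel and the 2560x1440 resolution are
assumptions (they are what give exactly 3600 4k pages), and the worst-case
figure is only a simple model in which the wakeup lands at the very start
of a non-preemptible flush loop.

```python
# Back-of-envelope check of the flush figures quoted in footnote [1].
PAGE_SIZE = 4096            # bytes per page ("4k pages")
BYTES_PER_PIXEL = 4         # assumption: 32-bit framebuffer
WIDTH, HEIGHT = 2560, 1440  # assumption: resolution that yields 3600 pages
FLUSH_MS = 3.3              # measured flush loop time from the text
NORMAL_MS = 0.5             # usual processing time per 10ms tick


def framebuffer_pages(width, height,
                      bytes_per_pixel=BYTES_PER_PIXEL,
                      page_size=PAGE_SIZE):
    """Number of pages needed to hold one frame, rounded up."""
    total_bytes = width * height * bytes_per_pixel
    return -(-total_bytes // page_size)  # ceiling division


pages = framebuffer_pages(WIDTH, HEIGHT)

# Average cost of flushing one page, in microseconds.
per_page_us = FLUSH_MS * 1000 / pages

# Simple model: the woken process waits for the whole flush loop,
# then does its usual amount of work.
worst_case_ms = FLUSH_MS + NORMAL_MS

print(f"pages per frame:   {pages}")
print(f"flush per page:    {per_page_us:.2f} us")
print(f"worst case per tick: {worst_case_ms:.1f} ms")
```

Under these assumptions the flush costs a little under 1us per page, and a
wakeup that has to wait for the whole loop is consistent with the 'over 3ms'
seen above.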

	David

