Message-ID: <9451c48fd66b4df0a5ede5391c4e64ef@AcuMS.aculab.com>
Date:   Mon, 13 Jan 2020 17:39:35 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     David Laight <David.Laight@...LAB.COM>,
        "'maarten.lankhorst@...ux.intel.com'" 
        <maarten.lankhorst@...ux.intel.com>,
        "'mripard@...nel.org'" <mripard@...nel.org>,
        "'sean@...rly.run'" <sean@...rly.run>,
        "'airlied@...ux.ie'" <airlied@...ux.ie>,
        "'daniel@...ll.ch'" <daniel@...ll.ch>,
        "'dri-devel@...ts.freedesktop.org'" <dri-devel@...ts.freedesktop.org>,
        "'linux-kernel@...r.kernel.org'" <linux-kernel@...r.kernel.org>
Subject: RE: drm_cflush_sg() loops for over 3ms - scheduler not running tasks.

From: David Laight
> Sent: 13 January 2020 14:35
> 
> I've been looking at why some RT processes don't get scheduled promptly.
> In my test the RT process's affinity ties it to a single cpu (this may not be such
> a good idea as it seems).
> 
> What I've found is that the Intel i915 graphics driver uses the 'events_unbound'
> kernel worker thread to periodically execute drm_cflush_sg().
> (see https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/drm_cache.c)
...
> This loop takes about 1us per iteration split fairly evenly between whatever is in
> for_each_sg_page() and drm_cflush_page().
> With a 2560x1440 display the loop count is 3600 (4 bytes/pixel) and the whole
> function takes around 3.3ms.

Actually not setting the cpu affinity makes no difference.
The process is woken up on the cpu it last ran on and sits 'waiting' until
drm_cflush_sg() finishes - even though the other cpu becomes idle.
No sign of sched_migrate_task event 'stealing' the process.

Even worse, because 'ticket locks' are used, no other user process can
acquire the same (user) mutex or be woken from cv_wait() until the
process actually runs.

This is a 5.4.0-rc7 kernel.
I think I saw some recent scheduler patches; I can try them, at least until
I can no longer build with gcc 4.7.3 :-)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
