lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 12 Nov 2015 15:52:02 -0700
From:	Jens Axboe <axboe@...nel.dk>
To:	Chris Wilson <chris@...is-wilson.co.uk>,
	Daniel Vetter <daniel.vetter@...ll.ch>,
	DRI Development <dri-devel@...ts.freedesktop.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: __i915_spin_request() sucks

On 11/12/2015 03:19 PM, Chris Wilson wrote:
>>> So today, I figured I'd try just killing that spin. If it fails, we'll
>>> punt to normal completions, so easy change. And wow, MASSIVE difference.
>>> I can now scroll in chrome and not rage! It's like the laptop is 10x
>>> faster now.
>>>
>>> Ran git blame, and found:
>>>
>>> commit 2def4ad99befa25775dd2f714fdd4d92faec6e34
>>> Author: Chris Wilson <chris@...is-wilson.co.uk>
>>> Date:   Tue Apr 7 16:20:41 2015 +0100
>>>
>>>      drm/i915: Optimistically spin for the request completion
>>>
>>> and read the commit message. Doesn't sound that impressive. Especially
>>> not for something that screws up interactive performance by a LOT.
>>>
>>> What's the deal? Revert?
>
> The tests that it improved the most were the latency sensitive tests and
> since my Broadwell xps13 behaves itself, I'd like to understand how it
> culminates in an interactivity loss.
>
> 1. Maybe it is the uninterruptible nature of the polling, making X's
> SIGIO jerky:

This one still feels bad.

> 2. Or maybe it is increased mutex contention:

And so does this one... I had to manually apply hunks 2-3, and after 
doing seat-of-the-pants testing for both variants, I confirmed with perf 
that we're still seeing a ton of time in __i915_wait_request() for both 
of them.

> Or maybe it is an indirect effect, such as power balancing between the
> CPU and GPU, or just thermal throttling, or it may be the task being
> penalised for consuming its timeslice (for which any completion polling
> seems susceptible).

Look, polls in the 1-10ms range are just insane. Either you botched the 
commit message and really meant "~1ms at most" and in which case I'd 
suspect you of smoking something good, or you hacked it up wrong and 
used jiffies when you really wanted to be using some other time check 
that really did give you 1us.

I'll take an IRQ over 10 msecs of busy looping on my laptop, thanks.

>> "Limit the spinning to a single jiffie (~1us) at most"
>>
>> is totally wrong. I have HZ=100 on my laptop. That's 10ms. 10ms!
>> Even if I had HZ=1000, that'd still be 1ms of spinning. That's
>> seriously screwed up, guys.
>
> That's over and above the termination condition for blk_poll().

?! And this is related, how? Comparing apples and oranges. One is a test 
opt-in feature for experimentation, the other is unconditionally enabled 
for everyone. I believe the commit even says so. See the difference? 
Would I use busy loop spinning waiting for rotating storage completions, 
which are in the 1-10ms range? No, with the reason being that the 
potential wins for spins are in the usec range.

-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ