Date:	Thu, 23 Feb 2012 09:12:19 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	stephan.baerwolf@...ilmenau.de
Cc:	linux-kernel@...r.kernel.org, Michael Buesch <mb@...sch.de>,
	Chris Wilson <chris@...is-wilson.co.uk>,
	Alex Deucher <alexdeucher@...il.com>,
	Dave Airlie <airlied@...hat.com>
Subject: Re: responsiveness: newer kernels causing lagging and blocking

On Thu, Feb 23, 2012 at 8:30 AM, Stephan Bärwolf
<stephan.baerwolf@...ilmenau.de> wrote:
> Under various conditions, Linux since 2.6.39-rc1 lags and blocks the whole system enormously.
> (For example, while starting "winecfg" (on a ThinkPad X220) and moving the mouse cursor at
> the same time, you can observe periodic blocking for a few seconds.)
>
> After bisecting a little while, commit "4819d2e4310796c4e9eef674499af9b9caf36b5a"
> (" drm: Retry i2c transfer of EDID block after failure ") seems to be responsible.
>
> Because the function "drm_do_probe_ddc_edid" loops trying "i2c_transfer", it consumes a lot
> of time during errors. Reverting the commit or changing "retries" from 5 to 1 reduces the
> problem to "not perceptible".
> It seems the locking within "i2c_transfer" slows everything down.
> So maybe it is possible to yield() before calling it?

Ugh. The whole i2c thing is a mess. Most of the i2c drivers seem to
busy-loop using 'usleep()' too, so not only do they take a long time,
they take a long time while using CPU and being unresponsive in
general. So it's not just locking, I suspect.

I'm not surprised that it gets to the point that you can notice the
unresponsiveness. At some point I made a bug-report about the i915
driver using 7% of CPU (yes, really) just doing i2c all the time on an
Apple Mac Mini, because it just couldn't get happy with the results,
and the i915 driver would re-start it every five seconds or something
like that.

Your case sounds even worse - at least on that Mac Mini it didn't
cause all that noticeable hiccups (possibly because it was
multi-core).

And looking at the code - not only does drm_do_probe_ddc_edid() have a
retry loop, the *callers* sometimes call that thing from two loops
deep. There's the block count, and for some reason there's that "i =
0..3" retry loop around it in drm_do_get_edid() that seems to be
*another* retry loop.

So if I read it right, drm_do_get_edid() actually retries *20* times
for the base block, and then potentially does that for each block.
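
To make that arithmetic concrete, here is a minimal userspace sketch, not the actual DRM code. The names fake_i2c_transfer(), probe_block() and worst_case_transfers() are hypothetical stand-ins, and the transfer is assumed to fail every time, which is the worst case being described: 5 inner retries under an outer "i = 0..3" loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for i2c_transfer(): always fails, counts calls. */
static int transfer_calls;

static bool fake_i2c_transfer(void)
{
    transfer_calls++;
    return false;
}

/* Models the inner retry loop: up to `retries` attempts per probe. */
static bool probe_block(int retries)
{
    assert(retries > 0);
    while (retries-- > 0) {
        if (fake_i2c_transfer())
            return true;
    }
    return false;
}

/* Models the outer "i = 0..3" loop wrapped around the probe. */
static int worst_case_transfers(int outer_tries, int inner_retries)
{
    transfer_calls = 0;
    for (int i = 0; i < outer_tries; i++) {
        if (probe_block(inner_retries))
            break;
    }
    return transfer_calls;
}
```

With these assumptions, worst_case_transfers(4, 5) counts 20 failed transfers for a single block, and dropping the inner retries to 1 brings that down to 4.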

I wonder if that double retry is really intentional at all.

But yeah, if nothing else, let's *please* add a

    if (need_resched()) schedule();

at the top of drm_do_probe_ddc_edid().
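
As a rough userspace analogue only (need_resched() and schedule() exist only inside the kernel), the idea is to give the scheduler a chance to run something else before starting another round of retries. The sketch below assumes POSIX sched_yield() as the stand-in; fake_transfer() and probe_with_yield() are hypothetical names, and the transfer is again assumed to always fail.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for a transfer that always fails. */
static bool fake_transfer(void)
{
    return false;
}

/* Yields once at the top, then runs the retry loop.
 * Returns the number of transfer attempts made. */
static int probe_with_yield(int retries)
{
    int attempts = 0;

    assert(retries >= 0);

    /* Userspace stand-in for "if (need_resched()) schedule();" */
    sched_yield();

    while (retries-- > 0) {
        attempts++;
        if (fake_transfer())
            break;
    }
    return attempts;
}
```

This does not make the retries any cheaper; it only lets other runnable work get in before the loop starts.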

I'm assuming you don't have PREEMPT enabled? Does that hide the problem?

                             Linus