Message-ID: <AANLkTi=s9ZMdUe2bsK2OA5rJBej=R8CaJyE+kT3Fd5wG@mail.gmail.com>
Date:	Wed, 19 Jan 2011 22:22:48 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Jeff Chua <jeff.chua.linux@...il.com>
Cc:	Len Brown <len.brown@...el.com>, "Rafael J. Wysocki" <rjw@...k.pl>,
	Chris Wilson <chris@...is-wilson.co.uk>,
	Jesse Barnes <jbarnes@...tuousgeek.org>,
	Dave Airlie <airlied@...ux.ie>, linux-kernel@...r.kernel.org,
	DRI mailing list <dri-devel@...ts.freedesktop.org>
Subject: Re: more intel drm issues (was Re: [git pull] drm intel only fixes)

On Wed, Jan 19, 2011 at 8:55 PM, Jeff Chua <jeff.chua.linux@...il.com> wrote:
>
> Rafael sent out two patches earlier. Could be related. I was facing
> an issue during resume.

No, I'm aware of the rcu-synchronize thing, this isn't it. This is
really at the suspend stage, and I had bisected it down to the drm
changes.

In fact, by now I have bisected it down to a single commit. It's
another merge commit, which makes me a bit nervous (I bisected another
issue today, and it turned out to simply not be repeatable).

But this time the merge commit actually has a real conflict that got
fixed up in the merge, and the code around the conflict waits for
three seconds, and three seconds is also exactly how long the delay at
suspend time is. So I get the feeling that this time it's a real
issue, and what happened was that the merge may have been a mismerge.

Chris: as of commit 8d5203ca6253 ("Merge branch 'drm-intel-fixes' into
drm-intel-next") I'm getting that 3-second delay at suspend time. And
the merge diff looks like this:

 +	struct drm_device *dev = ring->dev;
 +	struct drm_i915_private *dev_priv = dev->dev_private;
  	unsigned long end;
 -	drm_i915_private_t *dev_priv = dev->dev_private;
  	u32 head;

- 	head = intel_read_status_page(ring, 4);
- 	if (head) {
- 		ring->head = head & HEAD_ADDR;
- 		ring->space = ring->head - (ring->tail + 8);
- 		if (ring->space < 0)
- 			ring->space += ring->size;
- 		if (ring->space >= n)
- 			return 0;
- 	}
-
  	trace_i915_ring_wait_begin (dev);
  	end = jiffies + 3 * HZ;
  	do {

and that whole do-loop with a 3-second timeout makes me *very*
suspicious. It used to have (in _one_ of the parent branches) that
code before it to return early if there was space in the ring, now it
doesn't any more - and that merge coincides with my suspend suddenly
taking 3 seconds.

The same check that is deleted does exist inside the loop too, but
there it has some extra code in it (a comparison to "actual_head" and
so on), so I wonder if the fast-case was somehow hiding this issue.
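
To make the removed fast-case concrete, here is a minimal sketch in plain
C of the pattern the diff shows: compute the free space in the ring with
wrap-around, return right away if it is enough, and only otherwise fall
into the bounded wait. Everything named toy_* is hypothetical and is not
the i915 code. The poll counter stands in for the "jiffies + 3 * HZ"
deadline, the head advance inside the loop is a made-up stand-in for the
GPU consuming commands, and the status-page read and the "actual_head"
handling are left out.

```c
#include <assert.h>

/* Hypothetical stand-in for the driver's ring state; not the real struct. */
struct toy_ring {
	int head;   /* offset the consumer has reached */
	int tail;   /* offset the producer will write next */
	int size;   /* total ring size in bytes */
	int space;  /* cached free space, recomputed below */
};

/* Same arithmetic as the removed hunk: free bytes, with wrap-around. */
static int toy_ring_space(const struct toy_ring *ring)
{
	int space;

	assert(ring->size > 0);
	space = ring->head - (ring->tail + 8);
	if (space < 0)
		space += ring->size;
	return space;
}

/*
 * Returns 0 if n bytes were free on entry (the fast-case), the number of
 * polls it took otherwise, or -1 if max_polls ran out (the "timeout").
 */
static int toy_wait_ring(struct toy_ring *ring, int n, int max_polls)
{
	int polls;

	/* Fast-case: skip the timed loop entirely when space is available. */
	ring->space = toy_ring_space(ring);
	if (ring->space >= n)
		return 0;

	for (polls = 1; polls <= max_polls; polls++) {
		/* Assumption: pretend the consumer advanced head by 64 bytes. */
		ring->head = (ring->head + 64) % ring->size;
		ring->space = toy_ring_space(ring);
		if (ring->space >= n)
			return polls;
	}
	return -1;
}
```

With an empty 4096-byte ring (head == tail == 0) a request for 16 bytes
returns 0 without ever entering the loop. A request that can never be
satisfied runs until max_polls is used up. A wait like that costs the
whole timeout every time it is entered.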

But I don't know the code. I just see that whole "PM: suspend of
devices complete after x.xxx msecs" issue, and I can see the machine
taking too long to suspend.

                     Linus