Date:	Tue, 30 Mar 2010 16:49:16 +1000
From:	Dave Airlie <airlied@...il.com>
To:	Michel Dänzer <michel@...nzer.net>
Cc:	Dave Airlie <airlied@...ux.ie>, torvalds@...ux-foundation.org,
	linux-kernel@...r.kernel.org, dri-devel@...ts.sf.net,
	Jerome Glisse <glisse@...edesktop.org>
Subject: Re: [git pull] drm fixes

2010/3/30 Michel Dänzer <michel@...nzer.net>:
> On Tue, 2010-03-30 at 05:34 +0100, Dave Airlie wrote:
>>
>> Original pull req below + reverts the fallback placement change which had
>> a side effect of causing more lockups on some AGP systems (this is a bug in
>> the AGP drivers that needs to be tracked down), [...]
>
> While I was able to work around the lockups by making the AGP driver
> never unbind a GTT entry, I think it's rather a radeon issue - how is
> the AGP driver supposed to know when it's safe to unbind an entry?

This has been a problem with AGP before. The Intel AGP docs claim you
should always use scratch pages on AGP and never completely remove bound
entries. I've no idea why this is, as you'd expect AGP cards to only
generate cycles to entries they've been asked to use. There may be some
memory controller prefetching going on that could run into an unbound AGP
page and cause the resulting machine check, I suppose.
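
To make the scratch-page idea concrete, here is a minimal user-space
sketch, not the actual AGP driver code. The struct, function names, table
size and valid bit are all hypothetical and only model the bookkeeping:
"unbinding" rewrites the entry to point at a scratch page instead of
clearing it, so a stray or prefetched access never hits an invalid entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GTT_ENTRIES 8      /* toy aperture size (hypothetical) */
#define GTT_VALID   0x1u   /* hypothetical "entry valid" bit */

/* Toy model of a GTT: every slot always holds a valid translation. */
struct toy_gtt {
    uint64_t scratch;              /* address of the scratch page */
    uint64_t entry[GTT_ENTRIES];   /* page address | flag bits */
};

/* Point every entry at the scratch page so no slot is ever invalid. */
void toy_gtt_init(struct toy_gtt *gtt, uint64_t scratch_addr)
{
    assert((scratch_addr & 0xfff) == 0);   /* must be 4 KiB aligned */
    gtt->scratch = scratch_addr;
    for (size_t i = 0; i < GTT_ENTRIES; i++)
        gtt->entry[i] = scratch_addr | GTT_VALID;
}

/* Bind a real page into slot idx; returns 0 on success, -1 on bad input. */
int toy_gtt_bind(struct toy_gtt *gtt, size_t idx, uint64_t page_addr)
{
    if (idx >= GTT_ENTRIES || (page_addr & 0xfff))
        return -1;
    gtt->entry[idx] = page_addr | GTT_VALID;
    return 0;
}

/* "Unbind" by redirecting the slot to the scratch page, never clearing it. */
int toy_gtt_unbind(struct toy_gtt *gtt, size_t idx)
{
    if (idx >= GTT_ENTRIES)
        return -1;
    gtt->entry[idx] = gtt->scratch | GTT_VALID;
    return 0;
}

/* Returns 1 if every slot still carries the valid bit, 0 otherwise. */
int toy_gtt_all_valid(const struct toy_gtt *gtt)
{
    for (size_t i = 0; i < GTT_ENTRIES; i++)
        if (!(gtt->entry[i] & GTT_VALID))
            return 0;
    return 1;
}
```

The point of the design is that a late access to an unbound slot lands on
a harmless dummy page rather than on an invalid translation. It does not
answer the question of when the page behind the old entry is safe to
reuse.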

We need to track this separately anyway and hopefully fix it for 2.6.35;
at least we have a patch that can handle it.

> That change had lots of other issues anyway, thanks for reverting it.
>
>
>> [...] and I've merged Jerome's GPU recovery code, as I'd much rather
>> users had some hope of recovering from their GPU locking up than a
>> dead box. It seems to work for quite a lot of people that have tested
>> it, and it won't make a GPU lockup problem worse.
>
> Unfortunately, that's not true in all cases. The change itself mentions
> that the new reset code is unreliable for R3xx generation GPUs, and
> indeed with my RV350 it now turns my box into a brick immediately on a
> GPU lockup most of the time whereas previously it was usually able to
> recover at least in some cases, e.g. falling back to PCI mode after
> trying to use a non-working AGP transfer mode.
>

Okay, so it makes things worse there. Hopefully Jerome can track it down;
otherwise we can restrict the GPU reset to the r600s only, where it
definitely makes life a lot better for everyone.

Dave.