Message-ID: <20101109103737.GA1767@arch.trippelsdorf.de>
Date:	Tue, 9 Nov 2010 11:37:37 +0100
From:	Markus Trippelsdorf <markus@...ppelsdorf.de>
To:	Michel Dänzer <michel@...nzer.net>
Cc:	Thomas Hellstrom <thellstrom@...are.com>,
	"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Radeon RS780 - BUG: unable to handle kernel NULL pointer
 dereference

On Tue, Nov 09, 2010 at 11:32:57AM +0100, Michel Dänzer wrote:
> On Tue, 2010-11-09 at 11:07 +0100, Thomas Hellstrom wrote:
> > On 11/09/2010 10:53 AM, Thomas Hellstrom wrote:
> > > On 11/09/2010 10:29 AM, Markus Trippelsdorf wrote:
> > >> OK I've found the buggy commit by bisection:
> > >>
> > >> e376573f7267390f4e1bdc552564b6fb913bce76 is the first bad commit
> > >> commit e376573f7267390f4e1bdc552564b6fb913bce76
> > >> Author: Michel Dänzer<daenzer@...are.com>
> > >> Date:   Thu Jul 8 12:43:28 2010 +1000
> > >>
> > >>      drm/radeon: fall back to GTT if bo creation/validation in VRAM fails.
> > >>
> > >>      This fixes a problem where on low VRAM cards we'd run out of space for validation.
> > >>
> > >>      [airlied: Tested on my M7, Thinkpad T42, compiz works with no problems.]
> > >>
> > >>      Signed-off-by: Michel Dänzer<daenzer@...are.com>
> > >>      Cc: stable@...nel.org
> > >>      Signed-off-by: Dave Airlie<airlied@...hat.com>
> > >>
> > >> Please note that this is an old commit from 2.6.36-rc. When I revert it, the
> > >> kernel no longer crashes. Instead I see the following in my dmesg:
> > >>
> > >
> > > Hmm, so this sounds like something in the Radeon eviction error path 
> > > is causing corruption.
> > > I had a similar problem with vmwgfx, when I tried to unref a BO 
> > > _after_ ttm_bo_init() failed.
> > > ttm_bo_init() is really supposed to call unref itself for various 
> > > reasons,  so calling unref() or kfree() after a failed ttm_bo_init() 
> > > will cause corruption.
> > >
> > > In any case, the error below also suggests something is a bit fragile 
> > > in the Radeon driver:
> > >
> > > First, an accelerated eviction may fail, like in the message below, 
> > > but then there must always be a backup plan, like unaccelerated 
> > > eviction to system. On BO creation, there are a number of placement 
> > > strategies, but if all else fails, it should be possible to initially 
> > > place the BO in system memory.
> > >
> > > > Second, if bo validation fails during a command submission, due to 
> > > insufficient VRAM / TT, then the driver should retry the complete 
> > > validation cycle after first blocking all other validators and then 
> > > evicting everything not pinned, to avoid failures due to fragmentation.
> > >
> > > /Thomas
> > >
> > 
> > Indeed, it seems like the commit you mention just retries ttm_bo_init() 
> > after it previously failed. At that point the bo has been destroyed, so 
> > that is probably what's causing the BUG you are seeing.
> > 
> > Admittedly, ttm_bo_init() calling unref on failure is not properly 
> > documented in the function description.  The reason for doing so is to 
> > have a single path for freeing all BO resources already allocated at the 
> > point of failure.
> 
> Does the patch below fix the problem?

Yes, indeed. I was just about to send the same patch to the list.

Thanks.
-- 
Markus
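
The ownership rule discussed above (an init function that releases the
object itself on failure, so a retry must not reuse the old pointer) can be
sketched in plain userspace C. This is only an illustration under stated
assumptions: struct fake_bo, fake_bo_alloc(), fake_bo_init(),
fake_bo_destroy() and fake_bo_create_with_fallback() are hypothetical
stand-ins, not the TTM or radeon API, and this is not the patch referred to
in the thread. The VRAM failure is forced artificially.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer object; NOT struct ttm_buffer_object. */
struct fake_bo {
    int placement;
};

enum { PLACE_VRAM = 0, PLACE_GTT = 1 };

static int live_objects; /* allocations that have not been freed yet */

static struct fake_bo *fake_bo_alloc(void)
{
    struct fake_bo *bo = calloc(1, sizeof(*bo));

    if (bo)
        live_objects++;
    return bo;
}

static void fake_bo_destroy(struct fake_bo *bo)
{
    free(bo);
    live_objects--;
}

/*
 * Assumed contract, modelled on the description in the thread: on failure
 * the init function frees the object itself, so the caller's pointer is
 * dangling once an error is returned. VRAM placement always fails here.
 */
static int fake_bo_init(struct fake_bo *bo, int placement)
{
    if (placement == PLACE_VRAM) {
        fake_bo_destroy(bo); /* single cleanup path, owned by init */
        return -1;
    }
    bo->placement = placement;
    return 0;
}

/* Fallback that respects the contract: a fresh allocation per attempt. */
static struct fake_bo *fake_bo_create_with_fallback(void)
{
    static const int placements[] = { PLACE_VRAM, PLACE_GTT };
    size_t i;

    for (i = 0; i < sizeof(placements) / sizeof(placements[0]); i++) {
        struct fake_bo *bo = fake_bo_alloc();

        if (!bo)
            return NULL;
        if (fake_bo_init(bo, placements[i]) == 0)
            return bo;
        /* bo was already freed by fake_bo_init(): do not reuse or free it. */
    }
    return NULL;
}
```

Calling fake_bo_create_with_fallback() here yields an object placed in
PLACE_GTT with exactly one live allocation. Passing the same pointer to
fake_bo_init() a second time after a failure would instead operate on
freed memory, which is the kind of corruption described above.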