lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110413192424.GJ19819@8bytes.org>
Date:	Wed, 13 Apr 2011 21:24:24 +0200
From:	Joerg Roedel <joro@...tes.org>
To:	"H. Peter Anvin" <hpa@...or.com>
Cc:	Ingo Molnar <mingo@...e.hu>, Yinghai Lu <yinghai@...nel.org>,
	Alex Deucher <alexdeucher@...il.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	dri-devel@...ts.freedesktop.org,
	Thomas Gleixner <tglx@...utronix.de>, Tejun Heo <tj@...nel.org>
Subject: Re: Linux 2.6.39-rc3

On Wed, Apr 13, 2011 at 11:51:39AM -0700, H. Peter Anvin wrote:
> On 04/13/2011 10:21 AM, Joerg Roedel wrote:
> > 
> > First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where
> > only a couple of patches and merged v2.6.38-rc4 in at every step. There
> > was no failure found.
> > Then I tried this again, but this time I merged v2.6.38-rc5 at every
> > step and was successful. The bad commit in this branch turned out to be
> > 
> > 	1a4a678b12c84db9ae5dce424e0e97f0559bb57c
> > 
> > which is related to memblock.
> > 
> > Then I tried to find out which change between 2.6.38-rc4 and 2.6.38-rc5
> > is needed to trigger the failure, so I used f005fe12b90c as a base,
> > bisected between v2.6.38-rc4..v2.6.38-rc5 and merged every bisect step
> > into the base and tested. Here the bad commit turned out to be
> > 
> > 	e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20
> > 
> > which is related to gart. It turned out that the gart aperture on that
> > box is on another position with these patches. Before it was as
> > 0xa4000000 and now it is at 0xa0000000. It seems like this has something
> > to do with the root-cause.
> > 
> > Reverting commit 1a4a678b12c84db9ae5dce424e0e97f0559bb57c fixes the
> > problem btw. and booting with iommu=soft also works, but I have no idea
> > yet why the aperture at that address is a problem (with the patch
> > reverted the aperture lands at 0x80000000).
> > 
> 
> Does reverting e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 solve the
> problem for you?

No, reverting that patch doesn't make the problem go away (and the gart
aperture is still on 0xa0000000). I tested this in 39-rc3, I havn't
tested if it makes a difference on the original bisect-commit from Ingo,
probably it does (don't know if that matters).
Strange about this commit is that it fixes an x86 gart aperture
allocation bug in generic memblock code.

> 1a4a678b12c84db9ae5dce424e0e97f0559bb57c is a memory-allocation-order
> patch, which have a nasty tendency to unmask bugs elsewhere in the
> kernel.  However, e6d2e2b2b1e1455df16d68a78f4a3874c7b3ad20 looks
> positively strange (and it doesn't exactly help that the description is
> written in Yinghai-ese and is therefore nearly impossible to decode,
> never mind tell if it is remotely correct.)

I think that the two commits are okay and the bug is somewhere else, but
I have no idea yet were to look next. I spent some time looking at
radeon code and talking to Alex about it (because it seemed suspicous
that the GTT is on 0xa0000000 too, but as Alex explained me this is an
address in the GPU address space and shouldn't matter).

Regards,

       Joerg	

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ