Date:	Wed, 23 Jan 2013 13:40:39 +0000
From:	"Deucher, Alexander" <Alexander.Deucher@....com>
To:	Shuah Khan <shuahkhan@...il.com>
CC:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: RE: Linux 3.8-rc4

> -----Original Message-----
> From: Shuah Khan [mailto:shuahkhan@...il.com]
> Sent: Tuesday, January 22, 2013 6:57 PM
> To: Deucher, Alexander
> Cc: Linus Torvalds; Linux Kernel Mailing List
> Subject: Re: Linux 3.8-rc4
> 
> On Tue, Jan 22, 2013 at 11:55 AM, Shuah Khan <shuahkhan@...il.com>
> wrote:
> 
> >>> init:
> >>
> >> Does the attached patch stop them?  It basically skips all initialization of
> >> the DMA ring on your system.  What I don't understand is why you still get
> >> them with the previous patch, but not with
> >> 909d9eb67f1e4e39f2ea88e96bde03d560cde3eb reverted.
> >> 909d9eb67f1e4e39f2ea88e96bde03d560cde3eb only affects the use of the
> >> DMA ring for buffer migration and the patch I previously attached disables
> >> the use of the DMA ring for buffer migration.  Does the latest batch of
> >> drm-fixes from Dave that Linus just merged help?
> >>
> >> Alex
> >
> > Will try your latest patch. Will also try the latest git - I am
> > currently on Jan 17th. However, in the meantime, I found that these
> > messages might not be new and getting printed now with the
> > eaaa6983ab2ccdf826c90838eb584211e0cadb76 [PATCH] drm/radeon: print dma
> > status reg on lockup (v2) commit that introduced debug messages in
> > r600_gpu_soft_reset(). I couldn't revert this commit, so I am doing a
> > compile with these messages commented out. Will update you on the
> > results and then test the new git.
> >
> > -- Shuah
> 
> Here is what I tried:
> 
> 1. Applied your latest disable_dma_ring_on_6xx-2.diff and still see
> messages.

If that is the case, I'm beginning to think the bug is elsewhere.  Support for the DMA ring was the only major feature we added in this cycle.  If you are still getting errors even with the ring completely disabled, it's probably not the DMA ring.

Make sure your kernel has this patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=20707874fd4fd37e09513f508e642fa8bd06365a
That's the only thing I can think of that may cause the DMAR errors if the DMA ring is disabled.

> 2. Tried intel_iommu=igfx_off to see if that changes anything. The
> reason for trying this option is, I noticed this message: (this is not
> a new message, I see this all the time)
> 
> [    1.337112] DMAR: Disabling IOMMU for graphics on this chipset
> 
> No change with or without option - still see the same messages.
> 
> Next steps:
> 
> 1. One big difference between 3.7 and 3.8 is in the
> r600_gpu_soft_reset() - I started with 3.7 to see the differences if
> any of these differences is causing this to be logged. In 3.7
> r600_gpu_soft_reset() is called with no reset_mask. I am going to
> first verify if softreset happens on 3.7. Does this give you any ideas
> of whether this could cause a problem?

I don't think the problem is related to GPU reset.  That's for resetting the GPU when it hangs.  It changed slightly in 3.8 to accommodate the new DMA engine that we added support for in 3.8.  Previously we just reset the graphics engine.
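
To make the reset mask idea concrete, here is a rough standalone sketch in plain C. It is not the radeon code: the EXAMPLE_* flags and example_* helpers are made-up names for illustration only. It just shows a soft reset that takes a mask and resets only the engines selected in it, where before there was only the graphics engine to reset.

```c
#include <stdint.h>

/* Hypothetical flag names for illustration; NOT the radeon driver's definitions. */
#define EXAMPLE_RESET_GFX (1u << 0)
#define EXAMPLE_RESET_DMA (1u << 1)

/* Records which engines the last call reset; stands in for real hardware state. */
static uint32_t example_engines_reset;

/* Stand-ins for per-engine reset routines; they only record what was reset. */
static void example_reset_gfx(void) { example_engines_reset |= EXAMPLE_RESET_GFX; }
static void example_reset_dma(void) { example_engines_reset |= EXAMPLE_RESET_DMA; }

/* Reset only the engines selected in reset_mask; return the set actually reset. */
static uint32_t example_soft_reset(uint32_t reset_mask)
{
    example_engines_reset = 0;
    if (reset_mask & EXAMPLE_RESET_GFX)
        example_reset_gfx();
    if (reset_mask & EXAMPLE_RESET_DMA)
        example_reset_dma();
    return example_engines_reset;
}
```

In this sketch, passing only EXAMPLE_RESET_GFX corresponds to the old graphics-only behaviour, and OR-ing in EXAMPLE_RESET_DMA also resets the DMA engine. Either way this path only runs when the GPU hangs.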

> 
> 2. Another angle I am looking at is the newly added
> 
> dev_info(rdev->dev, "  R_00D034_DMA_STATUS_REG   = 0x%08X\n",
>                 RREG32(DMA_STATUS_REG));
> 
> messages in r600_gpu_soft_reset_dma().
> 
> Could it be that these newly added debug messages are now showing an
> old condition that always existed on my test system? From what I have
> observed so far, this is very likely.

I'm not following.  That just prints the DMA status register when we attempt to reset the GPU.  It's purely for debugging.

Alex

> 
> Please let me know if you want me to try anything else or if you don't
> think these steps will help.
> 
> -- Shuah


