Date:	Sat, 23 Jul 2011 00:23:36 +0400
From:	Kirill Smelkov <kirr@....spb.ru>
To:	Keith Packard <keithp@...thp.com>
Cc:	Pekka Enberg <penberg@...nel.org>,
	Chris Wilson <chris@...is-wilson.co.uk>,
	Luke-Jr <luke@...hjr.org>, intel-gfx@...ts.freedesktop.org,
	LKML <linux-kernel@...r.kernel.org>,
	dri-devel@...ts.freedesktop.org, "Rafael J. Wysocki" <rjw@...k.pl>,
	Ray Lee <ray-lk@...rabbit.org>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Florian Mickler <florian@...kler.org>
Subject: Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?

Keith,

first of all, thanks for your prompt reply. Then...

On Fri, Jul 22, 2011 at 11:00:41AM -0700, Keith Packard wrote:
> On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov <kirr@....spb.ru> wrote:
> 
> > And now after v3.0 is out, I've tested it again, and yes, like it was
> > broken on v3.0-rc5, it is (now even more) broken on v3.0 -- after first
> > bad io access the system freezes completely:
> 
> I looked at this when I first saw it (a couple of weeks ago), and I
> couldn't see any obvious reason this patch would cause this particular
> problem. I didn't want to revert the patch at that point as I feared it
> would cause other subtle problems. Given that you've got a work-around,
> it seemed best to just push this off past 3.0.

What workaround are you talking about? Sorry, to me it all looked like
"UMS is being ignored forever". Anyway, let's move on and try to solve
the issue.


> Given the failing address passed to ioread32, this seems like it's
> probably the call to READ_BREADCRUMB -- I915_BREADCRUMB_INDEX is 0x21,
> which is an offset in 32-bit units within the hardware status page. If
> the status_page.page_addr value was zero, then the computed address
> would end up being 0x84.
> 
> And, it looks like status_page.page_addr *will* end up being zero as a
> result of the patch in question. The patch resets the entire ring
> structure contents back to the initial values, which includes smashing
> the status_page structure to zero, clearing the value of
> status_page.page_addr set in i915_init_phys_hws.
> 
> Here's an untested patch which moves the initialization of
> status_page.page_addr into intel_render_ring_init_dri. I note that
> intel_init_render_ring_buffer *already* has the setting of the
> status_page.page_addr value, and so I've removed the setting of
> status_page.page_addr from i915_init_phys_hws.
> 
> I suspect we could remove the memset from intel_init_render_ring_buffer;
> it seems entirely superfluous given the memset in i915_init_phys_hws.
> 
> From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17 00:00:00 2001
> From: Keith Packard <keithp@...thp.com>
> Date: Fri, 22 Jul 2011 10:44:39 -0700
> Subject: [PATCH] drm/i915: Initialize RCS ring status page address in
>  intel_render_ring_init_dri
> 
> Physically-addressed hardware status pages are initialized early in
> the driver load process by i915_init_phys_hws. For UMS environments,
> the ring structure is not initialized until the X server starts. At
> that point, the entire ring structure is re-initialized with all new
> values. Any values set in the ring structure (including
> ring->status_page.page_addr) will be lost when the ring is
> re-initialized.
> 
> This patch moves the initialization of the status_page.page_addr value
> to intel_render_ring_init_dri.
> 
> Signed-off-by: Keith Packard <keithp@...thp.com>
> ---
>  drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
>  drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
>  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 1271282..8a3942c 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct drm_device *dev)
>  static int i915_init_phys_hws(struct drm_device *dev)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
>  
>  	/* Program Hardware Status Page */
>  	dev_priv->status_page_dmah =
> @@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct drm_device *dev)
>  		DRM_ERROR("Can not allocate hardware status page\n");
>  		return -ENOMEM;
>  	}
> -	ring->status_page.page_addr =
> -		(void __force __iomem *)dev_priv->status_page_dmah->vaddr;
>  
> -	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
> +	memset_io((void __force __iomem *)dev_priv->status_page_dmah->vaddr,
> +		  0, PAGE_SIZE);
>  
>  	i915_write_hws_pga(dev);
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index e961568..47b9b27 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
>  		ring->get_seqno = pc_render_get_seqno;
>  	}
>  
> +	if (!I915_NEED_GFX_HWS(dev))
> +		ring->status_page.page_addr = dev_priv->status_page_dmah->vaddr;
> +
>  	ring->dev = dev;
>  	INIT_LIST_HEAD(&ring->active_list);
>  	INIT_LIST_HEAD(&ring->request_list);
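
For anyone following along, the 0x84 address in the analysis above can be
checked with a small user-space sketch. It assumes only what is stated in
the quoted text: the index is 0x21 and it counts 32-bit units from the
start of the status page. The helper name status_slot_addr is hypothetical
and is not a function from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index quoted in the analysis above; it counts 32-bit slots. */
#define BREADCRUMB_INDEX 0x21

/*
 * Hypothetical helper (not driver code): byte address of a 32-bit slot
 * inside a status page that starts at page_addr.
 */
static uintptr_t status_slot_addr(uintptr_t page_addr, unsigned int index)
{
	return page_addr + (uintptr_t)index * sizeof(uint32_t);
}
```

With page_addr equal to zero, status_slot_addr(0, BREADCRUMB_INDEX) is
0x21 * 4 = 0x84, which matches the failing address described in the
quoted analysis.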

I can't tell whether this is correct, because I don't know the Intel gfx
driver, but at first glance your description sounds reasonable.
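
As I understand the description, the failure pattern is "set a pointer
early, then wipe the whole structure later". Below is a minimal
user-space sketch of that pattern and of the proposed cure (re-deriving
the pointer after the reset). All toy_* names are made-up stand-ins, not
the real i915 structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for the driver structures. */
struct toy_status_page { void *page_addr; };
struct toy_ring { struct toy_status_page status_page; int size; };
struct toy_dev { void *hws_vaddr; struct toy_ring ring; };

/* Early init: remember the page and point the ring at it. */
static void toy_early_init(struct toy_dev *dev, void *page)
{
	dev->hws_vaddr = page;
	dev->ring.status_page.page_addr = page;
}

/*
 * Late ring init: reset the whole ring structure to initial values.
 * memset() clears page_addr too (all-zero bytes are a NULL pointer on
 * common platforms), so the pointer has to be restored from the copy
 * kept outside the ring, as the proposed patch does.
 */
static void toy_ring_init(struct toy_dev *dev, int restore_page_addr)
{
	memset(&dev->ring, 0, sizeof(dev->ring));
	dev->ring.size = 4096;
	if (restore_page_addr)
		dev->ring.status_page.page_addr = dev->hws_vaddr;
}
```

Calling toy_ring_init(dev, 0) after toy_early_init() leaves page_addr
cleared, while toy_ring_init(dev, 1) restores it.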

I'm out of the office until roughly next Tuesday; when I return I'll try
to test it on the hardware in question.


Thanks again,
Kirill