linux-kernel - Re: [PATCH] drm/nouveau/gem: tolerate a buffer specified multiple times


Message-ID: <CAKb7UvgyQ_6V2_xH2sxaJ5daDwXZNJ-5KAPTbtSaxHRQA1rABQ@mail.gmail.com>
Date:	Fri, 31 Jul 2015 12:36:29 -0400
From:	Ilia Mirkin <imirkin@...m.mit.edu>
To:	"Bryan O'Donoghue" <pure.logic@...us-software.ie>
Cc:	Peter Hurley <peter@...leysoftware.com>,
	Timo Aaltonen <tjaalton@...ian.org>,
	Emil Velikov <emil.l.velikov@...il.com>,
	Maarten Lankhorst <maarten.lankhorst@...onical.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
	Ben Skeggs <bskeggs@...hat.com>
Subject: Re: [PATCH] drm/nouveau/gem: tolerate a buffer specified multiple times

On Fri, Jul 31, 2015 at 6:27 AM, Bryan O'Donoghue
<pure.logic@...us-software.ie> wrote:
> ah no... 2.4.60 is right...
>
> Yes so Ilia - I've switched out 2.4.60 as per your suggestion to 2.4.56
> (getting the version numbers right :) ) and it's still definitely giving me
> the multiple instances message.

This is going to sound like a stupid question, but I'll ask anyways --
you *did* restart chrome after changing libdrm versions, right?

I was going to mention that there were a handful of fixes in libdrm,
potentially since 2.4.56 (I forget the exact versions), but if 2.4.60
also fails, then that would have them.

There was a final assert() added in 2.4.62, but that was to better
isolate the cause of weirdo crashes (i.e. crash when the thing going
wrong happens rather than stashing bad pointers for later very
confusing dereference). Not GPU crashes.

Just for your information,

nouveau E[   PFIFO][0000:01:00.0] PFIFO: read fault at 0x0003e21000 [PAGE_NOT_PRESENT] from (unknown enum 0x00000000)/GPC0/(unknown enum 0x0000000f) on channel 0x007f80c000 [unknown]

means that there was a VM fault from an unknown GPU unit (???) while
the GPU was reading some resource. (The GPU has its own MMU.)
Unfortunately this can happen for one of a million reasons, the
biggest one being "unknown", but mesa definitely doesn't handle
command submission failures particularly well... we should probably
add a "fail 1% of the time" option to help fix that up.

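For anyone trying to pick such a message apart, here is a minimal sketch of splitting the fault line above into its fields. The function name, the field names, and the regular expression are assumptions made for illustration, based only on the single sample line in this mail; only "read" appears in that sample, and accepting "write" as well is a guess.

```python
import re

# Pattern derived from the one sample line quoted in this mail.
# Field names are hypothetical labels, not nouveau terminology.
FAULT_RE = re.compile(
    r"PFIFO: (?P<access>read|write) fault at (?P<addr>0x[0-9a-fA-F]+) "
    r"\[(?P<reason>[A-Z_]+)\] from (?P<source>.+?) "
    r"on channel (?P<channel>0x[0-9a-fA-F]+) \[(?P<owner>[^\]]*)\]"
)

SAMPLE = """nouveau E[   PFIFO][0000:01:00.0] PFIFO: read fault at
0x0003e21000 [PAGE_NOT_PRESENT] from (unknown enum
0x00000000)/GPC0/(unknown enum 0x0000000f) on channel 0x007f80c000
[unknown]"""


def parse_pfifo_fault(text):
    """Return the fault fields as a dict, or None if the text does not match."""
    # Mail clients wrap the message, so collapse all whitespace first.
    flat = " ".join(text.split())
    match = FAULT_RE.search(flat)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the hexadecimal address and channel to integers.
    fields["addr"] = int(fields["addr"], 16)
    fields["channel"] = int(fields["channel"], 16)
    return fields


if __name__ == "__main__":
    print(parse_pfifo_fault(SAMPLE))
```

The "source" field keeps the three slash-separated parts together, since two of them are reported as unknown enums in the sample and their meaning is not something this sketch can resolve.
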
Do you have a reproducible way of triggering the "same buffer multiple
times on the validation list" thing? What GPU do you have? (Looking for a codename,
not a marketing name... lspci should have it... GFxxx or GKxxx or
Gxx.)
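
As a rough sketch of what "codename" means here, the snippet below pulls a GFxxx, GKxxx or Gxx token out of a line of lspci output. The sample line, the function name, and the pattern are assumptions for illustration; the pattern covers only the three shapes named above, so other codename families would not match.

```python
import re

# Matches only the shapes mentioned in this mail: GFxxx, GKxxx, Gxx,
# optionally followed by a letter suffix. This is an assumption, not
# an exhaustive list of codenames.
CODENAME_RE = re.compile(r"\bG(?:[FK]\d{3}|\d{2})[A-Z]*\b")

# Hypothetical lspci line, made up for this example.
SAMPLE_LSPCI = (
    "01:00.0 VGA compatible controller: "
    "NVIDIA Corporation GK107 [GeForce GT 640] (rev a1)"
)


def find_codename(lspci_line):
    """Return the first codename-looking token in the line, or None."""
    match = CODENAME_RE.search(lspci_line)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(find_codename(SAMPLE_LSPCI))
```

The marketing name in square brackets is deliberately ignored; only the chip codename is returned.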

  -ilia