lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LRH.2.02.1403241302260.15351@file01.intranet.prod.int.rdu2.redhat.com>
Date:	Mon, 24 Mar 2014 13:17:12 -0400 (EDT)
From:	Mikulas Patocka <mpatocka@...hat.com>
To:	Daniel Vetter <daniel@...ll.ch>
cc:	Dave Airlie <airlied@...il.com>, Andreas Mohr <andi@...as.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	DRI <dri-devel@...ts.freedesktop.org>
Subject: Re: 3.14-rc7 crashes in drm ([PATCH] a crash in
 mga_driver_irq_uninstall)



On Mon, 24 Mar 2014, Daniel Vetter wrote:

> On Mon, Mar 24, 2014 at 07:45:47AM +1000, Dave Airlie wrote:
> > On Mon, Mar 24, 2014 at 7:27 AM, Andreas Mohr <andi@...as.de> wrote:
> > > On Sun, Mar 23, 2014 at 09:39:16AM -0700, Linus Torvalds wrote:
> > >> On Sun, Mar 23, 2014 at 5:15 AM, Andreas Mohr <andi@...as.de> wrote:
> > >> >
> > >> > which did end up flawless on 3.12.0-rc2+, too
> > >> > (but failed to improve the issue on 3.14.0-rc7+).
> > >> >
> > >> > So, for all intents and purposes, drm infrastructure seems unavoidably
> > >> > (neither dri disable nor libdrm upgrade helps) affected.
> > >> > Does anyone know which change caused that issue?
> > >> > (I'm asking because bisect here would be relatively painful).
> > >>
> > >> So 3.12-rc2 works. Does 3.13 work? Is this a regression in the current
> > >> 3.14 rc only, or did it happen already in the previous release?
> > >
> > > Hmm, given that Mikulas in
> > > https://lkml.org/lkml/2014/2/26/537
> > > offered a diff of linux-3.13.5 files, it truly seems (shock! ack! noo!)
> > > that that indeed may have been a regression at <= 3.13 proper even
> > > (which may pose interesting questions about the level of testing coverage
> > > we still enjoy [not!?] in this hardware area).

That patch drops a mutex, so it is not correct. There is mutex resursion - 
we need to uninstall the irq in drm_master_destroy, because here we are 
committed to destroy the device. But the routine that uninstalls the irq 
takes struct_mutex, which is already held in drm_master_destroy.

I suppose that the person who maintains drm reworks the patch so that it's 
correct:

- could we use a different mutex to protect the irq in drm_irq.c? Or 
possibly no mutex at all and use cmpxchg to manipulate the variable 
dev->irq_enabled? - this seems like the best solution. But I am not sure 
if the code in drm_irq.c somehow depends on the facts that other parts of 
the drm subsystem take struct_mutex.

- could we pass a new argument to drm_irq_uninstall that tells it not to 
take the mutex? drm_master_destroy would set this argument to 1. 
drm_master_destroy is mostly called with struct_mutex held, but there may 
be places in vmwgfx_drv.c where drm_master_put (which calls 
drm_master_destroy) may be called without struct_mutex held.

Is it true that drm_master_destroy can be called without struct_mutex 
held? I don't know.


I think drm maintainer should sort out the above issues and modify the 
patch accordingly.

> > > Oh well, seems I'll have to prepare/build 3.13 now...
> > 
> > It's > 15 year old hardware, so yes I believe we have close to 0
> > testing coverage on it outside of distros,
> > 
> > I'm not even sure I have one anymore, I might be able to test an MGA in one box.
> 
> I haven't done a full read of all the related code, but this smells like a
> similar bug I've hit all over the place in the i810 driver (another one of
> those undead drm drivers of yonders). Ingredients:
> 
> 1) Xorg creates a drm mapping of the register space.
> 2) Xorg tells the hw-specific drm which drm mapping has the hw registers,
> and the driver uses that. Iirc this has been done as some form of OS
> abstraction. Also note that these mappings aren't refcounted, so the first
> guy to call drm_rmmap wins.
> 
> -> All hell breaks loose if Xorg dies and takes all it's mappings with it
> (in master_destroy, since the Xorg /dev fd is the master) and leaves the
> driver hanging in the air if there's an interrupt still pending (or
> anything else fwiw).

For me that crash happened when xorg exited with a fatal error too.

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ