lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 12 Sep 2013 17:14:20 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Maarten Lankhorst <maarten.lankhorst@...onical.com>
Cc:	airlied@...ux.ie, Daniel Vetter <daniel.vetter@...ll.ch>,
	Thomas Hellstrom <thellstrom@...are.com>,
	intel-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [BUG] completely bonkers use of set_need_resched +
 VM_FAULT_NOPAGE

On Thu, Sep 12, 2013 at 05:11:25PM +0200, Maarten Lankhorst wrote:
> Op 12-09-13 17:06, Peter Zijlstra schreef:
> > Hi Dave,
> >
> > So I'm poking around the preemption code and stumbled upon:
> >
> > drivers/gpu/drm/i915/i915_gem.c:                set_need_resched();
> > drivers/gpu/drm/ttm/ttm_bo_vm.c:                        set_need_resched();
> > drivers/gpu/drm/ttm/ttm_bo_vm.c:                        set_need_resched();
> > drivers/gpu/drm/udl/udl_gem.c:          set_need_resched();
> >
> > All these sites basically do:
> >
> >   while (!trylock())
> >   	yield();
> >
> > which is a horrible and broken locking pattern. 
> >
> > Firstly its deadlock prone, suppose the faulting process is a FIFOn+1
> > task that preempted the lock holder at FIFOn.
> >
> > Secondly the implementation is worse than usual by abusing
> > VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
> > doesn't retry, but you're using it as a get out of fault path. And
> > you're using set_need_resched() which is not something a driver should
> > _ever_ touch.
> >
> > Now I'm going to take away set_need_resched() -- and while you can
> > 'reimplement' it using set_thread_flag() you're not going to do that
> > because it will be broken due to changes to the preempt code.
> >
> > So please as to fix ASAP and don't allow anybody to trick you into
> > merging silly things like that again ;-)
> >
> Agreed, but this is a case of locking inversion. How do you propose to get around that?

me? No idea, I've never looked at the actual locking in DRM. Someone
who's familiar with that code would have to tackle that.

I just spotted the fail because I was going to remove set_need_resched()
and had a WTF moment when I tripped over this stuff.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ