linux-kernel - [BUG] completely bonkers use of set_need_resched + VM_FAULT

Message-ID: <20130912150645.GZ31370@twins.programming.kicks-ass.net>
Date:	Thu, 12 Sep 2013 17:06:45 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	airlied@...ux.ie
Cc:	Daniel Vetter <daniel.vetter@...ll.ch>,
	Maarten Lankhorst <maarten.lankhorst@...onical.com>,
	Thomas Hellstrom <thellstrom@...are.com>,
	intel-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

Hi Dave,

So I'm poking around the preemption code and stumbled upon:

drivers/gpu/drm/i915/i915_gem.c:                set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c:                        set_need_resched();
drivers/gpu/drm/ttm/ttm_bo_vm.c:                        set_need_resched();
drivers/gpu/drm/udl/udl_gem.c:          set_need_resched();

All these sites basically do:

  while (!trylock())
  	yield();

which is a horrible and broken locking pattern. 

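Firstly, it's deadlock prone: suppose the faulting process is a FIFOn+1
task that preempted the lock holder at FIFOn. yield() will not hand the
CPU to the lower-priority holder, so the holder may never get to run and
release the lock.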
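
As a rough userspace analogue of the above (pthreads, not kernel code, and
not a proposed fix for the drivers), the sketch below contrasts the
trylock+yield loop with a plain blocking lock. All names in it
(worker_trylock_yield, worker_blocking, run_workers, counter) are made up
for illustration. With the default scheduling policy both variants
terminate; the trouble described above would only show up if the spinning
thread ran at a higher SCHED_FIFO priority than the lock holder.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define NTHREADS 4
#define NITERS   1000

/* Hypothetical shared state, for illustration only. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

/* The criticised pattern: poll the lock and yield between attempts. */
void *worker_trylock_yield(void *arg)
{
	(void)arg;
	for (int i = 0; i < NITERS; i++) {
		while (pthread_mutex_trylock(&lock) != 0)
			sched_yield();	/* does not let lower-priority FIFO tasks run */
		counter++;
		pthread_mutex_unlock(&lock);
	}
	return NULL;
}

/* Blocking variant: the waiter sleeps until the owner unlocks. */
void *worker_blocking(void *arg)
{
	(void)arg;
	for (int i = 0; i < NITERS; i++) {
		pthread_mutex_lock(&lock);
		counter++;
		pthread_mutex_unlock(&lock);
	}
	return NULL;
}

/* Run NTHREADS copies of fn and return the final counter value. */
long run_workers(void *(*fn)(void *))
{
	pthread_t threads[NTHREADS];

	counter = 0;
	for (int i = 0; i < NTHREADS; i++)
		pthread_create(&threads[i], NULL, fn, NULL);
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(threads[i], NULL);
	return counter;
}
```

Calling run_workers(worker_trylock_yield) and run_workers(worker_blocking)
gives the same count either way; the difference is that a blocked waiter
sleeps and lets the owner run, while the polling waiter stays runnable and
depends on the scheduler picking the owner.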

Secondly, the implementation is worse than usual by abusing
VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault
doesn't retry, but you're using it as a get-out-of-fault path. And
you're using set_need_resched(), which is not something a driver should
_ever_ touch.

Now I'm going to take away set_need_resched() -- and while you can
'reimplement' it using set_thread_flag() you're not going to do that
because it will be broken due to changes to the preempt code.

So please fix this ASAP and don't allow anybody to trick you into
merging silly things like that again ;-)