Message-Id: <1235095484.2636.39.camel@gaiman>
Date: Thu, 19 Feb 2009 18:04:44 -0800
From: Eric Anholt <eric@...olt.net>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Hellstrom <thomas@...pmail.org>,
Wang Chen <wangchen@...fujitsu.com>,
Nick Piggin <nickpiggin@...oo.com.au>,
Ingo Molnar <mingo@...e.hu>, dri-devel@...ts.sourceforge.net,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm: Fix lock order reversal between mmap_sem and
struct_mutex.
On Thu, 2009-02-19 at 23:26 +0100, Peter Zijlstra wrote:
> On Thu, 2009-02-19 at 22:02 +0100, Thomas Hellstrom wrote:
> >
> > It looks to me like the driver preferred locking order is
> >
> > object_mutex (which happens to be the device global struct_mutex)
> > mmap_sem
> > offset_mutex.
> >
> > So if one could avoid using the struct_mutex for object bookkeeping (A
> > separate lock) then
> > vm_open() and vm_close() would adhere to that locking order as well,
> > simply by not taking the struct_mutex at all.
> >
> > So only fault() remains, in which that locking order is reversed.
> > Personally I think the trylock ->reschedule->retry method with proper
> > commenting is a good solution. It will be the _only_ place where locking
> > order is reversed and it is done in a deadlock-safe manner. Note that
> > fault() doesn't really fail, but requests a retry from user-space with
> > rescheduling to give the process holding the struct_mutex time to
> > release it.
>
> It doesn't do the reschedule -- need_resched() only checks whether the
> current task was marked to be scheduled away. Furthermore, yield-based
> locking sucks chunks.
>
> What's so very difficult about pulling the copy_*_user() out from under
> the locks?
That we're expecting the data movement to occur while holding device
state in place. For example, we write data through the GTT most of the
time so we:
	lock struct_mutex
	pin the object to the GTT
	flush caches as needed
	copy_from_user
	unpin object
	unlock struct_mutex
If I'm to pull the copy_from_user out, that means I have to:
	alloc temporary storage
	for each block of temp storage size:
		copy_from_user
		lock struct_mutex
		pin the object to the GTT
		flush caches as needed
		memcpy
		unpin object
		unlock struct_mutex
At the point where we're introducing a third copy of the user's data in our
hottest path, we should probably ditch the pwrite path entirely and go to
user mapping of the objects for performance. Requiring user mapping (which
has significant overhead) cuts the likelihood of moving from user-space
object caching to kernel object caching in the future, which has the
potential of saving steaming piles of memory.
--
Eric Anholt
eric@...olt.net eric.anholt@...el.com