linux-kernel - Re: [PATCH v2 2/3] mutex: add support for reservation style locks, v2

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130408115026.GL2228@phenom.ffwll.local>
Date:	Mon, 8 Apr 2013 13:50:26 +0200
From:	Daniel Vetter <daniel@...ll.ch>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Daniel Vetter <daniel.vetter@...ll.ch>,
	Maarten Lankhorst <maarten.lankhorst@...onical.com>,
	linux-arch@...r.kernel.org, x86@...nel.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	dri-devel <dri-devel@...ts.freedesktop.org>,
	"linaro-mm-sig@...ts.linaro.org" <linaro-mm-sig@...ts.linaro.org>,
	rob clark <robclark@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>,
	"linux-media@...r.kernel.org" <linux-media@...r.kernel.org>
Subject: Re: [PATCH v2 2/3] mutex: add support for reservation style locks, v2

On Mon, Apr 08, 2013 at 12:39:24PM +0200, Peter Zijlstra wrote:
> On Thu, 2013-04-04 at 18:56 +0200, Daniel Vetter wrote:
> > Presuming I'm still following we should be able to fix this with the
> > new sleep state TASK_DEADLOCK and a flag somewhere in the thread info
> > (let's call it PF_GTFO for simplicity).
> 
> I'm reading "Get The F*ck Out" ? I like the name, except PF_flags are
> unsuitable since they are not atomic and we'd need to set it from
> another thread.

I think the PF_ flag is a non-starter for a different reason, too. To make
different clases of ww_mutexes nest properly, we need to make sure that we
don't dead/live-lock trying to wake up a task holdinga ww_mutex of class
A, while that task is trying to acquire ww_mutexes of class B. Picture:

	task 1			task 2			task 3
							holds lock B
	ticket_A = acquire_start(class A)
	ww_mutex_lock(lock A, ticket_A)			busy doing something

	ticket_B = acquire_start(class B)
	ww_mutex_lock(lock B, ticket_B)
	-> contention with task 3
				
				ticket_task2 = acquire_start(class A)
				ww_mutex_lock(lock A, ticket_task2)
				-> contention with task 1

If we go with the PF_GTFO task flag, task 2 will set it on task 1
(handwave locking/atomicity again here ;-). But taks 1 will only back off
from any locks in class B. Or at least I think we should impose that
restriction, since otherwise a ww_mutex acquire loop will leak out badly
abstraction-wise and making nesting pretty much impossible in practice.

But if we really want nesting to work transparently, then task 1 should be
allowed to continue taking locks from class A (after an acquire_end(class
B) call ofc). But then it will have missed the wakeup from task 2, so task
2 blocks for too long and task 1 doesn't back off from further acquiring
locks in class A.

Presuming my reasoning for the rt case isn't broken, this break deadlock
avoidance if we sort by (rt_prio, ticket).

So I think we need to move the PF_GTFO flag to the ticket to make sure
that task1 will notice that there's contention with one of the locks it
holds from class A and that it better gtfo asap (i.e. back off, drop all
currently held locks in class A and go into the slowpath).

But I still need to think the nesting case through a bit more (and draw
some diagrams), so not sure yet. One thing I still need to ponder is how
to best stack tickets (for tasks doing nested ww_mutex locking) and where
all we need a pointer to relevant tickets and what kind of fun this
entails ... I need to ponder this some more.

> >  Then every time a task does a
> > blocking wait on a ww_mutex it would set this special sleep state and
> > also check the PF_GTFO bit.
> 
> So its the contending task (O for B) setting PF_GTFO on the owning task
> (M for B), right?

Yeah, the idea is that the contending task sets this bit on the current
holder (whether we put that bit into the task or the ticket), so that the
current owner can back off and drop all currently held locks at the next
opportunity (either due to being woken up in TASK_DEADLCOK state or in the
next ww_mutex_lock call).

> But yeah, all ww_mutex sleep states should have the new TASK_DEADLOCK
> sleep state added.
> 
> >  If the later is set, it bails out with
> > -EAGAIN (so that all locks are dropped).
> 
> I would really rather see -EDEADLK for that..

I agree it makes more sense for a general api (outside of ttm), but I
struggle a bit to come up with a good errno for the "you hold this lock
already" case. Best I could do is -EBUSY ...

Hm, looking through errno.h I've just spotted -EALREADY. Seems to be used
all over (not just networking), so I guess we could extend it's meaning a
bit?

> > Now if a task wants to take a lock and notices that it's held by a
> > younger locker it can set that flag and wake the thread up (need to
> > think about all the races a bit, but we should be able to make this
> > work). Then it can do the normal blocking mutex slowpath and wait for
> > the unlock.
> 
> Right.
> 
> > Now if O and M race a bit against each another M should either get
> > woken (if it's already blocked on Y) and back off, or notice that the
> > thread flag is set before it even tries to grab another mutex 
> 
> ww_mutex, it should block just fine on regular mutexes and other
> primitives.

Yes, that special behaviour should only apply to ww_mutexes not to any
other locking. I've not been terribly careful here ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/