Message-ID: <20200727095110.GG119549@hirez.programming.kicks-ass.net>
Date:   Mon, 27 Jul 2020 11:51:10 +0200
From:   peterz@...radead.org
To:     Peter Oskolkov <posk@...k.io>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Ingo Molnar <mingo@...nel.org>,
        Darren Hart <dvhart@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Peter Oskolkov <posk@...gle.com>,
        Andrei Vagin <avagin@...gle.com>, Paul Turner <pjt@...gle.com>,
        Ben Segall <bsegall@...gle.com>,
        Aaron Lu <aaron.lwe@...il.com>,
        Waiman Long <longman@...hat.com>
Subject: Re: [PATCH for 5.9 1/3] futex: introduce FUTEX_SWAP operation

On Thu, Jul 23, 2020 at 05:25:05PM -0700, Peter Oskolkov wrote:
> On Thu, Jul 23, 2020 at 4:28 AM Peter Zijlstra <peterz@...radead.org> wrote:

> > What worries me is how FUTEX_SWAP would interact with the future
> > FUTEX_LOCK / FUTEX_UNLOCK. When we implement pthread_mutex with those,
> > there are very few WAIT/WAKE left.
> 
> [+cc Waiman Long]
> 
> I've looked through the latest FUTEX_LOCK patchset I could find (
> https://lore.kernel.org/patchwork/cover/772643/ and related), and it seems
> that the FUTEX_SWAP and FUTEX_LOCK/FUTEX_UNLOCK patchsets
> address the same issue (slow wakeups) but for different use cases:
> 
> FUTEX_LOCK/FUTEX_UNLOCK uses spinning and lock stealing to
> improve futex wake/wait performance in high contention situations;

The spinning is best for light contention.

> FUTEX_SWAP is designed for fast context switching with _no_ contention:
> the waker that is going to sleep and the wakee use different futexes;
> userspace will have a futex per thread/task, and when needed the
> thread/task will either simply sleep on its futex, or context switch
> (=FUTEX_SWAP) into a different thread/task.

Right, but how can you tell what the next thing after UNLOCK is going to
be? That's going to be tricky.
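
For concreteness, a minimal userspace sketch of the per-thread-futex
handoff described in the quote, written as the existing two-syscall
WAKE+WAIT sequence that FUTEX_SWAP proposes to fuse into one call; the
switch_to() helper and the runnable/parked convention are illustrative
only, not taken from the patch:

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdatomic.h>

static long futex(atomic_int *uaddr, int op, int val)
{
        return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* One futex word per thread/task: 1 = runnable, 0 = parked. */
static void switch_to(atomic_int *me, atomic_int *next)
{
        atomic_store(next, 1);
        futex(next, FUTEX_WAKE, 1);     /* wake the wakee...                  */
        atomic_store(me, 0);
        futex(me, FUTEX_WAIT, 0);       /* ...and park ourselves; a single
                                           FUTEX_SWAP op would do both steps
                                           and switch to the wakee on-CPU.   */
}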

> I can also imagine that instead of combining WAIT/WAKE for fast context
> switching, a variant of FUTEX_SWAP can use LOCK/UNLOCK operations in the
> future, when these are available; but again, I fully expect that a
> single "FUTEX_LOCK the current task on futex A, FUTEX_UNLOCK futex B,
> context switch into the wakee" futex op will be much faster than doing
> the same thing in two syscalls, as FUTEX_LOCK/FUTEX_UNLOCK does not seem
> to be concerned with fast waking of a sleeping task, but more with
> minimizing sleeping in the first place.

Correct; however the reason I like LOCK/UNLOCK is that it exposes the
blocking relations to the kernel -- and that ties into yet another
unfinished patch-set :-/

  https://lkml.kernel.org/r/20181009092434.26221-1-juri.lelli@redhat.com

> What will be faster: FUTEX_SWAP that does
>    FUTEX_WAKE (futex A) + FUTEX_WAIT (current, futex B),
> or FUTEX_SWAP that does
>    FUTEX_UNLOCK (futex A) + FUTEX_LOCK (current, futex B)?

Well, perhaps both argue against having SWAP and in favour of compound
futex ops instead. That is something I think we're already starting to
see with the new futex API patches posted here:

  https://lkml.kernel.org/r/20200709175921.211387-1-andrealmeid@collabora.com

sys_futex_waitv() is effectively a whole bunch of WAIT ops combined.
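
To illustrate that, a sketch of waiting on several futex words with one
call; the struct layout, FUTEX_32 and the syscall number below are taken
from the interface as it later landed in mainline (5.16+), so treat the
details as an assumption relative to the exact patches linked above:

#include <linux/futex.h>        /* struct futex_waitv, FUTEX_32            */
#include <sys/syscall.h>        /* SYS_futex_waitv (kernel 5.16+ headers)  */
#include <unistd.h>
#include <stdint.h>

/* Sleep until one of the futex words is woken (each word must still hold
 * its expected value when we go to sleep, otherwise the call fails with
 * EAGAIN); on wakeup returns the index of the woken entry, -1 on error. */
static long wait_any(uint32_t **uaddrs, const uint32_t *expected,
                     unsigned int nr)
{
        struct futex_waitv wv[nr];

        for (unsigned int i = 0; i < nr; i++) {
                wv[i].val        = expected[i];
                wv[i].uaddr      = (uintptr_t)uaddrs[i];
                wv[i].flags      = FUTEX_32;
                wv[i].__reserved = 0;
        }
        /* flags == 0, no timeout; clockid is unused when timeout is NULL */
        return syscall(SYS_futex_waitv, wv, nr, 0, NULL, 0);
}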

> As wake+wait will always put the waker to sleep, there will be a true
> context switch on the same CPU on the fast path; on the other hand,
> unlock+lock will potentially avoid sleeping, so the wakee will often run
> on a different CPU (with the waker spinning instead of sleeping?), thus
> not benefiting from the cache locality that fast context switching on
> the same CPU is meant to exploit...
> 
> I'll add some of the considerations above to the expanded cover letter
> (or a commit message).

It's Monday morning, so perhaps I'm making a mess of things, but
consider something like the below, where our thread t2 issues
synchronous work to t1:


	t1		t2
			LOCK A
	LOCK B
1:	LOCK A

			...


			UNLOCK A
			LOCK B
	...

	UNLOCK B
	UNLOCK A
	LOCK B
	GOTO 1
			UNLOCK B
			LOCK A
			...

This is an abuse of mutexes: it implements completions using a (fair)
mutex. A guards the work-queue, B is the 'completion'.
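
The same handoff spelled out with pthread mutexes, purely to make the
locking order concrete; a plain pthread_mutex_t is not guaranteed to be
fair, so read this as pseudocode for the FUTEX_LOCK/FUTEX_UNLOCK version
rather than something safe to ship:

#include <pthread.h>

static pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER;  /* guards the work-queue */
static pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER;  /* the 'completion'      */

static void *t1_worker(void *arg)               /* t1: consumes the work */
{
        (void)arg;
        pthread_mutex_lock(&B);
        for (;;) {
                pthread_mutex_lock(&A);         /* 1: wait for t2's work  */
                /* ... do the work ... */
                pthread_mutex_unlock(&B);       /* signal completion      */
                pthread_mutex_unlock(&A);
                pthread_mutex_lock(&B);         /* re-arm the completion  */
        }                                       /* GOTO 1                 */
}

static void *t2_submitter(void *arg)            /* t2: issues the work    */
{
        (void)arg;
        pthread_mutex_lock(&A);
        for (;;) {
                /* ... queue up work for t1 ... */
                pthread_mutex_unlock(&A);       /* hand the work to t1    */
                pthread_mutex_lock(&B);         /* wait for t1 to finish  */
                pthread_mutex_unlock(&B);
                pthread_mutex_lock(&A);         /* reclaim the queue      */
        }
}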

Then, if you teach it that a compound UNLOCK-A + LOCK-B, where the lock
owner of B is on the wait list of A, is a 'SWAP', that should get you
the desired semantics, I think.
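
Roughly, and with entirely made-up types and helper (nothing here
corresponds to real futex internals), the check being described is:

#include <stdbool.h>

struct task;                            /* made-up stand-ins, for the rule only */
struct lock_state {
        struct task *owner;             /* current holder of the lock           */
        /* ... plus a wait list of blocked tasks ... */
};

/* Hypothetical helper: is 'waiter' currently blocked on lock 'l'? */
bool task_waits_on(const struct task *waiter, const struct lock_state *l);

/* A compound UNLOCK(a) + LOCK(b) can be treated as a SWAP when the current
 * owner of b is already queued on a, i.e. unlocking a is what wakes it.   */
static bool compound_is_swap(const struct lock_state *a,
                             const struct lock_state *b)
{
        return b->owner && task_waits_on(b->owner, a);
}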

You can do SWAP and you get to have an exposed blocking relation.

Is this exactly what we want? I don't know, because I'm not entirely
sure what problem we're solving. Why would you be setting things up like
that in the first place? IIRC you're using this to implement coroutines
in golang, and I'm not sure I have a firm enough grasp of all that to
make a cogent suggestion one way or the other.

> > Also, why would we commit to an ABI without ever having seen the rest?
> 
> I'm not completely sure what you mean here. We do not envision any
> expansion/changes to the ABI proposed here,

Well, you do not, but how can we verify this without having a complete
picture? Also, IIRC, there's a bunch of scheduler patches that go on
top to implement the fast switch.

Also, how does this compare to some of the glorious hacks found in
GNU Pth? You can implement M:N threading using those as well.
