Message-ID: <6d8e3bb4-0cef-b991-9a16-1f03d10f131d@gmail.com>
Date: Sat, 5 Jun 2021 11:56:13 +0300
From: Andrey Semashev <andrey.semashev@...il.com>
To: Nicholas Piggin <npiggin@...il.com>,
André Almeida <andrealmeid@...labora.com>
Cc: acme@...nel.org, Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
corbet@....net, Davidlohr Bueso <dave@...olabs.net>,
Darren Hart <dvhart@...radead.org>, fweimer@...hat.com,
joel@...lfernandes.org, kernel@...labora.com,
krisman@...labora.com, libc-alpha@...rceware.org,
linux-api@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org, malteskarupke@...tmail.fm,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
pgriffais@...vesoftware.com, Peter Oskolkov <posk@...k.io>,
Steven Rostedt <rostedt@...dmis.org>, shuah@...nel.org,
Thomas Gleixner <tglx@...utronix.de>, z.figura12@...il.com
Subject: Re: [PATCH v4 00/15] Add futex2 syscalls

On 6/5/21 4:09 AM, Nicholas Piggin wrote:
> Excerpts from André Almeida's message of June 5, 2021 6:01 am:
>> At 08:36 on 04/06/21, Nicholas Piggin wrote:
>
>>> I'll be burned at the stake for suggesting it but it would be great if
>>> we could use file descriptors. At least for the shared futex, maybe
>>> private could use a per-process futex allocator. It solves all of the
>>> above, although I'm sure has many of its own problems. It may not play
>>> so nicely with the pthread mutex API because of the whole static
>>> initialiser problem, but the first futex proposal did use fds. But it's
>>> an example of an alternate API.
>>>
>>
>> FDs and futexes don't play well together, because for futex_wait() you
>> need to tell the kernel the expected value at the futex address to avoid
>> sleeping on a free lock. FD operations (poll, select) don't have this
>> `value` argument, so they could sleep forever, but I'm not sure if you
>> had taken this into consideration.
>
> I had. The futex wait API would additionally take an fd. The only
> difference is that the waitqueue used when a sleep or wake is
> required is derived from the fd, not from an address.
>
> I think the bigger sticking points would be if it's too heavyweight an
> object to use (which could be somewhat mitigated with a simpler ida
> allocator although that's difficult to do with shared), and whether libc
> could sanely use them due to the static initialiser problem of pthread
> mutexes.

The static initialization feature is not the only benefit of the current
futex design, and probably not the most important one. You can work
around static initialization in userspace, e.g. by initializing the fd
to an invalid value and creating a valid fd upon first use, although
that would still incur a performance penalty and add a new source of
failure.
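
To illustrate that workaround, here is a minimal sketch. It assumes a
hypothetical fd-backed lock object; `struct fd_lock`, `FD_LOCK_INITIALIZER`
and `fd_lock_get_fd()` are names invented for this example, and eventfd(2)
merely stands in for a futex fd, which does not exist in this form.

```c
#include <assert.h>
#include <stdatomic.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical fd-backed lock; eventfd is only a stand-in for a futex fd. */
struct fd_lock {
    atomic_int fd;                  /* -1 until the first use */
};

/* Static initializer: no kernel object exists yet. */
#define FD_LOCK_INITIALIZER { -1 }

/*
 * Return the fd backing the lock, creating it on first use.
 * Returns -1 (errno set by eventfd) if the fd cannot be created.
 */
static int fd_lock_get_fd(struct fd_lock *l)
{
    /* Extra load and branch on every operation: the performance penalty. */
    int fd = atomic_load_explicit(&l->fd, memory_order_acquire);
    if (fd >= 0)
        return fd;

    int fresh = eventfd(0, EFD_CLOEXEC);
    if (fresh < 0)
        return -1;                  /* new failure source: EMFILE, ENFILE, ... */

    int expected = -1;
    if (atomic_compare_exchange_strong_explicit(&l->fd, &expected, fresh,
                                                memory_order_acq_rel,
                                                memory_order_acquire))
        return fresh;               /* we published our fd */

    close(fresh);                   /* another thread won the race */
    return expected;                /* the CAS stored the winner's fd here */
}
```

Note that a lock operation can now fail for a reason that has nothing to
do with locking, and callers of a pthread_mutex_lock()-like API are
generally not prepared for that.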

What is more important is that waiting on an fd always requires a kernel
call. This will be terrible for the performance of uncontended locks,
which are the case the majority of the time.
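
For comparison, this is roughly what the address-based fast path looks
like. It is a sketch of the well-known three-state futex mutex, not the
glibc implementation; `fx_lock()`, `fx_unlock()` and the `futex_syscalls`
counter are names made up for the illustration. The counter exists only
to show that the uncontended path never enters the kernel.

```c
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Counts kernel entries; for illustration only. */
static atomic_ulong futex_syscalls;

static long futex(atomic_uint *addr, int op, uint32_t val)
{
    atomic_fetch_add(&futex_syscalls, 1);
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* Lock word: 0 = unlocked, 1 = locked, 2 = locked, waiters possible. */
static void fx_lock(atomic_uint *m)
{
    unsigned c = 0;
    if (atomic_compare_exchange_strong(m, &c, 1))
        return;                     /* uncontended: pure userspace */

    /* Contended: mark the lock as having waiters, then sleep. */
    if (c != 2)
        c = atomic_exchange(m, 2);
    while (c != 0) {
        /* Sleeps only if the word still holds 2. */
        futex(m, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(m, 2);
    }
}

static void fx_unlock(atomic_uint *m)
{
    /* Enter the kernel only if somebody may be sleeping. */
    if (atomic_exchange(m, 0) == 2)
        futex(m, FUTEX_WAKE_PRIVATE, 1);
}
```

A single thread locking and unlocking in a loop leaves `futex_syscalls`
at zero; the kernel is only involved once a second thread finds the word
non-zero.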

Another important point is that a futex that is not being waited on
consumes zero kernel resources, while an fd is a limited resource even
when not used. You can have millions of futexes in userspace and you are
guaranteed not to exhaust any limit as long as you have memory. That is
an important feature, and current userspace relies on it by assuming
that creating mutexes and condition variables is cheap.
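
A rough way to see the difference, as a sketch with helper names invented
for this example (`alloc_futex_words()`, `wake_one()`, `fd_soft_limit()`):
any aligned 32-bit word in memory can be used as a futex without
registering it anywhere, whereas the number of fds is capped by
RLIMIT_NOFILE.

```c
#include <assert.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Each zeroed 32-bit word is a usable futex; the cost is 4 bytes of memory. */
static uint32_t *alloc_futex_words(size_t n)
{
    return calloc(n, sizeof(uint32_t));
}

/* Wake on a word nobody waits on; returns the number of waiters woken. */
static long wake_one(uint32_t *word)
{
    return syscall(SYS_futex, word, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
}

/* Soft limit on open fds for this process, or 0 if it cannot be read. */
static rlim_t fd_soft_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return 0;
    return rl.rlim_cur;
}
```

Allocating a million such words costs about 4 MB and no kernel objects,
and `wake_one()` on any of them simply reports zero waiters. The value
returned by `fd_soft_limit()` depends on the system configuration.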

Having a futex fd would be useful in some cases, to be able to integrate
futexes with IO. I have had use cases in the past where I would have
liked to have FUTEX_FD. These cases arise when you already have a thread
that operates on fds and you want to avoid having a separate thread that
blocks on futexes in a similar fashion. But, IMO, that should be an
optional, opt-in feature. Far from every futex needs to have an fd.

For just waiting on multiple futexes, the native support that futex2
provides is superior.

PS: I'm not asking for FUTEX_FD to be implemented as part of the futex2
API. futex2 would be great even without it.