[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2055788870.20749.1595263590675.JavaMail.zimbra@efficios.com>
Date: Mon, 20 Jul 2020 12:46:30 -0400 (EDT)
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Nicholas Piggin <npiggin@...il.com>
Cc: Anton Blanchard <anton@...abs.org>, Arnd Bergmann <arnd@...db.de>,
linux-arch <linux-arch@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
Andy Lutomirski <luto@...capital.net>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, x86 <x86@...nel.org>,
Jens Axboe <axboe@...nel.dk>
Subject: Re: [RFC PATCH 4/7] x86: use exit_lazy_tlb rather than
membarrier_mm_sync_core_before_usermode
----- On Jul 19, 2020, at 11:03 PM, Nicholas Piggin npiggin@...il.com wrote:
> Excerpts from Mathieu Desnoyers's message of July 17, 2020 11:42 pm:
>> ----- On Jul 16, 2020, at 7:26 PM, Nicholas Piggin npiggin@...il.com wrote:
>> [...]
>>>
>>> membarrier does replace barrier instructions on remote CPUs, which do
>>> order accesses performed by the kernel on the user address space. So
>>> membarrier should too I guess.
>>>
>>> Normal process context accesses like read(2) will do so because they
>>> don't get filtered out from IPIs, but kernel threads using the mm may
>>> not.
>>
>> But it should not be an issue, because membarrier's ordering is only with
>> respect
>> to submit and completion of io_uring requests, which are performed through
>> system calls from the context of user-space threads, which are called from the
>> right mm.
>
> Is that true? Can io completions be written into an address space via a
> kernel thread? I don't know the io_uring code well but it looks like
> that's asynchonously using the user mm context.
Indeed, the io completion appears to be signaled asynchronously between kernel
and user-space. Therefore, both kernel and userspace code need to have proper
memory barriers in place to signal completion, otherwise user-space could read
garbage after it notices completion of a read.
I did not review the entire io_uring implementation, but the publish side
for completion appears to be:
static void __io_commit_cqring(struct io_ring_ctx *ctx)
{
struct io_rings *rings = ctx->rings;
/* order cqe stores with ring update */
smp_store_release(&rings->cq.tail, ctx->cached_cq_tail);
if (wq_has_sleeper(&ctx->cq_wait)) {
wake_up_interruptible(&ctx->cq_wait);
kill_fasync(&ctx->cq_fasync, SIGIO, POLL_IN);
}
}
The store-release on tail should be paired with a load_acquire on the
reader-side (it's called "read_barrier()" in the code):
tools/io_uring/queue.c:
static int __io_uring_get_cqe(struct io_uring *ring,
struct io_uring_cqe **cqe_ptr, int wait)
{
struct io_uring_cq *cq = &ring->cq;
const unsigned mask = *cq->kring_mask;
unsigned head;
int ret;
*cqe_ptr = NULL;
head = *cq->khead;
do {
/*
* It's necessary to use a read_barrier() before reading
* the CQ tail, since the kernel updates it locklessly. The
* kernel has the matching store barrier for the update. The
* kernel also ensures that previous stores to CQEs are ordered
* with the tail update.
*/
read_barrier();
if (head != *cq->ktail) {
*cqe_ptr = &cq->cqes[head & mask];
break;
}
if (!wait)
break;
ret = io_uring_enter(ring->ring_fd, 0, 1,
IORING_ENTER_GETEVENTS, NULL);
if (ret < 0)
return -errno;
} while (1);
return 0;
}
So as far as membarrier memory ordering dependencies are concerned, it relies
on the store-release/load-acquire dependency chain in the completion queue to
order against anything that was done prior to the completed requests.
What is in-flight while the requests are being serviced provides no memory
ordering guarantee whatsoever.
> How about other memory accesses via kthread_use_mm? Presumably there is
> still ordering requirement there for membarrier,
Please provide an example case with memory accesses via kthread_use_mm where
ordering matters to support your concern.
> so I really think
> it's a fragile interface with no real way for the user to know how
> kernel threads may use its mm for any particular reason, so membarrier
> should synchronize all possible kernel users as well.
I strongly doubt so, but perhaps something should be clarified in the documentation
if you have that feeling.
Thanks,
Mathieu
>
> Thanks,
> Nick
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Powered by blists - more mailing lists