Message-ID: <CAG48ez0c=ExHdoxQWqDN9hFAhwUKab8vgk-nJ-JGqTUm4xVUsw@mail.gmail.com>
Date: Tue, 24 Sep 2024 20:00:32 +0200
From: Jann Horn <jannh@...gle.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Suren Baghdasaryan <surenb@...gle.com>, linux-trace-kernel@...r.kernel.org,
peterz@...radead.org, oleg@...hat.com, rostedt@...dmis.org,
mhiramat@...nel.org, bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
jolsa@...nel.org, paulmck@...nel.org, akpm@...ux-foundation.org,
linux-mm@...ck.org, mjguzik@...il.com, brauner@...nel.org, andrii@...nel.org
Subject: Re: [PATCH v2 1/1] mm: introduce mmap_lock_speculation_{start|end}
On Tue, Sep 24, 2024 at 7:15 PM Matthew Wilcox <willy@...radead.org> wrote:
> On Fri, Sep 13, 2024 at 12:52:39AM +0200, Jann Horn wrote:
> > FWIW, I would still feel happier if this was a 64-bit number, though I
> > guess at least with uprobes the attack surface is not that large even
> > if you can wrap that counter... 2^31 counter increments are not all
> > that much, especially if someone introduces a kernel path in the
> > future that lets you repeatedly take the mmap_lock for writing within
> > a single syscall without doing much work, or maybe on some machine
> > where syscalls are really fast. I really don't like hinging memory
> > safety on how fast or slow some piece of code can run, unless we can
> > make strong arguments about it based on how many memory writes a CPU
> > core is capable of doing per second or stuff like that.
>
> You could repeatedly call munmap(1, 0) which will take the
> mmap_write_lock, do no work and call mmap_write_unlock(). We could
> fix that by moving the start/len validation outside the
> mmap_write_lock(), but it won't increase the path length by much.
> How many syscalls can we do per second?
> https://blogs.oracle.com/linux/post/syscall-latency suggests 217ns per
> syscall, so we'll be close to 4.6m syscalls/second or 466 seconds (7
> minutes, 46 seconds).
Yeah, that seems like a pretty reasonable guess.
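For anyone who wants to redo the arithmetic quoted above, here is a rough sketch in C. It assumes exactly one counter increment per syscall and takes the 217ns figure from the linked blog post at face value; the helper names (wrap_seconds, syscalls_per_second, print_estimate) are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: syscalls per second at a given per-syscall latency. */
static double syscalls_per_second(double ns_per_syscall)
{
    assert(ns_per_syscall > 0.0);
    return 1e9 / ns_per_syscall;
}

/*
 * Hypothetical helper: seconds until 2^bits increments have happened,
 * assuming each increment costs ns_per_increment nanoseconds
 * (i.e. one increment per syscall, as in the estimate quoted above).
 */
static double wrap_seconds(unsigned int bits, double ns_per_increment)
{
    double increments = 1.0;

    assert(ns_per_increment > 0.0);
    for (unsigned int i = 0; i < bits; i++)
        increments *= 2.0;      /* powers of two are exact in a double */

    return increments * ns_per_increment / 1e9;
}

/* Print the estimate for one counter width. */
static void print_estimate(unsigned int bits, double ns_per_syscall)
{
    printf("%u bits at %.0f ns/syscall: %.0f syscalls/s, wraps after %.0f s\n",
           bits, ns_per_syscall,
           syscalls_per_second(ns_per_syscall),
           wrap_seconds(bits, ns_per_syscall));
}
```

Calling print_estimate(31, 217.0) reproduces the roughly 466 second figure; passing 63 or 64 for bits shows how far out a wrap would be with a 64-bit counter under the same assumptions.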
One method that may or may not be faster would be to use an io_uring
worker to dispatch a bunch of IORING_OP_MADVISE operations. That
would save on syscall entry overhead, but in exchange you'd have to
worry about feeding a constant stream of work into the worker thread
in a cache-efficient way, maybe by having one CPU constantly switch
back and forth between a userspace thread and a uring worker or
something like that.