Message-ID: <CALs-Hsu58iOrxKKKu-rQBszz3F--657G-zipBu5zZCxzPWRPWw@mail.gmail.com>
Date: Tue, 28 Mar 2023 15:53:01 -0700
From: Evan Green <evan@...osinc.com>
To: Heiko Stübner <heiko@...ech.de>
Cc: Palmer Dabbelt <palmer@...osinc.com>, slewis@...osinc.com,
vineetg@...osinc.com, Conor Dooley <conor@...nel.org>,
Albert Ou <aou@...s.berkeley.edu>,
Andrew Bresticker <abrestic@...osinc.com>,
Andrew Jones <ajones@...tanamicro.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Anup Patel <apatel@...tanamicro.com>,
Arnd Bergmann <arnd@...db.de>,
Atish Patra <atishp@...osinc.com>,
Bagas Sanjaya <bagasdotme@...il.com>,
Catalin Marinas <catalin.marinas@....com>,
Celeste Liu <coelacanthus@...look.com>,
Conor Dooley <conor.dooley@...rochip.com>,
Dao Lu <daolu@...osinc.com>, Guo Ren <guoren@...nel.org>,
Jann Horn <jannh@...gle.com>,
Jisheng Zhang <jszhang@...nel.org>,
Jonathan Corbet <corbet@....net>,
Ley Foon Tan <leyfoon.tan@...rfivetech.com>,
Mark Brown <broonie@...nel.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
Nathan Chancellor <nathan@...nel.org>,
Palmer Dabbelt <palmer@...belt.com>,
Paul Walmsley <paul.walmsley@...ive.com>,
Peter Xu <peterx@...hat.com>,
Philipp Tomsich <philipp.tomsich@...ll.eu>,
Randy Dunlap <rdunlap@...radead.org>,
Samuel Holland <samuel@...lland.org>,
Shuah Khan <shuah@...nel.org>,
Sunil V L <sunilvl@...tanamicro.com>,
Tobias Klauser <tklauser@...tanz.ch>,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org, linux-riscv@...ts.infradead.org
Subject: Re: [PATCH v5 0/6] RISC-V Hardware Probing User Interface

On Tue, Mar 28, 2023 at 1:35 PM Heiko Stübner <heiko@...ech.de> wrote:
>
> On Monday, 27 March 2023, 18:31:57 CEST, Evan Green wrote:
> >
> > There's been a bunch of off-list discussions about this, including at
> > Plumbers. The original plan was to do something involving providing an
> > ISA string to userspace, but ISA strings just aren't sufficient for a
> > stable ABI any more: in order to parse an ISA string users need the
> > version of the specifications that the string is written to, the version
> > of each extension (sometimes at a finer granularity than the RISC-V
> > releases/versions encode), and the expected use case for the ISA string
> > (i.e., is it a U-mode or M-mode string). That's a lot of complexity to
> > try and keep ABI compatible and it's probably going to continue to grow,
> > as even if there's no more complexity in the specifications we'll have
> > to deal with the various ISA string parsing oddities that end up all
> > over userspace.
> >
> > Instead this patch set takes a very different approach and provides a set
> > of key/value pairs that encode various bits about the system. The big
> > advantage here is that we can clearly define what these mean so we can
> > ensure ABI stability, but it also allows us to encode information that's
> > unlikely to ever appear in an ISA string (see the misaligned access
> > performance, for example). The resulting interface looks a lot like
> > what arm64 and x86 do, and will hopefully fit well into something like
> > ACPI in the future.
> >
> > The actual user interface is a syscall, with a vDSO function in front of
> > it. The vDSO function can answer some queries without a syscall at all,
> > and falls back to the syscall for cases it doesn't have answers to.
> > Currently we prepopulate it with an array of answers for all keys and
> > a CPU set of "all CPUs". This can be adjusted as necessary to provide
> > fast answers to the most common queries.
> >
> > An example series in glibc exposing this syscall and using it in an
> > ifunc selector for memcpy can be found at [1]. I'm about to send a v2
> > of that series out that incorporates the vDSO function.
> >
> > I was asked about the performance delta between this and something like
> > sysfs. I created a small test program [2] and ran it on a Nezha D1
> > Allwinner board. Doing each operation 100000 times and dividing, these
> > operations take the following amount of time:
> > - open()+read()+close() of /sys/kernel/cpu_byteorder: 3.8us
> > - access("/sys/kernel/cpu_byteorder", R_OK): 1.3us
> > - riscv_hwprobe() vDSO and syscall: 0.0094us
> > - riscv_hwprobe() vDSO with no syscall: 0.0091us
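
The "repeat and divide" method behind those numbers can be sketched
roughly as below. This is not the test program referenced at [2]; the
function name avg_open_read_close_us() and its error handling are
assumptions made for illustration, and only the open()+read()+close()
case is shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Average cost, in microseconds, of one open()+read()+close() sequence
 * on `path`, measured by repeating it `iters` times and dividing the
 * elapsed time. Returns a negative value if any call fails.
 * (Hypothetical helper, not taken from the series.)
 */
double avg_open_read_close_us(const char *path, long iters)
{
    struct timespec start, end;
    char buf[64];

    if (iters <= 0 || clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;

    for (long i = 0; i < iters; i++) {
        int fd = open(path, O_RDONLY);

        if (fd < 0)
            return -1.0;
        if (read(fd, buf, sizeof(buf)) < 0) {
            close(fd);
            return -1.0;
        }
        close(fd);
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    /* Total elapsed nanoseconds, then per-iteration microseconds. */
    double ns = (double)(end.tv_sec - start.tv_sec) * 1e9
              + (double)(end.tv_nsec - start.tv_nsec);
    return ns / (double)iters / 1000.0;
}
```

Calling avg_open_read_close_us("/sys/kernel/cpu_byteorder", 100000)
would correspond to the first row of the list above; the result depends
entirely on the machine it runs on.
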
>
> Looks like this series spawned a thread on one of the riscv-lists [0].
>
> As auxvals were mentioned in that thread, I was wondering what's the
> difference between doing a new syscall vs. putting the keys + values as
> architecture auxvec elements [1] ?

The auxvec approach would also work. The primary difference is that
auxvec bits are actively copied into every new process, forever. If
you predict a slow pace of new bits coming in, the auxvec approach
probably makes more sense. This series was born out of a prediction
that this set of "stuff" was going to be larger than on traditional
architectures like x86 and ARM, fiddly (i.e. bits possibly representing
specific versions of various extensions), evolving regularly over time,
and heterogeneous between cores. With that sort of rubber band ball in
mind, a key/value interface seemed to make more sense.
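
To make the key/value shape concrete, here is a toy user-space model of
such a query. It is only a sketch: mock_hwprobe() stands in for the
kernel, the key names and values are made up, and the convention of
flagging an unknown key by setting it to -1 is treated as an assumption
here rather than a statement about the final interface.

```c
#include <stddef.h>
#include <stdint.h>

/* One query slot: the caller fills in key, the provider fills in value. */
struct hw_pair {
    int64_t key;
    uint64_t value;
};

/* Hypothetical keys for this sketch only; not the kernel's numbering. */
enum {
    KEY_VENDOR_ID = 0,
    KEY_MISALIGNED_PERF = 1,
    KEY_FROM_THE_FUTURE = 1000, /* a key this mock has never heard of */
};

/*
 * Mock provider standing in for the kernel. Known keys get a value;
 * unknown keys are flagged with key = -1 and value = 0, so an older
 * provider can still answer a newer caller (assumed convention).
 */
int mock_hwprobe(struct hw_pair *pairs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        switch (pairs[i].key) {
        case KEY_VENDOR_ID:
            pairs[i].value = 0x1234; /* made-up vendor ID */
            break;
        case KEY_MISALIGNED_PERF:
            pairs[i].value = 3; /* made-up "fast" encoding */
            break;
        default:
            pairs[i].key = -1;
            pairs[i].value = 0;
            break;
        }
    }
    return 0;
}

/* Returns 1 and stores the value if the key is known, 0 otherwise. */
int hw_query(int64_t key, uint64_t *value)
{
    struct hw_pair pair = { .key = key, .value = 0 };

    if (mock_hwprobe(&pair, 1) != 0 || pair.key < 0)
        return 0;
    *value = pair.value;
    return 1;
}
```

In this model a process only pays for the keys it asks about, and a new
key costs nothing for processes that never query it. An auxvec entry,
by contrast, is already sitting in the process image at exec time and
is read with something like getauxval(AT_HWCAP).
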
-Evan