Message-ID: <2e427b102d8fd899a9a3db2ec17a628beb24bc01.camel@infradead.org>
Date: Fri, 26 Jul 2024 09:35:51 +0100
From: David Woodhouse <dwmw2@...radead.org>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Richard Cochran <richardcochran@...il.com>, Peter Hilber
 <peter.hilber@...nsynergy.com>, linux-kernel@...r.kernel.org, 
 virtualization@...ts.linux.dev, linux-arm-kernel@...ts.infradead.org, 
 linux-rtc@...r.kernel.org, "Ridoux, Julien" <ridouxj@...zon.com>, 
 virtio-dev@...ts.linux.dev, "Luu, Ryan" <rluu@...zon.com>, "Chashper,
 David" <chashper@...zon.com>, "Mohamed Abuelfotoh, Hazem"
 <abuehaze@...zon.com>,  "Christopher S . Hall"
 <christopher.s.hall@...el.com>, Jason Wang <jasowang@...hat.com>, John
 Stultz <jstultz@...gle.com>,  netdev@...r.kernel.org, Stephen Boyd
 <sboyd@...nel.org>, Thomas Gleixner <tglx@...utronix.de>, Xuan Zhuo
 <xuanzhuo@...ux.alibaba.com>, Marc Zyngier <maz@...nel.org>, Mark Rutland
 <mark.rutland@....com>, Daniel Lezcano <daniel.lezcano@...aro.org>,
 Alessandro Zummo <a.zummo@...ertech.it>,  Alexandre Belloni
 <alexandre.belloni@...tlin.com>, qemu-devel <qemu-devel@...gnu.org>, Simon
 Horman <horms@...nel.org>
Subject: Re: [PATCH] ptp: Add vDSO-style vmclock support

On Fri, 2024-07-26 at 02:06 -0400, Michael S. Tsirkin wrote:
> On Thu, Jul 25, 2024 at 11:20:56PM +0100, David Woodhouse wrote:
> > We're rolling out the AMZNVCLK device for internal use cases, and plan
> > to add it in public instances some time later.
> 
> Let's be real. If Amazon does something in its own hypervisor, and the
> only way to use that is to expose the interface to userspace, there is
> very little the linux community can do.  Moreover, userspace will be
> written to this ABI, and be locked in to the specific hypervisor. It
> might be a win for Amazon short term but long term you will want to
> extend things and it will be a mess.
> 
> So I feel you have chosen ACPI badly.  It just does not have the APIs
> that you need. Virtio does, and would not create a userspace lock-in
> to a specific hypervisor. It's not really virtio specific either,
> you can write a bare pci device with a BAR and a bunch of msix
> vectors and it will get you the same effect.

I *am* as bad as the next person for taking the "I have a hammer,
therefore everything is a nail" approach. For you that hammer is
virtio, and I respect that. But mine isn't ACPI — quite the opposite,
it's DT.

I *hate* ACPI. I hate everything about it. I hate that Arm started
using it for Arm64 instead of going with Device Tree.

That's why we have the DSM method for obtaining properties, and the
PRP0001 ACPI HID which means "look for the compatible property and
treat it like a DT node". So people can make DT bindings and hey, if
you're on a system which is afflicted with ACPI, you can still use
them. Which I'm still proselytising today, as you saw.

But for this use case, we only need a memory region that the hypervisor
can update. We don't need any of that complexity of gratuitously
interrupting all the vCPUs just to ensure that none of them can be
running userspace while one of them does an update for itself,
potentially translating from one ABI to another. The hypervisor can
just update the user-visible memory in place.

In this case, exposing a simple MMIO memory region in _CRS of an ACPI
device was the simplest and most compatible solution. 

Yes, we can add a virtio transport for that where the hypervisor is
invited to DMA into (unencrypted) guest memory, and it solves the
PAGE_SIZE problem of the trivial ACPI method. But there's still a place
in this world for the ACPI method, and it doesn't *hurt* virtio.

The important part is the vmclock_abi structure; the transport is just
fluff. And I do not agree that this is a lock-in to a specific
hypervisor. I've literally rewritten the fields in the structure to
align to what virtio-rtc does and accommodate Peter's feedback (to the
dismay of my internal team who just wanted to stick with the initial
straw man struct and didn't want to keep up, and haven't even engaged
with the public threads which have been ongoing since March¹, even when
I've beaten them with a big stick). I've added a QEMU implementation
too. We absolutely *don't* want this to be hypervisor-specific.


¹ https://lore.kernel.org/all/0e21e3e2be26acd70b5575b9932b3a911c9fe721.camel@infradead.org

