Message-ID: <20220806081732.a553jsoe2sfwghjg@sgarzare-redhat>
Date:   Sat, 6 Aug 2022 10:17:32 +0200
From:   Stefano Garzarella <sgarzare@...hat.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>, mst@...hat.com,
        jasowang@...hat.com
Cc:     Will Deacon <will@...nel.org>, stefanha@...hat.com,
        ascull@...gle.com, maz@...nel.org, keirf@...gle.com,
        jiyong@...gle.com, kernel-team@...roid.com,
        linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org, kvm@...r.kernel.org
Subject: Re: IOTLB support for vhost/vsock breaks crosvm on Android

Hi Linus,

On Fri, Aug 05, 2022 at 03:57:08PM -0700, Linus Torvalds wrote:
>On Fri, Aug 5, 2022 at 11:11 AM Will Deacon <will@...nel.org> wrote:
>>
>> [tl;dr a change from ~18 months ago breaks Android userspace and I don't
>>  know what to do about it]
>
>Augh.
>
>I had hoped that android being "closer" to upstream would have meant
>that somebody actually tests android with upstream kernels. People
>occasionally talk about it, but apparently it's not actually done.
>
>Or maybe it's done only with a very limited android user space.
>
>The whole "we notice that something that happened 18 months ago broke
>our environment" is kind of broken.
>
>> After some digging, we narrowed this change in behaviour down to
>> e13a6915a03f ("vhost/vsock: add IOTLB API support") and further digging
>> reveals that the infamous VIRTIO_F_ACCESS_PLATFORM feature flag is to
>> blame. Indeed, our tests once again pass if we revert that patch (there's
>> a trivial conflict with the later addition of VIRTIO_VSOCK_F_SEQPACKET
>> but otherwise it reverts cleanly).
>
>I have to say, this smells for *so* many reasons.
>
>Why is "IOMMU support" called "VIRTIO_F_ACCESS_PLATFORM"?
>
>That seems insane, but seems fundamental in that commit e13a6915a03f
>("vhost/vsock: add IOTLB API support")
>
>This code
>
>        if ((features & (1ULL << VIRTIO_F_ACCESS_PLATFORM))) {
>                if (vhost_init_device_iotlb(&vsock->dev, true))
>                        goto err;
>        }
>
>just makes me go "What?"  It makes no sense. Why isn't that feature
>called something-something-IOTLB?

I honestly don't know the reason for the name but 
VIRTIO_F_ACCESS_PLATFORM comes from the virtio specification:
   https://docs.oasis-open.org/virtio/virtio/v1.2/cs01/virtio-v1.2-cs01.html#x1-6600006

   VIRTIO_F_ACCESS_PLATFORM(33)
      This feature indicates that the device can be used on a platform
      where device access to data in memory is limited and/or translated.
      E.g. this is the case if the device can be located behind an IOMMU
      that translates bus addresses from the device into physical
      addresses in memory, if the device can be limited to only access
      certain memory addresses or if special commands such as a cache
      flush can be needed to synchronise data in memory with the device.
      Whether accesses are actually limited or translated is described by
      platform-specific means. If this feature bit is set to 0, then the
      device has same access to memory addresses supplied to it as the
      driver has. In particular, the device will always use physical
      addresses matching addresses used by the driver (typically meaning
      physical addresses used by the CPU) and not translated further, and
      can access any address supplied to it by the driver. When clear,
      this overrides any platform-specific description of whether device
      access is limited or translated in any way, e.g. whether an IOMMU
      may be present.

>
>Can we please just split that flag into two, and have that odd
>"platform access" be one bit, and the "enable iommu" be an entirely
>different bit?

IIUC the problem here is that the VMM does the translation, so the
device itself has no need to translate. This feature should therefore
not be negotiated between crosvm and vhost-vsock, but only between the
guest's driver and crosvm.

Perhaps the confusion is that we use VIRTIO_F_ACCESS_PLATFORM both 
between guest and VMM and between VMM and vhost device.

In fact, prior to commit e13a6915a03f ("vhost/vsock: add IOTLB API 
support"), vhost-vsock did not work when a VMM (e.g., QEMU) tried to 
negotiate translation with the device: 
https://bugzilla.redhat.com/show_bug.cgi?id=1894101

The simplest solution is for crosvm not to negotiate
VIRTIO_F_ACCESS_PLATFORM with the vhost-vsock device if it doesn't want
to use translation and send the messages needed to set it up.
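
To make that concrete, here is a minimal sketch of the masking on the
VMM side. It is not crosvm or QEMU code: the helper name, its
parameters and the local ACCESS_PLATFORM_BIT define are hypothetical,
and the bit number (33) is simply taken from the spec text quoted
above. It assumes the VMM knows whether it intends to drive the vhost
IOTLB itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit number of VIRTIO_F_ACCESS_PLATFORM, per the virtio spec quoted above. */
#define ACCESS_PLATFORM_BIT 33

/*
 * Hypothetical VMM helper: compute the feature set to acknowledge to the
 * vhost-vsock device. Only bits offered by the device and accepted by the
 * guest are kept. If the VMM does not plan to send IOTLB messages, the
 * ACCESS_PLATFORM bit is dropped so the device does not expect translation.
 */
static uint64_t vhost_features_to_ack(uint64_t device_features,
                                      uint64_t guest_features,
                                      bool vmm_uses_iotlb)
{
	uint64_t features = device_features & guest_features;

	if (!vmm_uses_iotlb)
		features &= ~(1ULL << ACCESS_PLATFORM_BIT);

	return features;
}
```

The result would then be what the VMM passes to the VHOST_SET_FEATURES
ioctl, while the guest's driver can still negotiate
VIRTIO_F_ACCESS_PLATFORM with the VMM separately.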

In fact, before commit e13a6915a03f ("vhost/vsock: add IOTLB API
support") this feature was not exposed by the vhost-vsock device, so it
was never negotiated. Now crosvm is enabling a new feature (not masking
guest-negotiated features), so I don't think it's a break in user space
if user space enables it.

I tried to explain what I understood when I made the change; Michael and
Jason can surely add more information.

Thanks,
Stefano
