netdev - Re: [PATCH V4 mlx5-next 13/13] vfio/mlx5: Use its own PCI reset

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20211027155339.GE2744544@nvidia.com>
Date:   Wed, 27 Oct 2021 12:53:39 -0300
From:   Jason Gunthorpe <jgg@...dia.com>
To:     Alex Williamson <alex.williamson@...hat.com>
Cc:     Yishai Hadas <yishaih@...dia.com>, bhelgaas@...gle.com,
        saeedm@...dia.com, linux-pci@...r.kernel.org, kvm@...r.kernel.org,
        netdev@...r.kernel.org, kuba@...nel.org, leonro@...dia.com,
        kwankhede@...dia.com, mgurtovoy@...dia.com, maorg@...dia.com
Subject: Re: [PATCH V4 mlx5-next 13/13] vfio/mlx5: Use its own PCI reset_done
 error handler

On Wed, Oct 27, 2021 at 09:29:43AM -0600, Alex Williamson wrote:

> The reset_done handler sets deferred_reset = true and if it's possible
> to get the state_mutex, will reset migration data and device_state as
> part of releasing that mutex.  If there's contention on state_mutex,
> the deferred_reset field flags that this migration state is still stale.
> 
> So, I assume that it's possible that a user resets the device via ioctl
> or config space, there was contention and the migration state is still
> stale, right?

If this occurs it is a userspace bug and the goal here is to maintain
kernel integrity.

> The user then goes to read device_state, but the staleness of the
> migration state is not resolved until *after* the stale device state is
> copied to the user buffer.

This is not preventable in the general case. Assume we have sane
locking and it looks like this:

   CPU0                            CPU1
  ioctl state change
    mutex_lock
    copy_to_user(state == !RUNNING)
    mutex_unlock
                               ioctl reset
                                 mutex_lock
                                 state = RUNNING
                                 mutex_unlock
                               return to userspace
  return to userspace
  Userspace sees state != RUNNING

Same issue. Userspace cannot race state manipulating ioctls and expect
things to make any sense.

In all cases contention on the mutex during reset causes the reset to
order after the mutex is released. This is true with this approach and
it is true with a simple direct use of mutex.

In either case userspace will see incoherent results, and it is
userspace error to try and run the kernel ioctls this way.

> What did the user do wrong to see stale data?  Thanks,

Userspace allowed two state effecting IOCTLs to run concurrently.

Userspace must block reset while it is manipulating migration states.

Jason