Date:   Fri, 13 Aug 2021 11:34:51 -0700
From:   Russ Weight <russell.h.weight@...el.com>
To:     Alex Williamson <alex.williamson@...hat.com>
CC:     Cornelia Huck <cohuck@...hat.com>,
        "Adler, Michael" <michael.adler@...el.com>,
        "Whisonant, Tim" <tim.whisonant@...el.com>, <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Tom Rix <trix@...hat.com>
Subject: BUG REPORT: vfio_pci driver

Bug Description:

A bug in the vfio_pci driver was reported in conjunction with work on FPGA
cards. We were able to reproduce and root-cause the bug using SystemTap.
The original bug description is below. An understanding of the referenced
dfl and opae tools is not required - it is the sequence of IOCTL calls and
IRQ vectors that matters:

> I’m trying to get an example AFU working that uses 2 IRQs, active at the same 
> time. I’m hitting what looks to be a dfl_pci driver bug.
>
> The code tries to allocate two IRQ vectors: 0 and 1. I see opaevfio.c doing the 
> right thing, picking the MSIX index. Allocating either IRQ 0 or IRQ 1 works fine 
> and I confirm that the VFIO_DEVICE_SET_IRQS looks reasonable, choosing MSIX and 
> either start of 0 or 1 and count 1.
>
> Note that opaevfio.c always passes count 1, so it will make separate calls for 
> each IRQ vector.
>
> When I try to allocate both, I see the following:
>
>   * If the VFIO_DEVICE_SET_IRQS ioctl is called first with start 0 and then
>     start 1 (always count 1), the start 1 (second) ioctl trap returns EINVAL.
>   * If I set up the vectors in decreasing order, so start 1 followed by start 0,
>     the program works!
>   * I ruled out OPAE SDK user space problems by setting up my program to
>     allocate in increasing order, which would normally fail. I changed only the
>     ioctl call in user space opaevfio.c, inverting bit 0 of start so that the
>     driver is called in decreasing index order. Of course this binds the wrong
>     vectors to the fds, but I don’t care about that for now. This works! From
>     this, I conclude that it can’t be a user space problem since the difference
>     between working and failing is solely the order in which IRQ vectors are
>     bound in ioctl calls.
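
For readers unfamiliar with the ioctl in question, the per-vector call
described above can be sketched roughly as follows. This is a minimal
illustration using the public VFIO uapi from <linux/vfio.h>, not the actual
opaevfio.c code; the helper name bind_one_msix_vector() and its parameters
are hypothetical.

```c
#include <linux/vfio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: bind one eventfd to one MSI-X vector.
 * count is always 1, as in the report, so each vector needs its own ioctl.
 * Returns the ioctl() result (0 on success, -1 with errno set on failure).
 */
static int bind_one_msix_vector(int device_fd, uint32_t start, int32_t event_fd)
{
    /* The eventfd is passed as a single int32_t following the header. */
    size_t argsz = sizeof(struct vfio_irq_set) + sizeof(int32_t);
    struct vfio_irq_set *irq_set = calloc(1, argsz);
    int ret;

    if (!irq_set)
        return -1;

    irq_set->argsz = (uint32_t)argsz;
    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
    irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
    irq_set->start = start;
    irq_set->count = 1;
    memcpy(irq_set->data, &event_fd, sizeof(event_fd));

    ret = ioctl(device_fd, VFIO_DEVICE_SET_IRQS, irq_set);
    free(irq_set);
    return ret;
}
```

With such a helper, the failing order is start 0 followed by start 1, and
the working order is start 1 followed by start 0.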

The EINVAL is coming from vfio_msi_set_block() here:
https://github.com/torvalds/linux/blob/master/drivers/vfio/pci/vfio_pci_intrs.c#L373

On the second IRQ request, vfio_msi_set_block() is called from
vfio_pci_set_msi_trigger() here:
https://github.com/torvalds/linux/blob/master/drivers/vfio/pci/vfio_pci_intrs.c#L530

We believe the bug is in vfio_pci_set_msi_trigger(), in the 2nd parameter to the call
to vfio_msi_enable() here:
https://github.com/torvalds/linux/blob/master/drivers/vfio/pci/vfio_pci_intrs.c#L533

In both the passing and failing cases, the first IRQ request results in a call
to vfio_msi_enable() at line 533 and the second IRQ request results in the
call to vfio_msi_set_block() at line 530. It is during the first IRQ request
that vfio_msi_enable() sets vdev->num_ctx based on the 2nd parameter (nvec).
vdev->num_ctx is part of the conditional that results in the EINVAL for the
failing case.

In the passing case, vdev->num_ctx is 2. In the failing case, it is 1.
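
The order dependence can be reproduced with a small user-space model. This
is only a sketch, not the driver code: the model_* names are hypothetical,
and it assumes that the nvec argument is start + count and that the range
check rejects any request reaching past num_ctx, which is consistent with
the num_ctx values observed above.

```c
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-in for the driver state; only the relevant fields. */
struct model_vdev {
    bool msix_enabled;  /* models "MSI-X is already the active IRQ type" */
    unsigned num_ctx;   /* set once, by the first request */
};

/* Models vfio_msi_enable(): records nvec as the number of contexts. */
static int model_msi_enable(struct model_vdev *vdev, unsigned nvec)
{
    vdev->num_ctx = nvec;
    vdev->msix_enabled = true;
    return 0;
}

/* Models the range check assumed to sit in vfio_msi_set_block(). */
static int model_msi_set_block(const struct model_vdev *vdev,
                               unsigned start, unsigned count)
{
    if (start >= vdev->num_ctx || start + count > vdev->num_ctx)
        return -EINVAL;
    return 0;
}

/* Models the eventfd path of vfio_pci_set_msi_trigger(). */
static int model_set_msi_trigger(struct model_vdev *vdev,
                                 unsigned start, unsigned count)
{
    int ret;

    /* Second and later requests go straight to the range check. */
    if (vdev->msix_enabled)
        return model_msi_set_block(vdev, start, count);

    /* First request: num_ctx is sized from this request alone (assumed). */
    ret = model_msi_enable(vdev, start + count);
    if (ret)
        return ret;

    return model_msi_set_block(vdev, start, count);
}
```

In this model, requesting start 0 and then start 1 leaves num_ctx at 1, so
the second request fails with -EINVAL. Requesting start 1 first sets num_ctx
to 2, and the later request for start 0 passes the check.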

I am attaching two text files containing trace information from SystemTap: one for
the failing case and one for the passing case. They contain a lot more information
than is needed, but if you search for vfio_pci_set_msi_trigger and vfio_msi_set_block,
you will see values for some of the call parameters.

- Russ


View attachment "vfio_pci_pass.txt" of type "text/plain" (24082 bytes)

View attachment "vfio_pci_fail.txt" of type "text/plain" (21203 bytes)
