Message-ID: <063D6719AE5E284EB5DD2968C1650D6D1CBEFE3E@AcuExch.aculab.com>
Date: Fri, 18 Dec 2015 10:15:33 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Alex Williamson' <alex.williamson@...hat.com>,
Yongji Xie <xyjxie@...ux.vnet.ibm.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>
CC: "nikunj@...ux.vnet.ibm.com" <nikunj@...ux.vnet.ibm.com>,
"zhong@...ux.vnet.ibm.com" <zhong@...ux.vnet.ibm.com>,
"aik@...abs.ru" <aik@...abs.ru>,
"paulus@...ba.org" <paulus@...ba.org>,
"warrier@...ux.vnet.ibm.com" <warrier@...ux.vnet.ibm.com>
Subject: RE: [RFC PATCH 3/3] vfio-pci: Allow to mmap MSI-X table if EEH is supported
From: Alex Williamson
> Sent: 17 December 2015 21:07
...
> > Is this all related to the statements in the PCI(e) spec that the
> > MSI-X table and Pending bit array should be in their own BARs?
> > (ISTR it even suggests a BAR each.)
> >
> > Since the MSI-X table exists in device memory/registers there is
> > nothing to stop the device modifying the table contents (or even
> > ignoring the contents and writing address+data pairs that are known
> > to reference the CPU's MSI-X interrupt generation logic).
> >
> > We've an fpga based PCIe slave that has some additional PCIe slaves
> > (associated with the interrupt generation logic) that are currently
> > next to the PBA (which is 8k from the MSI-X table).
> > If we can't map the PBA we can't actually raise any interrupts.
> > The same would be true if page size is 64k and mapping the MSI-X
> > table banned.
> >
> > Do we need to change our PCIe slave address map so we don't need
> > to access anything in the same page (which might be 64k were we to
> > target large ppc - which we don't at the moment) as both the
> > MSI-X table and the PBA?
> >
> > I'd also note that being able to read the MSI-X table is a useful
> > diagnostic that the relevant interrupts are enabled properly.
>
> Yes, the spec requirement is that MSI-X structures must reside in a 4k
> aligned area that doesn't overlap with other configuration registers
> for the device. It's only an advisement to put them into their own
> BAR, and 4k clearly wasn't as forward looking as we'd hope. Vfio
> doesn't particularly care about the PBA, but if it resides in the same
> host PAGE_SIZE area as the MSI-X vector table, you currently won't be
> able to get to it. Most devices are not at all dependent on the PBA
> for any sort of functionality.
Having some hint in the spec as to why these mapping rules might be
needed would have been useful.
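
The PAGE_SIZE concern quoted above can be checked with a little arithmetic. This is only a rough sketch: the table offset, the vector count and the helper names are made up for illustration; the 8k gap between table and PBA and the 4k/64k page sizes are the ones from this thread.

```python
def pages(offset, length, page_size):
    """Return the range of page indices touched by [offset, offset + length)."""
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return range(first, last + 1)


def shares_page(a_off, a_len, b_off, b_len, page_size):
    """True if the two regions touch at least one common page."""
    a = pages(a_off, a_len, page_size)
    b = pages(b_off, b_len, page_size)
    return a.start < b.stop and b.start < a.stop


# Hypothetical layout: table at the start of the BAR, 32 vectors.
MSIX_TABLE_OFF = 0x0000
MSIX_TABLE_LEN = 32 * 16            # each MSI-X table entry is 16 bytes
PBA_OFF = MSIX_TABLE_OFF + 0x2000   # PBA 8k from the table, as on our board
PBA_LEN = 8                         # one 64-bit word of pending bits

for page_size in (4096, 65536):
    overlap = shares_page(MSIX_TABLE_OFF, MSIX_TABLE_LEN,
                          PBA_OFF, PBA_LEN, page_size)
    print(f"page size {page_size}: PBA shares a page with the table: {overlap}")
```

With these assumed offsets the PBA (and anything placed after it) is mappable with 4k pages but lands in the same page as the table with 64k pages.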
> It's really more correct to say that both the vector table and PBA are
> emulated by QEMU than paravirtualized. Only PPC64 has the guest OS
> taking a paravirtual path to program the vector table, everyone else
> attempts to read/write to the device MMIO space, which gets trapped and
> emulated in QEMU. This is why the QEMU side patch has further ugly
> hacks to mess with the ordering of MemoryRegions since even if we can
> access and mmap the MSI-X vector table, we'll still trap into QEMU for
> emulation.
Thanks for that explanation.
> How exactly does the ability to map the PBA affect your ability to
> raise an interrupt?
The locations following the PBA hold other registers for the logic block
that converts internal interrupt requests into PCIe writes.
These include interrupt enable bits, and the ability to remove pending
interrupt requests (and other stuff for testing the interrupt block).
Yes, I know I probably shouldn't have done that, but it all worked.
At least it is better than the previous version of the hardware, which
required the driver to read back the MSI-X table entries in order to
set up an on-board PTE to convert a 32-bit on-board address to the
64-bit PCIe address needed for the MSI-X.
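
For reference, reading back a table entry amounts to decoding a 16-byte little-endian record: message address low, message address high, message data, vector control (bit 0 is the per-vector mask). A minimal sketch, assuming the raw bytes have already been read from the BAR; the function name and the sample values are invented for illustration and are not taken from our driver.

```python
import struct


def decode_msix_entry(raw):
    """Decode one 16-byte MSI-X table entry into address, data and mask state."""
    addr_lo, addr_hi, data, vector_ctrl = struct.unpack("<IIII", raw)
    return {
        "address": (addr_hi << 32) | addr_lo,   # full 64-bit message address
        "data": data,
        "masked": bool(vector_ctrl & 1),        # bit 0 of vector control
    }


# Made-up sample entry, just to exercise the decoder.
sample = struct.pack("<IIII", 0xFEE00000, 0x0, 0x4041, 0x0)
entry = decode_msix_entry(sample)
print(hex(entry["address"]), hex(entry["data"]), entry["masked"])
```

The same decode is what makes the table useful as a diagnostic: an entry that is still masked, or has a zero address, has not been set up.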
I'll 'fix' our board by making both the MSI-X table and PBA area
accessible through one of the other BARs. (Annoyingly this will
be confusing because the BAR offsets will have to differ.)
Note that this makes a complete mockery of disallowing the mapping
in the first place.
David