Message-ID: <20240618133127.GF791043@ziepe.ca>
Date: Tue, 18 Jun 2024 10:31:27 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Zong Li <zong.li@...ive.com>
Cc: Baolu Lu <baolu.lu@...ux.intel.com>, joro@...tes.org, will@...nel.org,
robin.murphy@....com, tjeznach@...osinc.com,
paul.walmsley@...ive.com, palmer@...belt.com, aou@...s.berkeley.edu,
kevin.tian@...el.com, linux-kernel@...r.kernel.org,
iommu@...ts.linux.dev, linux-riscv@...ts.infradead.org
Subject: Re: [RFC PATCH v2 04/10] iommu/riscv: add iotlb_sync_map operation support
On Tue, Jun 18, 2024 at 11:01:48AM +0800, Zong Li wrote:
> On Mon, Jun 17, 2024 at 10:39 PM Jason Gunthorpe <jgg@...pe.ca> wrote:
> >
> > On Mon, Jun 17, 2024 at 09:43:35PM +0800, Zong Li wrote:
> >
> > > I added it for updating the MSI mapping when we change the irq
> > > affinity of a pass-through device to another vCPU. The RISC-V IOMMU
> > > spec allows MSI translation to go through the MSI flat table, MRIF, or
> > > the normal page table. In the case of the normal page table, the MSI
> > > mapping is created in the second-stage page table, mapping the GPA of
> > > the guest's supervisor interrupt file to the HPA of host's guest
> > > interrupt file. This MSI mapping needs to be updated when the HPA of
> > > host's guest interrupt file is changed.
> >
> > It sounds like more thought is needed for the MSI architecture, having
> > the host read the guest page table to mirror weird MSI stuff seems
> > kind of wrong..
>
> Perhaps I should rephrase it. Host doesn't read the guest page table.
> In a RISC-V system, MSIs are directed to a specific privilege level of
> a specific hart, including a specific virtual hart. In a hart's IMSIC
> (Incoming MSI Controller), it contains some 'interrupt files' for
> these specific privilege level harts. For instance, if the target
> address of MSI is the address of the interrupt file which is for a
> specific supervisor level hart, then that hart's supervisor mode will
> receive this MSI. Furthermore, when a hart implements the hypervisor
> extension, its IMSIC will have interrupt files for virtual harts,
> called 'guest interrupt files'.
> We will create the MSI mapping in S2 page table at boot time firstly,
> the mapping would be GPA of the interrupt file for supervisor level
> (in guest view, it thinks it use a supervisor level interrupt file) to
> HPA of the 'guest interrupt file' (in host view, the device should
> actually use a guest interrupt file). When the vCPU is migrated to
> another physical hart, the 'guest interrupt files' should be switched
> to another physical hart's IMSIC's 'guest interrupt file', it means
> that the HPA of this MSI mapping in S2 page table needs to be updated.

I am vaguely aware of these details, but it is good to hear them again.

However, none of that really explains why this is messing with
invalidation logic..

If you need to replace MSI pages in the S2 atomically as you migrate
vCPUs then you need a proper replace operation for the io page table.

map is supposed to fail if there are already mappings at that address;
you can't use it to replace existing mappings with something else.
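
To make the map-vs-replace distinction concrete, here is a rough
user-space sketch. It is a toy model only, not the kernel's io-pgtable
code or any existing API: every toy_* name, the single-level table
layout, and the choice of -EEXIST/-ENOENT as error codes are assumptions
made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical single-level "S2" table: index = GPA page, value = PTE. */
#define TOY_NR_PTES   16u
#define TOY_PTE_VALID 1ULL
#define TOY_ADDR_MASK (~0xfffULL)

struct toy_pgtable {
	_Atomic uint64_t pte[TOY_NR_PTES];
};

static void toy_init(struct toy_pgtable *pt)
{
	for (unsigned int i = 0; i < TOY_NR_PTES; i++)
		atomic_init(&pt->pte[i], 0);
}

static uint64_t toy_mk_pte(uint64_t hpa)
{
	return (hpa & TOY_ADDR_MASK) | TOY_PTE_VALID;
}

/* map: installs a PTE only if the slot is empty, otherwise fails. */
static int toy_map(struct toy_pgtable *pt, unsigned int idx, uint64_t hpa)
{
	uint64_t expected = 0;

	if (idx >= TOY_NR_PTES)
		return -EINVAL;
	if (!atomic_compare_exchange_strong(&pt->pte[idx], &expected,
					    toy_mk_pte(hpa)))
		return -EEXIST;	/* something is already mapped here */
	return 0;
}

/*
 * replace: swaps one valid PTE for another in a single atomic update, so
 * a walker sees either the old or the new HPA, never an empty slot.
 * A real implementation would still have to invalidate cached
 * translations for this address afterwards; that step is not modelled.
 */
static int toy_replace(struct toy_pgtable *pt, unsigned int idx,
		       uint64_t new_hpa, uint64_t *old_hpa)
{
	uint64_t old;

	if (idx >= TOY_NR_PTES)
		return -EINVAL;
	old = atomic_load(&pt->pte[idx]);
	do {
		if (!(old & TOY_PTE_VALID))
			return -ENOENT;	/* nothing to replace */
	} while (!atomic_compare_exchange_weak(&pt->pte[idx], &old,
					       toy_mk_pte(new_hpa)));
	if (old_hpa)
		*old_hpa = old & TOY_ADDR_MASK;
	return 0;
}

/* Walks through the vCPU-migration case discussed above. */
static int toy_demo(void)
{
	struct toy_pgtable pt;
	uint64_t old = 0;

	toy_init(&pt);
	/* Initial MSI mapping: GPA page 3 -> first guest interrupt file. */
	assert(toy_map(&pt, 3, 0x80001000ULL) == 0);
	/* Calling map again on the same address must not overwrite it. */
	assert(toy_map(&pt, 3, 0x80002000ULL) == -EEXIST);
	/* Moving to another guest interrupt file needs replace instead. */
	assert(toy_replace(&pt, 3, 0x80002000ULL, &old) == 0);
	assert(old == 0x80001000ULL);
	/* replace on an unmapped slot is an error in this model. */
	assert(toy_replace(&pt, 4, 0x80003000ULL, &old) == -ENOENT);
	return 0;
}
```

The point of the sketch is only that the two operations have different
contracts: map refuses to touch a populated entry, while replace is the
operation that is allowed to change a live translation.
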
Jason