[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7536443a-9871-49f3-a42b-28320dc57fc4@amd.com>
Date: Fri, 6 Sep 2024 21:08:06 +0700
From: "Suthikulpanit, Suravee" <suravee.suthikulpanit@....com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: linux-kernel@...r.kernel.org, iommu@...ts.linux.dev, joro@...tes.org,
robin.murphy@....com, vasant.hegde@....com, ubizjak@...il.com,
jon.grimm@....com, santosh.shukla@....com, pandoh@...gle.com,
kumaranand@...gle.com
Subject: Re: [PATCH v2 3/5] iommu/amd: Introduce helper functions to access
and update 256-bit DTE
On 9/6/2024 1:21 AM, Jason Gunthorpe wrote:
>>> I don't think you should restore, this should reflect a locking error
>>> but we still need to move forward and put some kind of correct
>>> data.. The code can't go backwards so it should try to move forwards..
>> In case of error, what if we pr_warn and put the device in blocking mode
>> since we need to prevent malicious DMAs.
> IMHO a WARN_ON is fine, and alerts to the possible machine corruption
>
> No need to do blocking, you should have a perfectly valid target DTE
> that represents the state the HW is expected to be in. Resolve the
> race by making it bin that state and move forwards.
What do you mean by "making it bin that state".
>>> On ordering, I don't know, is this OK?
>>>
>>> If you are leaving/entering nesting mode I think you have to write the
>>> [2] value in the right sequence, you don't want to have the viommu
>>> enabled unless the host page table is setup properly. So [2] is
>>> written last when enabling, and first when disabling. Flushes required
>>> after each write to ensure the HW doesn't see a cross-128 word bit
>>> tear.
>>>> GuestPagingMode also has to be sequenced correctly, the GCR3 table
>>> pointer should be invalid when it is changed, meaning you have to
>>> write it and flush before storing the GCR3 table, and the reverse to
>>> undo it.
>>>
>>> The ordering, including when DTE flushes are needed, is pretty
>>> hard. This is much simpler than, say, ARM, so I think you could open
>>> code it, but it should be a pretty sizable bit of logic to figure out
>>> what to do.
>> IOMMU hardware do not do partial interpret of the DTE and SW ensure DTE
>> flush after updating the DTE. Therefore, ordering should not be of a concern
>> here as long as the driver correctly program the entry.
> Even if the IOMMU HW does a perfect 256 bit atomic read you still have
> to order the CPU writes correctly. It just means you don't need to
> flush.
>
> The guidelines in "2.2.2.2 Making Device Table Entry Changes" make
> this clear. The indivudal CPU writes smaller than 256 bits have to be
> sequenced right.
For the interrupt remapping part, no special step is needed if we can
write do 64-bit write. Similary, for the address translation part, no
special step is needed if we can do 128-bit write.
> This section looks like it was written before translation bits were
> placed in the other 128 bit word - it assumes a single 128 bit write
> is always sufficient which isn't true anymore.
>
> So you still have the issue of having to decide if you write 128 bit
> [0] or [1] first.
The GuestPagingMode bit is in effect when GV=1. So, the higher 128-bit
(which contains GuestPagingMode bit) should be written first, and follow
by lower 128-bit (which contans GV bit).
Thanks,
Suravee
Powered by blists - more mailing lists