lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <58169757-69f5-45fc-8657-82763c71803b@amd.com>
Date: Fri, 30 Aug 2024 01:15:43 +0700
From: "Suthikulpanit, Suravee" <suravee.suthikulpanit@....com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: linux-kernel@...r.kernel.org, iommu@...ts.linux.dev, joro@...tes.org,
 robin.murphy@....com, vasant.hegde@....com, kevin.tian@...el.com,
 jon.grimm@....com, santosh.shukla@....com, pandoh@...gle.com,
 kumaranand@...gle.com
Subject: Re: [PATCH] iommu/amd: Modify set_dte_entry() to use 128-bit cmpxchg
 operation

Hi Jason,

On 8/20/2024 12:09 AM, Jason Gunthorpe wrote:
> On Mon, Aug 19, 2024 at 04:18:39PM +0000, Suravee Suthikulpanit wrote:
> 
>> +struct dte256 {
> 
> [..]
> 
> This would be alot better as just
> 
> struct dte {
>   union {
>       u64 data[4];
>       u128 data128[2];
>   };
> };
> 
> And don't make a new type. IIRC the cmpxchg likes to have alignment
> and the u128 provides that..

OK.

>> +static void update_dte256(struct amd_iommu *iommu, u16 devid, struct dte256 *new)
>> +{
>> +	struct dev_table_entry *dev_table = get_dev_table(iommu);
>> +	struct dte256 *ptr = (struct dte256 *)&dev_table[devid];
>> +	struct dte256 old = {
>> +		.qw_lo.data = ptr->qw_lo.data,
>> +		.qw_hi.data = ptr->qw_hi.data,
>> +	};
>> +
>> +	/* Update qw_lo */
>> +	if (!try_cmpxchg128(&ptr->qw_lo.data, &old.qw_lo.data, new->qw_lo.data))
>> +		goto err_out;
>> +
>> +	/* Update qw_hi */
>> +	if (!try_cmpxchg128(&ptr->qw_hi.data, &old.qw_hi.data, new->qw_hi.data)) {
>> +		/* Restore qw_lo */
>> +		try_cmpxchg128(&ptr->qw_lo.data, &new->qw_lo.data, old.qw_lo.data);
>> +		goto err_out;
>> +	}
> 
> I don't think this is going to work like this, the interrupt remapping
> code can asynchronously change the values (it is an existing race bug)
> and if it races that will cause these to fail too, and this doesn't
> handle that case at all.
> 
> IMHO the simple solution would hold the spinlock and transfer the
> interrupt remapping fields then directly write them without a failable
> cmpxchg. Maybe you could also use a cmpxchg loop to transfer the interrupt
> remapping bits without a lock..

OK, I'll as synchronization when updating DTE.

> Can you assume cmpxchg 128 is available on all CPUs that have the
> iommu? I see google says some early AMD 64 bit CPUs don't have it? Do
> they have iommu's? I had wondered if the doc intended that some 128
> bit SSE/MMX store store would be used here??
> 
> If cmpxchg128 is safe checking that cap and refusing to probe the
> iommu driver would be appropriate. Otherwise a 64 bit flow still has
> to work?

According to the AMD BIOS and Kernel Developer's Guide (BDKG) dated back 
since family 10h Processor 
(https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/programmer-references/31116.pdf), 
which is the first introduction of AMD IOMMU, AMD processor always 
support for CPUID Fn0000_0001_ECX[CMPXCHG16B]=1.

So, it should be safe to assume that cmpxchg128 is available with all 
AMD processor w/ IOMMU.

> Finally, the order of programming the two 128s depends on what you are
> programming. The first 128 contains the valid bit so you shouldn't be
> writing the second 128 after, and vice versa. This will make
> undesirable races around Guest Paging Mode and eventually viommuen at
> least.

Ok, I'll program the higher 128-bit first.

> If you have to solve the 64 bit case too, then you really should use
> the programmer from SMMUv3. Everyone seems to need this logic and it
> should be generalized. I can help you do it.. I have a patch to make
> that logic use 128 bit stores too someplace.
> 
> Otherwise you can possibly open code the 128 bit path with some more
> care to check the valid bits and set the order, plus the interrupt
> remapping locking.

I think the cmpxchg128 should still work and would be a simpler 
solution. I am sending out v2 shortly.

Thanks,
Suravee

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ