Date: Mon, 17 Jun 2024 14:05:10 -0700
From: Jacob Pan <jacob.jun.pan@...ux.intel.com>
To: "Tian, Kevin" <kevin.tian@...el.com>
Cc: Lu Baolu <baolu.lu@...ux.intel.com>, LKML
 <linux-kernel@...r.kernel.org>, "iommu@...ts.linux.dev"
 <iommu@...ts.linux.dev>, "Liu, Yi L" <yi.l.liu@...el.com>, "Kumar, Sanjay
 K" <sanjay.k.kumar@...el.com>, jacob.jun.pan@...ux.intel.com
Subject: Re: [PATCH] iommu/vt-d: Handle volatile descriptor status read


On Mon, 17 Jun 2024 03:04:36 +0000, "Tian, Kevin" <kevin.tian@...el.com>
wrote:

> > From: Jacob Pan <jacob.jun.pan@...ux.intel.com>
> > Sent: Saturday, June 8, 2024 1:38 AM
> > 
> > Queued invalidation wait descriptor status is volatile in that IOMMU
> > hardware writes the data upon completion.
> > 
> > Use READ_ONCE() to prevent compiler optimizations which ensures memory
> > reads every time. As a side effect, READ_ONCE() also enforces strict
> > types and may add an extra instruction. But it should not have negative
> > performance impact since we use cpu_relax anyway and the extra time (by
> > adding an instruction) may allow IOMMU HW request cacheline ownership
> > easier.
> 
> I didn't get the meaning of the last sentence.
The wait descriptor is polled by the CPU and written by the IOMMU
concurrently. The IOMMU needs to have the cacheline ownership before
writing the status data to signal completion of the wait descriptor.

If the CPU polling loop is very tight, it can make the IOMMU's request for
ownership contended/difficult. We already use pause (cpu_relax()) to ease
the contention. With READ_ONCE(), the status is first loaded by a separate
instruction

 mov    (%rax),%eax

and the compare that follows operates on the register only, with no memory
access. That extra instruction should make the cacheline even less
contended.
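
For illustration, the polling pattern can be sketched in userspace C roughly
as below. This is a minimal sketch and not the driver code: READ_ONCE_U32(),
relax(), wait_for_status(), STATUS_DONE and the spin bound are names made up
for the example, and the inline asm assumes GCC or Clang. The volatile cast
stands in for the kernel's READ_ONCE(), and relax() stands in for
cpu_relax().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's READ_ONCE(): the volatile access makes the
 * compiler emit a real load each time the expression is evaluated. */
#define READ_ONCE_U32(p) (*(const volatile uint32_t *)(p))

/* Hypothetical completion value, chosen only for this example. */
#define STATUS_DONE 2u

/* Stand-in for cpu_relax(): PAUSE on x86, a compiler barrier elsewhere. */
static inline void relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ __volatile__("pause" ::: "memory");
#else
	__asm__ __volatile__("" ::: "memory");
#endif
}

/* Poll *status until it reads STATUS_DONE or max_spins reads have been
 * made. Returns 0 on completion, -1 if the bound is exhausted. */
int wait_for_status(const uint32_t *status, unsigned long max_spins)
{
	assert(status != NULL);

	while (max_spins--) {
		/* Fresh load on every iteration, then a register compare. */
		if (READ_ONCE_U32(status) == STATUS_DONE)
			return 0;
		relax();
	}
	return -1;
}
```

In the real case the status word is written by the IOMMU rather than by
another CPU thread, and the driver has its own timeout and fault handling
around the loop, which this sketch leaves out.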

> > 
> > e.g. gcc 12.3
> > BEFORE:
> > 	81 38 ad de 00 00       cmpl   $0x2,(%rax)
> > 
> > AFTER (with READ_ONCE())
> >     772f:       8b 00                   mov    (%rax),%eax
> >     7731:       3d ad de 00 00          cmp    $0x2,%eax //status data is 32 bit
> > 
> > Signed-off-by: Jacob Pan <jacob.jun.pan@...ux.intel.com>  
> 
> Do we need a fix tag here?
I cannot find the exact commit; this is really old code.

> otherwise looks good to me:
> 
> Reviewed-by: Kevin Tian <kevin.tian@...el.com>


Thanks,

Jacob
