lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BL1PR11MB5271D3D84F93BA16852233F98CAB2@BL1PR11MB5271.namprd11.prod.outlook.com>
Date: Thu, 25 Jul 2024 00:44:05 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Jacob Pan <jacob.jun.pan@...ux.intel.com>
CC: "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>, LKML
	<linux-kernel@...r.kernel.org>, Lu Baolu <baolu.lu@...ux.intel.com>, "Liu, Yi
 L" <yi.l.liu@...el.com>, "Zhang, Tina" <tina.zhang@...el.com>, "Kumar, Sanjay
 K" <sanjay.k.kumar@...el.com>
Subject: RE: [PATCH] iommu/vt-d: Fix potential soft lockup due to reclaim

> From: Jacob Pan <jacob.jun.pan@...ux.intel.com>
> Sent: Thursday, July 25, 2024 12:26 AM
> 
> On Wed, 24 Jul 2024 07:40:25 +0000, "Tian, Kevin" <kevin.tian@...el.com>
> wrote:
> 
> > > From: Jacob Pan <jacob.jun.pan@...ux.intel.com>
> > > Sent: Saturday, July 20, 2024 2:17 AM
> > >
> > > Currently, there is no impact by this bug on the existing users because
> > > no callers are submitting invalidations with 0 descriptors.
> >
> > bug fix is for existing users. Please revise the subject line and this msg
> > to make it clear that it's for preparation of a new usage.
> The bug is in the qi_submit_sync function itself since it permits callers
> to give 0 as count. It is a bug regardless of users.
> 
> I put "potential" in the subject line to indicate, perhaps it is too vague.
> How about just stating what it is fixing:
> "Fix potential lockup if qi_submit_sync called with 0 count"
> 
> Also change this paragraph to:
> "Currently, there is no impact by this bug on the existing users because
>  no callers are submitting invalidations with 0 descriptors. This fix will
>  enable future users (such as DMA drain) calling qi_submit_sync() with 0
>  count."

Then please move it to the start.

> > > @@ -1463,8 +1462,14 @@ int qi_submit_sync(struct intel_iommu
> *iommu,
> > > struct qi_desc *desc,
> > >  		raw_spin_lock(&qi->q_lock);
> > >  	}
> > >
> > > -	for (i = 0; i < count; i++)
> > > -		qi->desc_status[(index + i) % QI_LENGTH] = QI_DONE;
> > > +	/*
> > > +	 * The reclaim code can free descriptors from multiple
> > > submissions
> > > +	 * starting from the tail of the queue. When count == 0, the
> > > +	 * status of the standalone wait descriptor at the tail of the
> > > queue
> > > +	 * must be set to QI_TO_BE_FREED to allow the reclaim code to
> > > proceed.
> > > +	 */
> > > +	for (i = 0; i <= count; i++)
> > > +		qi->desc_status[(index + i) % QI_LENGTH] =
> > > QI_TO_BE_FREED;
> >
> > We don't really need a new flag. Just set them to QI_FREE and then
> > reclaim QI_FREE slots until hitting qi->head in reclaim_free_desc().
> We do need to have a separate state for descriptors pending to be freed.
> Otherwise, reclaim code will advance pass the intended range.
> 

The commit msg said that QI_DONE is currently used for conflicting purpose.

Using QI_FREE keeps it only for signaling that a wait desc is completed.

The key is that reclaim() should not change a desc's state before it's
consumed by the owner. Instead we always let the owner to change the
state and reclaim() only does scan and adjust the tracking fields then
such race condition disappears.

In this example T2's slots are changed to QI_FREE by T2 after it completes
all the checks. Only at this point those slots can be reclaimed.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ