Date:   Tue, 16 Jun 2020 01:56:21 +0000
From:   "Tian, Kevin" <kevin.tian@...el.com>
To:     "Liu, Yi L" <yi.l.liu@...el.com>,
        Alex Williamson <alex.williamson@...hat.com>
CC:     "eric.auger@...hat.com" <eric.auger@...hat.com>,
        "baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
        "joro@...tes.org" <joro@...tes.org>,
        "jacob.jun.pan@...ux.intel.com" <jacob.jun.pan@...ux.intel.com>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        "Tian, Jun J" <jun.j.tian@...el.com>,
        "Sun, Yi Y" <yi.y.sun@...el.com>,
        "jean-philippe@...aro.org" <jean-philippe@...aro.org>,
        "peterx@...hat.com" <peterx@...hat.com>,
        "Wu, Hao" <hao.wu@...el.com>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v2 02/15] iommu: Report domain nesting info

> From: Liu, Yi L <yi.l.liu@...el.com>
> Sent: Monday, June 15, 2020 2:05 PM
> 
> Hi Kevin,
> 
> > From: Tian, Kevin <kevin.tian@...el.com>
> > Sent: Monday, June 15, 2020 9:23 AM
> >
> > > From: Liu, Yi L <yi.l.liu@...el.com>
> > > Sent: Friday, June 12, 2020 5:05 PM
> > >
> > > Hi Alex,
> > >
> > > > From: Alex Williamson <alex.williamson@...hat.com>
> > > > Sent: Friday, June 12, 2020 3:30 AM
> > > >
> > > > On Thu, 11 Jun 2020 05:15:21 -0700
> > > > Liu Yi L <yi.l.liu@...el.com> wrote:
> > > >
> > > > > > IOMMUs that support nesting translation need to report the
> > > > > > capability info to userspace, e.g. the format of first level/stage
> > > > > > paging structures.
> > > > >
> > > > > Cc: Kevin Tian <kevin.tian@...el.com>
> > > > > CC: Jacob Pan <jacob.jun.pan@...ux.intel.com>
> > > > > Cc: Alex Williamson <alex.williamson@...hat.com>
> > > > > Cc: Eric Auger <eric.auger@...hat.com>
> > > > > Cc: Jean-Philippe Brucker <jean-philippe@...aro.org>
> > > > > Cc: Joerg Roedel <joro@...tes.org>
> > > > > Cc: Lu Baolu <baolu.lu@...ux.intel.com>
> > > > > Signed-off-by: Liu Yi L <yi.l.liu@...el.com>
> > > > > Signed-off-by: Jacob Pan <jacob.jun.pan@...ux.intel.com>
> > > > > ---
> > > > > @Jean, Eric: as nesting was introduced for ARM, but looks like no
> > > > > actual user of it. right? So I'm wondering if we can reuse
> > > > > > DOMAIN_ATTR_NESTING to retrieve nesting info? how about your
> > > > > > opinions?
> > > > >
> > > > >  include/linux/iommu.h      |  1 +
> > > > > >  include/uapi/linux/iommu.h | 34 ++++++++++++++++++++++++++++++++++
> > > > >  2 files changed, 35 insertions(+)
> > > > >
> > > > > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > > > > > index 78a26ae..f6e4b49 100644
> > > > > --- a/include/linux/iommu.h
> > > > > +++ b/include/linux/iommu.h
> > > > > @@ -126,6 +126,7 @@ enum iommu_attr {
> > > > >  	DOMAIN_ATTR_FSL_PAMUV1,
> > > > >  	DOMAIN_ATTR_NESTING,	/* two stages of translation */
> > > > >  	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
> > > > > +	DOMAIN_ATTR_NESTING_INFO,
> > > > >  	DOMAIN_ATTR_MAX,
> > > > >  };
> > > > >
> > > > > > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> > > > > > index 303f148..02eac73 100644
> > > > > --- a/include/uapi/linux/iommu.h
> > > > > +++ b/include/uapi/linux/iommu.h
> > > > > @@ -332,4 +332,38 @@ struct iommu_gpasid_bind_data {
> > > > >  	};
> > > > >  };
> > > > >
> > > > > +struct iommu_nesting_info {
> > > > > +	__u32	size;
> > > > > +	__u32	format;
> > > > > +	__u32	features;
> > > > > +#define IOMMU_NESTING_FEAT_SYSWIDE_PASID	(1 << 0)
> > > > > +#define IOMMU_NESTING_FEAT_BIND_PGTBL		(1 << 1)
> > > > > > +#define IOMMU_NESTING_FEAT_CACHE_INVLD		(1 << 2)
> > > > > +	__u32	flags;
> > > > > +	__u8	data[];
> > > > > +};
> > > > > +
> > > > > +/*
> > > > > + * @flags:	VT-d specific flags. Currently reserved for future
> > > > > + *		extension.
> > > > > > + * @addr_width:	The output addr width of first level/stage translation
> > > > > > + * @pasid_bits:	Maximum supported PASID bits, 0 represents no PASID
> > > > > > + *		support.
> > > > > > + * @cap_reg:	Describe basic capabilities as defined in VT-d capability
> > > > > > + *		register.
> > > > > + * @cap_mask:	Mark valid capability bits in @cap_reg.
> > > > > + * @ecap_reg:	Describe the extended capabilities as defined in VT-d
> > > > > + *		extended capability register.
> > > > > + * @ecap_mask:	Mark the valid capability bits in @ecap_reg.
> > > >
> > > > Please explain this a little further, why do we need to tell
> > > > userspace about cap/ecap register bits that aren't valid through this
> > > > interface?
> > > > Thanks,
> > >
> > > we only want to tell userspace about the bits marked in the
> > > cap/ecap_mask.
> > > cap/ecap_mask is kind of white-list of the cap/ecap register.
> > > userspace should only care about the bits in the white-list, for other
> > > bits, it should ignore.
> > >
> > > Regards,
> > > Yi Liu
> >
> > For invalid bits if kernel just clears them then do we still need additional
> > mask bits to explicitly mark them out? I guess this might be the point that
> > Alex asked...
> 
> For invalid bits, kernel will clear them. But I think the mask bits are
> still necessary. The mask bits tell user space which bits are related to
> nesting. Without them, user space may have no idea about it.

userspace should know which bit is related to nesting and then should
check that bit explicitly...

> 
> Maybe talking about the QEMU usage of the cap/ecap bits would help. QEMU
> vIOMMU decides cap/ecap bits according to the QEMU cmdline. But not all of
> them are compatible with hardware support, especially for a vIOMMU built on
> nesting. So it needs to sync the cap/ecap bits with the host side. Based on
> the mask bits, QEMU can compare the cap/ecap bits configured by the QEMU
> cmdline with the cap/ecap bits reported by this interface. This comparison
> is limited to the nesting related bits in cap/ecap; the other bits are not
> included and can use the configuration from the QEMU cmdline.

I didn't get this explanation. Based on patch [15/15], nesting capabilities
are defined as:
+/* Nesting Support Capability Alignment */
+#define VTD_CAP_FL1GP		(1ULL << 56)
+#define VTD_CAP_FL5LP		(1ULL << 60)
+#define VTD_ECAP_PRS		(1ULL << 29)
+#define VTD_ECAP_ERS		(1ULL << 30)
+#define VTD_ECAP_SRS		(1ULL << 31)
+#define VTD_ECAP_EAFS		(1ULL << 34)
+#define VTD_ECAP_PASID		(1ULL << 40)

When QEMU gets a cmdline option it knows which bit out of the above
list should be checked against hardware capability. Then just do the
check bit-by-bit. Why do we need a mask bit in uapi to tell which bits
are valid? Unless 0/1 doesn't represent validity of some bit. Do we
have such an example?
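
A minimal sketch of the bit-by-bit check meant above, only to illustrate
the idea. The helper names and the VTD_NESTING_*_BITS groupings are
hypothetical and not part of any posted patch; host_cap/host_ecap are
assumed to be the values reported through cap_reg/ecap_reg, with
unsupported bits already cleared by the kernel:

```c
#include <stdbool.h>
#include <stdint.h>

/* Nesting-related bits, copied from the definitions in patch [15/15]. */
#define VTD_CAP_FL1GP		(1ULL << 56)
#define VTD_CAP_FL5LP		(1ULL << 60)
#define VTD_ECAP_PRS		(1ULL << 29)
#define VTD_ECAP_ERS		(1ULL << 30)
#define VTD_ECAP_SRS		(1ULL << 31)
#define VTD_ECAP_EAFS		(1ULL << 34)
#define VTD_ECAP_PASID		(1ULL << 40)

/* Hypothetical groupings: userspace already knows which bits matter. */
#define VTD_NESTING_CAP_BITS	(VTD_CAP_FL1GP | VTD_CAP_FL5LP)
#define VTD_NESTING_ECAP_BITS	(VTD_ECAP_PRS | VTD_ECAP_ERS | VTD_ECAP_SRS | \
				 VTD_ECAP_EAFS | VTD_ECAP_PASID)

/*
 * Hypothetical helper: true if every nesting-related bit requested for
 * the vIOMMU is also set in the host register value. Bits outside
 * nesting_bits are ignored.
 */
static bool nesting_bits_supported(uint64_t requested, uint64_t host,
				   uint64_t nesting_bits)
{
	uint64_t wanted = requested & nesting_bits;

	return (wanted & ~host) == 0;
}

/* Hypothetical helper: check both cap and ecap without any mask from uapi. */
static bool vtd_nesting_compatible(uint64_t vcap, uint64_t vecap,
				   uint64_t host_cap, uint64_t host_ecap)
{
	return nesting_bits_supported(vcap, host_cap, VTD_NESTING_CAP_BITS) &&
	       nesting_bits_supported(vecap, host_ecap, VTD_NESTING_ECAP_BITS);
}
```

Under these assumptions the list of nesting-related bits lives in
userspace, so the check needs only the two register values.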

> 
> The link below shows the current Intel vIOMMU usage of the cap/ecap bits.
> For each assigned device, vIOMMU will compare the nesting related bits in
> cap/ecap and mask out the bits which hardware doesn't support. After the
> machine is initialized, the vIOMMU cap/ecap bits are determined. If the user
> hot-plugs devices to the VM, vIOMMU will fail it if the hardware cap/ecap
> bits behind the hot-plugged device are not compatible with the determined
> vIOMMU cap/ecap bits.
> 
> https://www.spinics.net/lists/kvm/msg218294.html
> 
> Regards,
> Yi Liu
> 
> > >
> > > > Alex
> > > >
> > > >
> > > > > + */
> > > > > +struct iommu_nesting_info_vtd {
> > > > > +	__u32	flags;
> > > > > +	__u16	addr_width;
> > > > > +	__u16	pasid_bits;
> > > > > +	__u64	cap_reg;
> > > > > +	__u64	cap_mask;
> > > > > +	__u64	ecap_reg;
> > > > > +	__u64	ecap_mask;
> > > > > +};
> > > > > +
> > > > >  #endif /* _UAPI_IOMMU_H */
