Message-ID: <ZNsYxta9Pi7USDoR@Asurada-Nvidia>
Date: Mon, 14 Aug 2023 23:18:46 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>, "Liu, Yi L" <yi.l.liu@...el.com>,
"Tian, Kevin" <kevin.tian@...el.com>
CC: "joro@...tes.org" <joro@...tes.org>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"robin.murphy@....com" <robin.murphy@....com>,
"baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
"cohuck@...hat.com" <cohuck@...hat.com>,
"eric.auger@...hat.com" <eric.auger@...hat.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"mjrosato@...ux.ibm.com" <mjrosato@...ux.ibm.com>,
"chao.p.peng@...ux.intel.com" <chao.p.peng@...ux.intel.com>,
"yi.y.sun@...ux.intel.com" <yi.y.sun@...ux.intel.com>,
"peterx@...hat.com" <peterx@...hat.com>,
"jasowang@...hat.com" <jasowang@...hat.com>,
"shameerali.kolothum.thodi@...wei.com"
<shameerali.kolothum.thodi@...wei.com>,
"lulu@...hat.com" <lulu@...hat.com>,
"suravee.suthikulpanit@....com" <suravee.suthikulpanit@....com>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
"Duan, Zhenzhong" <zhenzhong.duan@...el.com>
Subject: Re: [PATCH v4 09/12] iommu/vt-d: Add iotlb flush for nested domain
On Fri, Aug 11, 2023 at 09:45:21AM -0700, Nicolin Chen wrote:
> > But if stepping back a bit supporting an array-based non-native format
> > could simplify the uAPI design and allows code sharing for array among
> > vendor drivers. You can still keep the entry as native format then the
> > only difference with future in-kernel fast path is just on walking an array
> > vs. walking a ring. And VMM doesn't need to expose non-invalidate
> > cmds to the kernel and then be skipped.
>
> Ah, so we might still design the uAPI to be ring based at this
> moment, yet don't support a case CONS > 0 to leave that to an
> upgrade in the future.
>
> I will try estimating a bit how complicated to implement the
> ring, to see if we could just start with that. Otherwise, will
> just start with an array.
I drafted a uAPI structure for a ring-based SW queue. While I am
working on an implementation, I'd like to collect some comments on
the structure, to see if it makes sense overall.
One thing that I couldn't add to this common structure for SMMU
is the hardware error code, which, per the SMMU spec, is encoded
in the higher bits of the consumer index register:
ERR, bits [30:24] Error reason code.
- When a command execution error is detected, ERR is set to a
reason code and then the SMMU_GERROR.CMDQ_ERR global error
becomes active.
- The value in this field is UNKNOWN when the CMDQ_ERR global
error is not active. This field resets to an UNKNOWN value.
But it feels odd to do the same in the generic structure. So,
perhaps an optional @out_error could be added to this structure.
Or some other idea?
Thanks
Nic
/**
* struct iommu_hwpt_invalidate - ioctl(IOMMU_HWPT_INVALIDATE)
* @size: sizeof(struct iommu_hwpt_invalidate)
* @hwpt_id: HWPT ID of target hardware page table for the invalidation
* @q_uptr: User pointer to an invalidation queue, which can be used as a flat
*          array or a circular ring queue. The entries in the queue must have
*          a fixed width of @q_entry_len bytes, each containing a user data
*          structure for an invalidation request, specific to the given
*          hardware page table.
* @q_cons_uptr: User pointer to the consumer index (with its wrap flag) of an
*               invalidation queue. This pointer must point to a __u32 memory
*               location. The consumer index tells the kernel to read from the
*               entry it points to (and then the following entries) until the
*               kernel reaches the entry pointed to by the producer index
*               @q_prod, and allows the kernel to update the consumer index to
*               where it stops: on success, it is updated to @q_prod;
*               otherwise, to the index of the failed entry.
* @q_prod: Producer index (with its wrap flag) of an invalidation queue. This
*          index points to the entry next to the last requested entry in the
*          invalidation queue. When the queue is used as a flat array, it
*          equals the number of entries @q_entry_num.
* @q_index_bits: Effective bits of both indexes, defining the maximum value an
*                index can take. Must not be greater than 31 bits. A wrap flag
*                is defined at the next higher bit adjacent to the index bits:
*                e.g. if @q_index_bits is 20, @q_prod[19:0] are the index bits
*                and @q_prod[20] is the wrap flag. The wrap flag, acting like
*                a sign flag, must be toggled each time an index overflows and
*                wraps to the lower end of the circular queue.
* @q_entry_num: Total number of entries in an invalidation queue
* @q_entry_len: Length (in bytes) of each entry in an invalidation queue
*
* Invalidate the IOMMU cache for a user-managed page table. Modifications to a
* user-managed page table should be followed by this operation to sync the
* cache.
*
* One request supports multiple invalidations via a multi-entry queue:
* |<----------- Length of Queue = @q_entry_num * @q_entry_len ------------>|
* --------------------------------------------------------------------------
* | 0 | 1 | 2 | 3 | ... | @q_entry_num-3 | @q_entry_num-2 | @q_entry_num-1 |
* --------------------------------------------------------------------------
* ^               ^                      ^                |<-@q_entry_len->|
* |               |                      |
* @q_uptr         @q_cons_uptr           @q_prod
*
* A queue index can wrap its index bits off the high end of the queue and back
* onto the low end by toggling its wrap flag: e.g. when @q_entry_num=0x10 and
* @q_index_bits=4, inputs of *@q_cons_uptr=0xf and @q_prod=0x11 mean that the
* producer index has wrapped to 0x1 with its wrap flag set, so the kernel
* reads/handles the entries starting from the consumer index (0xf), wrapping
* it back to index 0x0 and then 0x1 with the wrap flag toggled, i.e.
* *@q_cons_uptr has a final value of 0x11.
*/
struct iommu_hwpt_invalidate {
__u32 size;
__u32 hwpt_id;
__aligned_u64 q_uptr;
__aligned_u64 q_cons_uptr;
__u32 q_prod;
__u32 q_index_bits;
__u32 q_entry_num;
__u32 q_entry_len;
};
#define IOMMU_HWPT_INVALIDATE _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_INVALIDATE)
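As a rough illustration of how the indexes are meant to be consumed,
the kernel-side walk could look like the sketch below. This is not an
actual implementation: handle_one_entry() is a placeholder for the
driver's per-entry invalidation op, the caller is assumed to copy the
consumer index in/out via @q_cons_uptr, and @q_entry_num is assumed
to be 1 << @q_index_bits:

static int iommu_hwpt_invalidate_walk(struct iommu_hwpt_invalidate *cmd,
				      __u32 *cons)
{
	__u32 idx_mask = (1U << cmd->q_index_bits) - 1;
	__u32 wrap_flag = 1U << cmd->q_index_bits;
	int rc = 0;

	while (*cons != cmd->q_prod) {
		__u64 offset = (__u64)(*cons & idx_mask) * cmd->q_entry_len;
		void __user *entry = u64_to_user_ptr(cmd->q_uptr + offset);

		rc = handle_one_entry(cmd->hwpt_id, entry);
		if (rc)
			break;	/* *cons is left pointing at the failed entry */

		/*
		 * Advance: the carry out of the index bits toggles the wrap
		 * flag (assumes @q_entry_num == 1 << @q_index_bits).
		 */
		*cons = (*cons + 1) & (wrap_flag | idx_mask);
	}
	return rc;
}

For the flat-array case, userspace would set *@q_cons_uptr to 0 and
@q_prod to @q_entry_num, so the loop covers the whole array exactly
once.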