Message-ID: <20250702150624.00007ceb@huawei.com>
Date: Wed, 2 Jul 2025 15:06:24 +0100
From: Jonathan Cameron <Jonathan.Cameron@...wei.com>
To: Lorenzo Pieralisi <lpieralisi@...nel.org>
CC: Marc Zyngier <maz@...nel.org>, Thomas Gleixner <tglx@...utronix.de>, "Rob
Herring" <robh@...nel.org>, Krzysztof Kozlowski <krzk+dt@...nel.org>, "Conor
Dooley" <conor+dt@...nel.org>, Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>, Arnd Bergmann <arnd@...db.de>, "Sascha
Bischoff" <sascha.bischoff@....com>, Timothy Hayes <timothy.hayes@....com>,
Bjorn Helgaas <bhelgaas@...gle.com>, "Liam R. Howlett"
<Liam.Howlett@...cle.com>, Peter Maydell <peter.maydell@...aro.org>, "Mark
Rutland" <mark.rutland@....com>, Jiri Slaby <jirislaby@...nel.org>,
<linux-arm-kernel@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
<devicetree@...r.kernel.org>, <linux-pci@...r.kernel.org>
Subject: Re: [PATCH v6 28/31] irqchip/gic-v5: Add GICv5 ITS support
On Thu, 26 Jun 2025 12:26:19 +0200
Lorenzo Pieralisi <lpieralisi@...nel.org> wrote:
> The GICv5 architecture implements Interrupt Translation Service
> (ITS) components in order to translate events coming from peripherals
> into interrupt events delivered to the connected IRSes.
>
> Events (ie MSI memory writes to the ITS translate frame) are translated
> by the ITS using tables kept in memory.
>
> ITS translation tables for peripherals are kept in memory storage
> (device table [DT] and Interrupt Translation Table [ITT]) that
> is allocated by the driver on boot.
>
> Both tables can be 1- or 2-level; the structure is chosen by the
> driver after probing the ITS HW parameters and checking the
> allowed table splits and supported {device/event}_IDbits.
>
> DT table entries are allocated on demand (ie when a device is
> probed); the DT table is sized using the number of supported
> deviceID bits, since that is a system design decision (ie the
> number of deviceID bits implemented should reflect the number
> of devices expected in a system); it therefore makes sense to
> allocate a DT table that can cater for the maximum number of
> devices.
>
> DT and ITT tables are allocated using the kmalloc interface;
> the allocation size may be smaller than a page or larger,
> and must provide contiguous memory pages.
>
> LPI INTIDs backing the device events are allocated one by one
> and only upon Linux IRQ allocation; this avoids preallocating
> a large number of LPIs to cover the HW device MSI vector
> size when only a few MSI entries are actually enabled by a device.
>
> ITS cacheability/shareability attributes are programmed
> according to the provided firmware ITS description.
>
> The GICv5 driver partially reuses the GICv3 ITS MSI parent
> infrastructure and adds the functions required to retrieve the ITS
> translate frame addresses out of the msi-map and msi-parent properties
> to implement the GICv5 ITS MSI parent callbacks.
>
> Co-developed-by: Sascha Bischoff <sascha.bischoff@....com>
> Signed-off-by: Sascha Bischoff <sascha.bischoff@....com>
> Co-developed-by: Timothy Hayes <timothy.hayes@....com>
> Signed-off-by: Timothy Hayes <timothy.hayes@....com>
> Signed-off-by: Lorenzo Pieralisi <lpieralisi@...nel.org>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Marc Zyngier <maz@...nel.org>
Hi Lorenzo,
It almost certainly doesn't matter, but there are a couple of release
paths in here where things don't happen in the same order as the
equivalent error teardown paths (i.e. not in the reverse order of setup).
There may well be a good reason for that, but I couldn't immediately
spot what it was. There is also a follow-up, similar to an earlier
comment, about the table sizing code not matching the comment above it;
the same thing is going on here.
Jonathan
> diff --git a/drivers/irqchip/irq-gic-v5-its.c b/drivers/irqchip/irq-gic-v5-its.c
> new file mode 100644
> index 000000000000..cba632eb0273
> --- /dev/null
> +++ b/drivers/irqchip/irq-gic-v5-its.c
> +/*
> + * Function to check whether the device table or ITT table support
> + * a two-level table and if so depending on the number of id_bits
> + * requested, determine whether a two-level table is required.
> + *
> + * Return the 2-level size value if a two level table is deemed
> + * necessary.
> + */
> +static bool gicv5_its_l2sz_two_level(bool devtab, u32 its_idr1, u8 id_bits, u8 *sz)
> +{
> + unsigned int l2_bits, l2_sz;
> +
> + if (devtab && !FIELD_GET(GICV5_ITS_IDR1_DT_LEVELS, its_idr1))
> + return false;
> +
> + if (!devtab && !FIELD_GET(GICV5_ITS_IDR1_ITT_LEVELS, its_idr1))
> + return false;
> +
> + /*
> + * Pick an L2 size that matches the pagesize; if a match
> + * is not found, go for the smallest supported l2 size granule.
Similar to before, this description is confusing. If the page size is 64K
and 16K + 4K are supported, we choose 16K, which is not the smallest
supported size (4K is). The condition the comment refers to only applies
when only larger-than-page-size granules are supported.
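Maybe something along these lines (untested wording, adjust as you see fit):
	/*
	 * Pick the largest supported L2 size that is no larger than
	 * PAGE_SIZE; if only sizes larger than PAGE_SIZE are supported,
	 * fall back to the smallest of those.
	 */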
> + *
> + * This ensures that we will always be able to allocate
> + * contiguous memory at L2.
> + */
> + switch (PAGE_SIZE) {
> + case SZ_64K:
> + if (GICV5_ITS_IDR1_L2SZ_SUPPORT_64KB(its_idr1)) {
> + l2_sz = GICV5_ITS_DT_ITT_CFGR_L2SZ_64k;
> + break;
> + }
> + fallthrough;
> + case SZ_16K:
> + if (GICV5_ITS_IDR1_L2SZ_SUPPORT_16KB(its_idr1)) {
> + l2_sz = GICV5_ITS_DT_ITT_CFGR_L2SZ_16k;
> + break;
> + }
> + fallthrough;
> + case SZ_4K:
> + if (GICV5_ITS_IDR1_L2SZ_SUPPORT_4KB(its_idr1)) {
> + l2_sz = GICV5_ITS_DT_ITT_CFGR_L2SZ_4k;
> + break;
> + }
> + if (GICV5_ITS_IDR1_L2SZ_SUPPORT_16KB(its_idr1)) {
> + l2_sz = GICV5_ITS_DT_ITT_CFGR_L2SZ_16k;
> + break;
> + }
> + if (GICV5_ITS_IDR1_L2SZ_SUPPORT_64KB(its_idr1)) {
> + l2_sz = GICV5_ITS_DT_ITT_CFGR_L2SZ_64k;
> + break;
> + }
> +
> + l2_sz = GICV5_ITS_DT_ITT_CFGR_L2SZ_4k;
> + break;
> + }
> +
> + l2_bits = gicv5_its_l2sz_to_l2_bits(l2_sz);
> +
> + if (l2_bits > id_bits)
> + return false;
> +
> + *sz = l2_sz;
> +
> + return true;
> +}
> +/*
> + * Register a new device in the device table. Allocate an ITT and
> + * program the L2DTE entry according to the ITT structure that
> + * was chosen.
> + */
> +static int gicv5_its_device_register(struct gicv5_its_chip_data *its,
> + struct gicv5_its_dev *its_dev)
> +{
> + u8 event_id_bits, device_id_bits, itt_struct, itt_l2sz;
> + phys_addr_t itt_phys_base;
> + bool two_level_itt;
> + u32 idr1, idr2;
> + __le64 *dte;
> + u64 val;
> + int ret;
> +
> + device_id_bits = devtab_cfgr_field(its, DEVICEID_BITS);
> +
> + if (its_dev->device_id >= BIT(device_id_bits)) {
> + pr_err("Supplied DeviceID (%u) outside of Device Table range (%u)!",
> + its_dev->device_id, (u32)GENMASK(device_id_bits - 1, 0));
> + return -EINVAL;
> + }
> +
> + dte = gicv5_its_devtab_get_dte_ref(its, its_dev->device_id, true);
> + if (!dte)
> + return -ENOMEM;
> +
> + if (FIELD_GET(GICV5_DTL2E_VALID, le64_to_cpu(*dte)))
> + return -EBUSY;
> +
> + /*
> + * Determine how many bits we need, validate those against the max.
> + * Based on these, determine if we should go for a 1- or 2-level ITT.
> + */
> + event_id_bits = order_base_2(its_dev->num_events);
> +
> + idr2 = its_readl_relaxed(its, GICV5_ITS_IDR2);
> +
> + if (event_id_bits > FIELD_GET(GICV5_ITS_IDR2_EVENTID_BITS, idr2)) {
> + pr_err("Required EventID bits (%u) larger than supported bits (%u)!",
> + event_id_bits,
> + (u8)FIELD_GET(GICV5_ITS_IDR2_EVENTID_BITS, idr2));
> + return -EINVAL;
> + }
> +
> + idr1 = its_readl_relaxed(its, GICV5_ITS_IDR1);
> +
> + /*
> + * L2 ITT size is programmed into the L2DTE regardless of
> + * whether a two-level or linear ITT is built, init it.
> + */
> + itt_l2sz = 0;
> +
> + two_level_itt = gicv5_its_l2sz_two_level(false, idr1, event_id_bits,
> + &itt_l2sz);
> + if (two_level_itt)
> + ret = gicv5_its_create_itt_two_level(its, its_dev, event_id_bits,
> + itt_l2sz,
> + its_dev->num_events);
> + else
> + ret = gicv5_its_create_itt_linear(its, its_dev, event_id_bits);
> + if (ret)
> + return ret;
> +
> + itt_phys_base = two_level_itt ? virt_to_phys(its_dev->itt_cfg.l2.l1itt) :
> + virt_to_phys(its_dev->itt_cfg.linear.itt);
> +
> + itt_struct = two_level_itt ? GICV5_ITS_DT_ITT_CFGR_STRUCTURE_TWO_LEVEL :
> + GICV5_ITS_DT_ITT_CFGR_STRUCTURE_LINEAR;
> +
> + val = FIELD_PREP(GICV5_DTL2E_EVENT_ID_BITS, event_id_bits) |
> + FIELD_PREP(GICV5_DTL2E_ITT_STRUCTURE, itt_struct) |
> + (itt_phys_base & GICV5_DTL2E_ITT_ADDR_MASK) |
> + FIELD_PREP(GICV5_DTL2E_ITT_L2SZ, itt_l2sz) |
> + FIELD_PREP(GICV5_DTL2E_VALID, 0x1);
> +
> + its_write_table_entry(its, dte, val);
> +
> + ret = gicv5_its_device_cache_inv(its, its_dev);
> + if (ret) {
> + gicv5_its_free_itt(its_dev);
> + its_write_table_entry(its, dte, 0);
If it makes no difference, unwind in the reverse order of setup, so swap
the two lines above.
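i.e. (untested):
		its_write_table_entry(its, dte, 0);
		gicv5_its_free_itt(its_dev);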
> + return ret;
> + }
> +
> + return 0;
> +}
> +static struct gicv5_its_dev *gicv5_its_alloc_device(struct gicv5_its_chip_data *its, int nvec,
> + u32 dev_id)
> +{
> + struct gicv5_its_dev *its_dev;
> + void *entry;
> + int ret;
> +
> + its_dev = gicv5_its_find_device(its, dev_id);
> + if (!IS_ERR(its_dev)) {
> + pr_err("A device with this DeviceID (0x%x) has already been registered.\n",
> + dev_id);
> +
> + return ERR_PTR(-EBUSY);
> + }
> +
> + its_dev = kzalloc(sizeof(*its_dev), GFP_KERNEL);
> + if (!its_dev)
> + return ERR_PTR(-ENOMEM);
> +
> + its_dev->device_id = dev_id;
> + its_dev->num_events = nvec;
> +
> + ret = gicv5_its_device_register(its, its_dev);
> + if (ret) {
> + pr_err("Failed to register the device\n");
> + goto out_dev_free;
> + }
> +
> + gicv5_its_device_cache_inv(its, its_dev);
> +
> + its_dev->its_node = its;
> +
> + its_dev->event_map = (unsigned long *)bitmap_zalloc(its_dev->num_events, GFP_KERNEL);
> + if (!its_dev->event_map) {
> + ret = -ENOMEM;
> + goto out_unregister;
> + }
> +
> + entry = xa_store(&its->its_devices, dev_id, its_dev, GFP_KERNEL);
> + if (xa_is_err(entry)) {
> + ret = xa_err(entry);
> + goto out_bitmap_free;
> + }
> +
> + return its_dev;
> +
> +out_bitmap_free:
> + bitmap_free(its_dev->event_map);
> +out_unregister:
> + gicv5_its_device_unregister(its, its_dev);
> +out_dev_free:
> + kfree(its_dev);
> + return ERR_PTR(ret);
> +}
> +
> +static int gicv5_its_msi_prepare(struct irq_domain *domain, struct device *dev,
> + int nvec, msi_alloc_info_t *info)
> +{
> + u32 dev_id = info->scratchpad[0].ul;
> + struct msi_domain_info *msi_info;
> + struct gicv5_its_chip_data *its;
> + struct gicv5_its_dev *its_dev;
> +
> + msi_info = msi_get_domain_info(domain);
> + its = msi_info->data;
> +
> + guard(mutex)(&its->dev_alloc_lock);
> +
> + its_dev = gicv5_its_alloc_device(its, nvec, dev_id);
> + if (IS_ERR(its_dev))
> + return PTR_ERR(its_dev);
> +
> + its_dev->its_trans_phys_base = info->scratchpad[1].ul;
> + info->scratchpad[0].ptr = its_dev;
> +
> + return 0;
> +}
> +
> +static void gicv5_its_msi_teardown(struct irq_domain *domain, msi_alloc_info_t *info)
> +{
> + struct gicv5_its_dev *its_dev = info->scratchpad[0].ptr;
> + struct msi_domain_info *msi_info;
> + struct gicv5_its_chip_data *its;
> +
> + msi_info = msi_get_domain_info(domain);
> + its = msi_info->data;
> +
> + guard(mutex)(&its->dev_alloc_lock);
> +
> + if (WARN_ON_ONCE(!bitmap_empty(its_dev->event_map, its_dev->num_events)))
> + return;
> +
> + gicv5_its_device_unregister(its, its_dev);
> + bitmap_free(its_dev->event_map);
> + xa_erase(&its->its_devices, its_dev->device_id);
I was expecting this to be in the reverse order of what happens in *msi_prepare
(and the *msi_alloc under that). That would give the order:
xa_erase();
bitmap_free();
gicv5_its_device_unregister();
kfree(its_dev);
If there is a reason for this ordering, it might be good to add a comment calling it out.
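Otherwise, something along these lines (untested) would match what I was expecting:
	xa_erase(&its->its_devices, its_dev->device_id);
	bitmap_free(its_dev->event_map);
	gicv5_its_device_unregister(its, its_dev);
	kfree(its_dev);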
> + kfree(its_dev);
> +}