[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b02f2e6e-5ad2-2e7b-86a5-644f44ecdb6d@amd.com>
Date: Wed, 18 Sep 2024 18:31:27 +0100
From: Alejandro Lucero Palau <alucerop@....com>
To: Wei Huang <wei.huang2@....com>, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
netdev@...r.kernel.org
Cc: Jonathan.Cameron@...wei.com, helgaas@...nel.org, corbet@....net,
davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, alex.williamson@...hat.com, gospo@...adcom.com,
michael.chan@...adcom.com, ajit.khaparde@...adcom.com,
somnath.kotur@...adcom.com, andrew.gospodarek@...adcom.com,
manoj.panicker2@....com, Eric.VanTassell@....com, vadim.fedorenko@...ux.dev,
horms@...nel.org, bagasdotme@...il.com, bhelgaas@...gle.com,
lukas@...ner.de, paul.e.luse@...el.com, jing2.liu@...el.com
Subject: Re: [PATCH V4 11/12] bnxt_en: Add TPH support in BNXT driver
On 9/16/24 19:55, Wei Huang wrote:
>
>
> On 9/11/24 10:37 AM, Alejandro Lucero Palau wrote:
>>
>> On 8/22/24 21:41, Wei Huang wrote:
>>> From: Manoj Panicker <manoj.panicker2@....com>
>>>
>>> Implement TPH support in Broadcom BNXT device driver. The driver uses
>>> TPH functions to retrieve and configure the device's Steering Tags when
>>> its interrupt affinity is being changed.
>>>
>>> Co-developed-by: Wei Huang <wei.huang2@....com>
>>> Signed-off-by: Wei Huang <wei.huang2@....com>
>>> Signed-off-by: Manoj Panicker <manoj.panicker2@....com>
>>> Reviewed-by: Ajit Khaparde <ajit.khaparde@...adcom.com>
>>> Reviewed-by: Somnath Kotur <somnath.kotur@...adcom.com>
>>> Reviewed-by: Andy Gospodarek <andrew.gospodarek@...adcom.com>
>>> ---
>>> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 78
>>> +++++++++++++++++++++++
>>> drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 ++
>>> 2 files changed, 82 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>>> b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>>> index ffa74c26ee53..5903cd36b54d 100644
>>> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>>> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
>>> @@ -55,6 +55,7 @@
>>> #include <net/page_pool/helpers.h>
>>> #include <linux/align.h>
>>> #include <net/netdev_queues.h>
>>> +#include <linux/pci-tph.h>
>>> #include "bnxt_hsi.h"
>>> #include "bnxt.h"
>>> @@ -10821,6 +10822,58 @@ int bnxt_reserve_rings(struct bnxt *bp,
>>> bool irq_re_init)
>>> return 0;
>>> }
>>> +static void __bnxt_irq_affinity_notify(struct
>>> irq_affinity_notify *notify,
>>> + const cpumask_t *mask)
>>> +{
>>> + struct bnxt_irq *irq;
>>> + u16 tag;
>>> +
>>> + irq = container_of(notify, struct bnxt_irq, affinity_notify);
>>> + cpumask_copy(irq->cpu_mask, mask);
>>> +
>>> + if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
>>> + cpumask_first(irq->cpu_mask), &tag))
>>
>>
>> I understand just one cpu from the mask has to be used, but I wonder if
>> some check should be done for ensuring the mask is not mad.
>>
>> This is control path and the related queue is going to be restarted, so
>> maybe a sanity check for ensuring all the cpus in the mask are from the
>> same CCX complex?
>
> I don't think this is always true and we shouldn't warn when this
> happens. There is only one ST can be supported, so the driver need to
> make a good judgement on which ST to be used. But no matter what, ST
> is just a hint - it shouldn't cause any correctness issues in HW, even
> when it is not the optimal target CPU. So warning is unnecessary.
>
1) You can use a "mad" mask for avoiding a specific interrupt to disturb
a specific execution is those cores not part of the mask. But I argue
the ST hint should not be set then.
2) Someone, maybe an automatic script, could try to get the best
performance possible, and a "mad" mask could preclude such outcome
inadvertently.
I agree a warning could not be a good idea because 1, but I would say
adding some way of traceability here could be interesting. A tracepoint
or a new ST field for last hint set for that interrupt/queue.
>>
>> That would be an iteration checking the tag is the same one for all of
>> them. If not, at least a warning stating the tag/CCX/cpu used.
>>
>>
>>> + return;
>>> +
>>> + if (pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag))
>>> + return;
>>> +
>>> + if (netif_running(irq->bp->dev)) {
>>> + rtnl_lock();
>>> + bnxt_close_nic(irq->bp, false, false);
>>> + bnxt_open_nic(irq->bp, false, false);
>>> + rtnl_unlock();
>>> + }
>>> +}
>>> +
>>> +static void __bnxt_irq_affinity_release(struct kref __always_unused
>>> *ref)
>>> +{
>>> +}
>>> +
>>> +static void bnxt_release_irq_notifier(struct bnxt_irq *irq)
>>> +{
>>> + irq_set_affinity_notifier(irq->vector, NULL);
>>> +}
>>> +
>>> +static void bnxt_register_irq_notifier(struct bnxt *bp, struct
>>> bnxt_irq *irq)
>>> +{
>>> + struct irq_affinity_notify *notify;
>>> +
>>> + /* Nothing to do if TPH is not enabled */
>>> + if (!pcie_tph_enabled(bp->pdev))
>>> + return;
>>> +
>>> + irq->bp = bp;
>>> +
>>> + /* Register IRQ affinility notifier */
>>> + notify = &irq->affinity_notify;
>>> + notify->irq = irq->vector;
>>> + notify->notify = __bnxt_irq_affinity_notify;
>>> + notify->release = __bnxt_irq_affinity_release;
>>> +
>>> + irq_set_affinity_notifier(irq->vector, notify);
>>> +}
>>> +
>>> static void bnxt_free_irq(struct bnxt *bp)
>>> {
>>> struct bnxt_irq *irq;
>>> @@ -10843,11 +10896,17 @@ static void bnxt_free_irq(struct bnxt *bp)
>>> free_cpumask_var(irq->cpu_mask);
>>> irq->have_cpumask = 0;
>>> }
>>> +
>>> + bnxt_release_irq_notifier(irq);
>>> +
>>> free_irq(irq->vector, bp->bnapi[i]);
>>> }
>>> irq->requested = 0;
>>> }
>>> +
>>> + /* Disable TPH support */
>>> + pcie_disable_tph(bp->pdev);
>>> }
>>> static int bnxt_request_irq(struct bnxt *bp)
>>> @@ -10870,6 +10929,13 @@ static int bnxt_request_irq(struct bnxt *bp)
>>> if (!(bp->flags & BNXT_FLAG_USING_MSIX))
>>> flags = IRQF_SHARED;
>>> + /* Enable TPH support as part of IRQ request */
>>> + if (pcie_tph_modes(bp->pdev) & PCI_TPH_CAP_INT_VEC) {
>>> + rc = pcie_enable_tph(bp->pdev, PCI_TPH_CAP_INT_VEC);
>>> + if (rc)
>>> + netdev_warn(bp->dev, "failed enabling TPH support\n");
>>> + }
>>> +
>>> for (i = 0, j = 0; i < bp->cp_nr_rings; i++) {
>>> int map_idx = bnxt_cp_num_to_irq_num(bp, i);
>>> struct bnxt_irq *irq = &bp->irq_tbl[map_idx];
>>> @@ -10893,8 +10959,10 @@ static int bnxt_request_irq(struct bnxt *bp)
>>> if (zalloc_cpumask_var(&irq->cpu_mask, GFP_KERNEL)) {
>>> int numa_node = dev_to_node(&bp->pdev->dev);
>>> + u16 tag;
>>> irq->have_cpumask = 1;
>>> + irq->msix_nr = map_idx;
>>> cpumask_set_cpu(cpumask_local_spread(i, numa_node),
>>> irq->cpu_mask);
>>> rc = irq_set_affinity_hint(irq->vector, irq->cpu_mask);
>>> @@ -10904,6 +10972,16 @@ static int bnxt_request_irq(struct bnxt *bp)
>>> irq->vector);
>>> break;
>>> }
>>> +
>>> + bnxt_register_irq_notifier(bp, irq);
>>> +
>>> + /* Init ST table entry */
>>> + if (pcie_tph_get_cpu_st(irq->bp->pdev, TPH_MEM_TYPE_VM,
>>> + cpumask_first(irq->cpu_mask),
>>> + &tag))
>>> + break;
>>> +
>>> + pcie_tph_set_st_entry(irq->bp->pdev, irq->msix_nr, tag);
>>> }
>>> }
>>> return rc;
>>> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
>>> b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
>>> index 6bbdc718c3a7..ae1abcc1bddf 100644
>>> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
>>> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
>>> @@ -1224,6 +1224,10 @@ struct bnxt_irq {
>>> u8 have_cpumask:1;
>>> char name[IFNAMSIZ + 2];
>>> cpumask_var_t cpu_mask;
>>> +
>>> + struct bnxt *bp;
>>> + int msix_nr;
>>> + struct irq_affinity_notify affinity_notify;
>>> };
>>> #define HWRM_RING_ALLOC_TX 0x1
Powered by blists - more mailing lists