Message-ID: <YuOi3i0XHV++z1YI@kroah.com>
Date: Fri, 29 Jul 2022 11:05:34 +0200
From: Greg KH <gregkh@...uxfoundation.org>
To: Yicong Yang <yangyicong@...wei.com>
Cc: yangyicong@...ilicon.com, alexander.shishkin@...ux.intel.com,
leo.yan@...aro.org, james.clark@....com, will@...nel.org,
robin.murphy@....com, acme@...nel.org, peterz@...radead.org,
corbet@....net, mathieu.poirier@...aro.org, mark.rutland@....com,
jonathan.cameron@...wei.com, john.garry@...wei.com,
helgaas@...nel.org, lorenzo.pieralisi@....com,
suzuki.poulose@....com, joro@...tes.org,
shameerali.kolothum.thodi@...wei.com, mingo@...hat.com,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-pci@...r.kernel.org, linux-perf-users@...r.kernel.org,
iommu@...ts.linux-foundation.org, iommu@...ts.linux.dev,
linux-doc@...r.kernel.org, prime.zeng@...wei.com,
liuqi115@...wei.com, zhangshaokun@...ilicon.com,
linuxarm@...wei.com, bagasdotme@...il.com
Subject: Re: [PATCH v11 2/8] hwtracing: hisi_ptt: Add trace function support
for HiSilicon PCIe Tune and Trace device
On Fri, Jul 29, 2022 at 03:29:14PM +0800, Yicong Yang wrote:
> >> + /*
> >> + * Handle the interrupt on the same cpu which starts the trace to avoid
> >> + * context mismatch. Otherwise we'll trigger the WARN from the perf
> >> + * core in event_function_local().
> >> + */
> >> + WARN_ON(irq_set_affinity(pci_irq_vector(hisi_ptt->pdev, HISI_PTT_TRACE_DMA_IRQ),
> >> + cpumask_of(cpu)));
> >
> > If this hits, you just crashed the machine :(
> >
>
> We'll most likely just get a call trace here, without crashing and rebooting
> the machine in most cases, unless the user has set panic_on_warn.

Again, please do not use WARN_ON for this, please read:
https://elixir.bootlin.com/linux/v5.19-rc8/source/include/asm-generic/bug.h#L74

If you want a traceback (what would you do with that?), then call the
function to give you that. Don't crash people's boxes.
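
For illustration only, here is a minimal userspace sketch of the "handle it
and pass the error up" shape. It is not the driver's code: fake_irq_set_affinity(),
struct fake_ptt and fake_ptt_trace_start() are hypothetical stand-ins so the
example compiles outside the kernel, and the failure condition (a negative
cpu) is an assumption made up for the demo.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the kernel's irq_set_affinity().
 * Assumption for the demo: it fails with -EINVAL for a negative cpu.
 */
static int fake_irq_set_affinity(unsigned int irq, int cpu)
{
	(void)irq;
	return cpu < 0 ? -EINVAL : 0;
}

/* Hypothetical, trimmed-down device state. */
struct fake_ptt {
	unsigned int irq;
	int trace_cpu;	/* CPU that owns the trace, -1 if none */
	int started;
};

/*
 * Bind the interrupt to the CPU that starts the trace. On failure, log
 * the error, leave the device state untouched, and return the error to
 * the caller instead of wrapping the call in WARN_ON().
 */
static int fake_ptt_trace_start(struct fake_ptt *ptt, int cpu)
{
	int ret;

	ret = fake_irq_set_affinity(ptt->irq, cpu);
	if (ret) {
		fprintf(stderr, "failed to set irq %u affinity to cpu %d: %d\n",
			ptt->irq, cpu, ret);
		return ret;
	}

	ptt->trace_cpu = cpu;
	ptt->started = 1;
	return 0;
}
```

In real driver code the fprintf() would be something like dev_err(), and how
the error is surfaced depends on the callback: pmu::start() returns void, so
there the failure has to be recorded in the event state rather than returned.
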
> > Please properly recover from errors if you hit them, like this. Don't
> > just give up and throw a message to userspace and watch the machine
> > reboot with all data lost.
> >
> > Same for the other WARN_ON() instances here. Handle the error and
> > report it properly up the call chain.
> >
>
> The driver uses WARN_ON() in two places, once in pmu::start() and once in the CPU
> teardown callback, both when irq_set_affinity() fails. It is common for drivers to
> behave this way when they fail to set the IRQ affinity in pmu::start() and cpu_teardown():

Don't repeat broken patterns please.
> yangyicong@...ntu:~/mainline_linux/linux/drivers$ grep -rn WARN_ON ./ | grep irq_set_affinity
> ./perf/arm_smmuv3_pmu.c:649: WARN_ON(irq_set_affinity(smmu_pmu->irq, cpumask_of(target)));
> ./perf/arm_smmuv3_pmu.c:895: WARN_ON(irq_set_affinity(smmu_pmu->irq, cpumask_of(smmu_pmu->on_cpu)));
> ./perf/arm-ccn.c:1214: WARN_ON(irq_set_affinity(ccn->irq, cpumask_of(dt->cpu)));
> ./perf/qcom_l2_pmu.c:796: WARN_ON(irq_set_affinity(cluster->irq, cpumask_of(cpu)));
> ./perf/qcom_l2_pmu.c:834: WARN_ON(irq_set_affinity(cluster->irq, cpumask_of(target)));
> ./perf/arm_dmc620_pmu.c:624: WARN_ON(irq_set_affinity(irq->irq_num, cpumask_of(target)));
> ./perf/fsl_imx8_ddr_perf.c:674: WARN_ON(irq_set_affinity(pmu->irq, cpumask_of(pmu->cpu)));
> ./perf/xgene_pmu.c:1793: WARN_ON(irq_set_affinity(xgene_pmu->irq, &xgene_pmu->cpu));
> ./perf/xgene_pmu.c:1826: WARN_ON(irq_set_affinity(xgene_pmu->irq, &xgene_pmu->cpu));
> ./perf/hisilicon/hisi_pcie_pmu.c:658: WARN_ON(irq_set_affinity(pcie_pmu->irq, cpumask_of(cpu)));
> ./perf/hisilicon/hisi_pcie_pmu.c:684: WARN_ON(irq_set_affinity(pcie_pmu->irq, cpumask_of(target)));
> ./perf/hisilicon/hisi_uncore_pmu.c:495: WARN_ON(irq_set_affinity(hisi_pmu->irq, cpumask_of(cpu)));
> ./perf/hisilicon/hisi_uncore_pmu.c:528: WARN_ON(irq_set_affinity(hisi_pmu->irq, cpumask_of(target)));
Great, you can fix all of these up as well any time :)

thanks,

greg k-h