Message-ID: <6af283ea-bd36-44a7-949a-2ab8c80cf136@linux.alibaba.com>
Date: Thu, 22 May 2025 17:50:05 +0800
From: Shuai Xue <xueshuai@...ux.alibaba.com>
To: Lukas Wunner <lukas@...ner.de>,
Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
Bjorn Helgaas <bhelgaas@...gle.com>
Cc: rostedt@...dmis.org, linux-pci@...r.kernel.org,
LKML <linux-kernel@...r.kernel.org>, linux-edac@...r.kernel.org,
linux-trace-kernel@...r.kernel.org, helgaas@...nel.org, bhelgaas@...gle.com,
tony.luck@...el.com, bp@...en8.de, mhiramat@...nel.org,
mathieu.desnoyers@...icios.com, oleg@...hat.com, naveen@...nel.org,
davem@...emloft.net, anil.s.keshavamurthy@...el.com, mark.rutland@....com,
peterz@...radead.org, tianruidong@...ux.alibaba.com
Subject: Re: [PATCH v8] PCI: hotplug: Add a generic RAS tracepoint for hotplug event
On 2025/5/20 21:11, Lukas Wunner wrote:
> On Tue, May 20, 2025 at 03:52:56PM +0300, Ilpo Järvinen wrote:
>> On Tue, 20 May 2025, Lukas Wunner wrote:
>>> A link speed event could contain a "reason" field
>>> which indicates why the link speed changed,
>>> e.g. "hotplug", "autonomous", "thermal", "retrain", etc.
>>>
>>> In other words, instead of mixing the information for hotplug
>>> and link speed events together in one event, a separate link
>>> speed event could point to hotplug as one possible reason for
>>> the new speed.
>>
>> It will be somewhat challenging to link LBMS into what caused it,
>> especially in cases where there is more than one LBMS following a single
>> Link Retraining.
>>
>> Do you have an opinion on whether the event should only be recorded
>> from LBMS/LABS if the speed changed from the previous value? The speed
>> should probably also be reported the first time (initial enumeration,
>> hotplugging a new board).
>
> One idea would be to amend struct pcie_bwctrl_data with an
> enum member describing the reason.
>
> pcie_bwnotif_irq() uses that reason when reporting the speed change
> in a trace event.
>
> After an Endpoint has been removed, the Downstream Port or Root Port
> above resets the reason to "hotplug", so that the next link event
> is assigned that reason.
>
> Similarly pcie_set_target_speed() could be amended with an enum argument
> for the reason and it would set that in struct pcie_bwctrl_data before
> calling pcie_bwctrl_change_speed().
>
> Thanks,
>
> Lukas
Hi Lukas and Ilpo,
Thank you for the discussion.
As @Lukas points out, link speed changes and device plug/unplug events are
orthogonal issues.
Based on the discussion in this thread, I believe we need some additional
work to introduce a new tracepoint (perhaps named PCI_LINK_EVENT) that
handles link speed changes separately.
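To make the "reason" idea quoted above a bit more concrete, here is a
minimal userspace sketch of how the reason could be tracked and reported.
It is only a model under stated assumptions: enum link_reason,
struct bwctrl_model and the model_*() helpers are hypothetical names made up
for illustration, not part of the patch or of the kernel's bwctrl code, and
falling back to "autonomous" after each report is my own assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reasons, taken from the examples given in this thread. */
enum link_reason {
	LINK_REASON_AUTONOMOUS,
	LINK_REASON_HOTPLUG,
	LINK_REASON_THERMAL,
	LINK_REASON_RETRAIN,
	LINK_REASON_COUNT
};

/* Userspace stand-in for the proposed new member of struct pcie_bwctrl_data. */
struct bwctrl_model {
	enum link_reason reason;
};

static const char *link_reason_str(enum link_reason reason)
{
	static const char *const names[LINK_REASON_COUNT] = {
		[LINK_REASON_AUTONOMOUS] = "autonomous",
		[LINK_REASON_HOTPLUG]    = "hotplug",
		[LINK_REASON_THERMAL]    = "thermal",
		[LINK_REASON_RETRAIN]    = "retrain",
	};

	assert(reason < LINK_REASON_COUNT);
	return names[reason];
}

/* Models the port resetting the reason after an Endpoint was removed. */
static void model_endpoint_removed(struct bwctrl_model *data)
{
	data->reason = LINK_REASON_HOTPLUG;
}

/* Models pcie_set_target_speed() gaining a reason argument. */
static void model_set_target_speed(struct bwctrl_model *data,
				   enum link_reason reason)
{
	data->reason = reason;
}

/*
 * Models the bandwidth notification handler: format what a link event
 * might carry, then fall back to "autonomous" (assumption) so that the
 * stored reason is only used for the next event.
 */
static int model_bwnotif(struct bwctrl_model *data, int speed,
			 char *buf, size_t len)
{
	int n = snprintf(buf, len, "speed=%d reason=%s",
			 speed, link_reason_str(data->reason));

	data->reason = LINK_REASON_AUTONOMOUS;
	return n;
}
```

In this model, a removal followed by a speed notification is reported with
reason "hotplug", while a later notification with no recorded cause is
reported as "autonomous". It does not address Ilpo's point about several
LBMS events following a single retrain.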
Regarding next steps, would it be acceptable to merge PCI_HOTPLUG_EVENT
into mainline first, and then work on implementing the new link event
tracepoint afterward?
Best regards,
Shuai