lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fcfc51c0-6a1f-435b-844b-4daba132f7b6@linux.alibaba.com>
Date: Tue, 22 Jul 2025 10:43:29 +0800
From: Shuai Xue <xueshuai@...ux.alibaba.com>
To: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
Cc: Matthew W Carlis <mattc@...estorage.com>, helgaas@...nel.org,
 Lukas Wunner <lukas@...ner.de>, anil.s.keshavamurthy@...el.com,
 bhelgaas@...gle.com, bp@...en8.de, davem@...emloft.net,
 linux-edac@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
 linux-pci@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
 mark.rutland@....com, mathieu.desnoyers@...icios.com, mhiramat@...nel.org,
 naveen@...nel.org, oleg@...hat.com, peterz@...radead.org,
 rostedt@...dmis.org, tianruidong@...ux.alibaba.com, tony.luck@...el.com
Subject: Re: [PATCH v8] PCI: hotplug: Add a generic RAS tracepoinggt for
 hotplug event



在 2025/7/21 18:18, Ilpo Järvinen 写道:
> On Fri, 18 Jul 2025, Shuai Xue wrote:
>> 在 2025/7/18 11:46, Matthew W Carlis 写道:
>>> On Thu, Jul 17, 2025 Bjorn Helgaas wrote
>>>> So I think your idea of adding current link speed/width to the "Link
>>>> Up" event is still on the table, and that does sound useful to me.
>>>
>>> We're already reading the link status register here to check DLLA so
>>> it would be nice. I guess if everything is healthy we're probably already
>>> at the maximum speed by this point.
>>>
>>>> In the future we might add another tracepoint when we enumerate the
>>>> device and know the Vendor/Device ID.
>>>
>>> I think we might have someone who would be interested in doing it.
>>
>>
>> Hi, all,
>>
>> IIUC, the current hotplug event (or presence event) is enough for Matthew.
>> and we would like a new tracepoing for link speed change which reports
>> speeds.
>>
>> For hotplug event, I plan to send a new version to
>>
>> 1. address Bjorn' concerns about event strings by removing its spaces.
>>
>> #define PCI_HOTPLUG_EVENT
>> \
>> 	EM(PCI_HOTPLUG_LINK_UP,			"PCI_HOTPLUG_LINK_UP")
>> \
>> 	EM(PCI_HOTPLUG_LINK_DOWN,		"PCI_HOTPLUG_LINK_DOWN")
>> \
>> 	EM(PCI_HOTPLUG_CARD_PRESENT,		"PCI_HOTPLUG_CARD_PRESENT")
>> \
>> 	EMe(PCI_HOTPLUG_CARD_NOT_PRESENT,
>> "PCI_HOTPLUG_CARD_NOT_PRESENT")
>>
>> 2. address Ilpo comments by moving pci_hp_event to a common place
>> (include/trace/events/pci.h) so that the new comming can also use it.
> 
> Ah, I only now noticed you've decided to re-place them. Please disregard
> my other comment about this being still open/undecided item.
> 
>> For link speed change event (perhaps named as pci_link_event),
>> I plan to send a seperate patch, which provides:
>>
>> 	TP_STRUCT__entry(
>> 		__string(	port_name,	port_name	)
>> 		__field(	unsigned char,	cur_bus_speed	)
>> 		__field(	unsigned char,	max_bus_speed	)
>>   		__field(	unsigned char,	width		)
>>   		__field(	unsigned int,	flit_mode	)
>> 		__field(	unsigned char,	reason		)
>> 		),
>>
>> The reason field is from Lukas ideas which indicates why the link speed
>> changed, e.g. "hotplug", "autonomous", "thermal", "retrain", etc.
>>
>> Are you happy with above changes?
> 
> Since you're probably quite far with the pcie link event patch too given
> above, could you take a look at the LNKSTA flags representation in my
> patch and incorporate those as well as there seems to always lot of
> uncertainty about those flags when investigating the LBMS/bwctrl related
> issues so it seems prudent to explicitly include them into the traceevent
> output:
> 
> https://lore.kernel.org/linux-pci/7c289bba-3133-0989-6333-41fc41fe3504@linux.intel.com/
> 
> 

Sure, Thank you for the feedback.

I like the LNKSTA flags, LNKSTA flags provides better genericity
compared to the custom reason field I initially proposed. But it may
cause confusion when used in pcie_retrain_link(). However, I've
identified a potential issue when this approach is applied in
pcie_retrain_link() scenarios.

Consider the following trace output when a device hotpluged:

$ cat /sys/kernel/debug/tracing/trace_pipe
$ cat /sys/kernel/debug/tracing/trace_pipe
            <...>-118     [002] .....    28.414220: pci_hp_event: 0000:00:03.0 slot:30, event:PCI_HOTPLUG_CARD_PRESENT

            <...>-118     [002] .....    28.414273: pci_hp_event: 0000:00:03.0 slot:30, event:PCI_HOTPLUG_LINK_UP

    irq/57-pciehp-118     [002] .....    28.540189: pcie_link_event: 0000:00:03.0 type:4, cur_bus_speed:2.5 GT/s PCIe, max_bus_speed:16.0 GT/s PCIe, width:1, flit_mode:0, status:DLLLA

    irq/57-pciehp-118     [002] .....    28.544999: pcie_link_event: 0000:00:03.0 type:4, cur_bus_speed:2.5 GT/s PCIe, max_bus_speed:16.0 GT/s PCIe, width:1, flit_mode:0, status:DLLLA

The problem is that both trace events show status:DLLLA (Data Link Layer
Link Active), which is the direct reading from PCI_EXP_LNKSTA. However,
this doesn't accurately reflect the underlying context:

- First DLLLA: Triggered by board_added() - link establishment after
   card insertion
- Second DLLLA: Triggered by pcie_retrain_link() - link retraining
   completion

( I trace the events in pcie_update_link_speed() )

In the second case, the more relevant status would be PCI_EXP_LNKSTA_LT
(Link Training) to indicate that link retraining was performed, even
though the final register state shows DLLLA.

Question: Should we explicitly report the contextual status (e.g.,
PCI_EXP_LNKSTA_LT for retraining scenarios) rather than always reading
the current register field? This would provide more meaningful trace
information for debugging link state transitions.

Additionally, I'd appreciate your thoughts on the overall tracepoint
format shown above. Does this structure provide sufficient information
for hotplug and link analysis while maintaining readability?

Thanks.
Shuai

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ