Message-ID: <ZzDjBQaO2YjUlsjz@wunner.de>
Date: Sun, 10 Nov 2024 17:44:53 +0100
From: Lukas Wunner <lukas@...ner.de>
To: Shuai Xue <xueshuai@...ux.alibaba.com>
Cc: linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-edac@...r.kernel.org, bhelgaas@...gle.com,
	tony.luck@...el.com, bp@...en8.de
Subject: Re: [RFC PATCH] PCI: pciehp: Generate a RAS tracepoint for hotplug
 event

On Sun, Nov 10, 2024 at 06:12:09PM +0800, Shuai Xue wrote:
> 2024/11/10 01:52, Lukas Wunner:
> > On Fri, Nov 08, 2024 at 11:09:39AM +0800, Shuai Xue wrote:
> > > --- a/drivers/pci/hotplug/pciehp_ctrl.c
> > > +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> > > @@ -19,6 +19,7 @@
> > >   #include <linux/types.h>
> > >   #include <linux/pm_runtime.h>
> > >   #include <linux/pci.h>
> > > +#include <ras/ras_event.h>
> > >   #include "pciehp.h"
> > 
> > Hm, why does the TRACE_EVENT() definition have to live in ras_event.h?
> > Why not, say, in pciehp.h?
> 
> IMHO, it is a RAS-related event, so I added it to ras_event.h,
> similar to other events like aer_event and memory_failure_event.
> 
> I could move it to pciehp.h, if the maintainers prefer that location.

IMO pciehp.h makes more sense than ras/ras_event.h.

The addition of AER to ras/ras_event.h was over a decade ago with
commit 0a2409aad38e ("trace, AER: Move trace into unified interface").
That commit wasn't acked by Bjorn.  It wasn't even cc'ed to linux-pci:

https://lore.kernel.org/all/1402475691-30045-3-git-send-email-gong.chen@linux.intel.com/

I can see a connection between AER and RAS, but PCI hotplug tracepoints
are not exclusively RAS-related; they might be useful for other purposes
as well.  Note that pciehp is used not just on servers but also, e.g.,
for Thunderbolt on mobile devices, and the tracepoints might come in
handy for debugging that.


> > Wouldn't it be more readable to just log the event that occurred
> > as a string, e.g. "Surprise Removal" (and "Insertion" or "Hot Add"
> > for the other trace event you're introducing) instead of the state?
> > 
> > Otherwise you see "ON_STATE" in the log but that's actually the
> > *old* value so you have to mentally convert this to "previously ON,
> > so now must be transitioning to OFF".
> 
> I see your point. "Surprise Removal" or "Insertion" is indeed the exact
> state transition. However, I am concerned that using a string might make
> it difficult for user space tools like rasdaemon to parse.

If this is parsed by a user space daemon, put the enum in a uapi header,
e.g. include/uapi/linux/pci.h.


> How about adding a new enum for state transition? For example:
> 
> 	enum pciehp_trans_type {
> 		PCIEHP_SAFE_REMOVAL,
> 		PCIEHP_SURPRISE_REMOVAL,
> 		PCIEHP_HOT_ADD,
> 	...
> 	};

In that case, I'd suggest adding one enum entry for each of the
ctrl_info() messages, i.e.

Link Up
Link Down
Card present
Card not present

Amend pciehp_handle_presence_or_link_change() with curly braces
around all the affected if-blocks and put a trace event next to the
ctrl_info() message.

Also, since these events are not pciehp-specific, I'd call the enum
something like pci_hotplug_event and the entries PCI_HOTPLUG_...
(or PCI_HP_... if you prefer short names).  These trace events could
in principle be raised by any of the other hotplug drivers in
drivers/pci/hotplug/, not just pciehp.
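
As a rough illustration of the above, here is a user-space C sketch (not
kernel code).  Every identifier in it is hypothetical: the enum name and
PCI_HP_... entries only follow the naming suggested here, the strings
mirror the four ctrl_info() messages, and the trace call is a stub that
merely counts events, standing in for whatever TRACE_EVENT() ends up
being defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical enum; would live in a uapi header if user space parses it. */
enum pci_hotplug_event {
	PCI_HP_LINK_UP,
	PCI_HP_LINK_DOWN,
	PCI_HP_CARD_PRESENT,
	PCI_HP_CARD_NOT_PRESENT,
	PCI_HP_EVENT_COUNT	/* number of entries, not an event itself */
};

/* Human-readable names, matching the existing ctrl_info() messages. */
const char * const pci_hp_event_names[] = {
	[PCI_HP_LINK_UP]          = "Link Up",
	[PCI_HP_LINK_DOWN]        = "Link Down",
	[PCI_HP_CARD_PRESENT]     = "Card present",
	[PCI_HP_CARD_NOT_PRESENT] = "Card not present",
};

static_assert(sizeof(pci_hp_event_names) / sizeof(pci_hp_event_names[0]) ==
	      PCI_HP_EVENT_COUNT, "name table out of sync with enum");

/* Enum to string, e.g. for printing a decoded trace record. */
const char *pci_hp_event_name(enum pci_hotplug_event ev)
{
	if ((unsigned int)ev >= PCI_HP_EVENT_COUNT)
		return "Unknown";
	return pci_hp_event_names[ev];
}

/* String to enum value; returns -1 if the name is not known. */
int pci_hp_event_from_name(const char *name)
{
	for (int i = 0; i < PCI_HP_EVENT_COUNT; i++)
		if (strcmp(name, pci_hp_event_names[i]) == 0)
			return i;
	return -1;
}

/* Stub standing in for the tracepoint: counts how often each event fired. */
unsigned int pci_hp_trace_count[PCI_HP_EVENT_COUNT];

void pci_hp_trace_stub(const char *slot, enum pci_hotplug_event ev)
{
	(void)slot;
	pci_hp_trace_count[ev]++;
}

/* Braces around each if-block, trace call right next to the message. */
void pci_hp_report_bringup(const char *slot, int present, int link_active)
{
	if (present) {
		printf("Slot(%s): %s\n", slot,
		       pci_hp_event_name(PCI_HP_CARD_PRESENT));
		pci_hp_trace_stub(slot, PCI_HP_CARD_PRESENT);
	}
	if (link_active) {
		printf("Slot(%s): %s\n", slot,
		       pci_hp_event_name(PCI_HP_LINK_UP));
		pci_hp_trace_stub(slot, PCI_HP_LINK_UP);
	}
}
```

With an enum like this, the trace record carries the event that actually
happened rather than the old state, and a user space tool can match on
the numeric value instead of parsing a string.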

Thanks,

Lukas
