Date: Mon, 1 Jan 2024 19:37:25 +0200 (EET)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Lukas Wunner <lukas@...ner.de>
cc: linux-pci@...r.kernel.org, Bjorn Helgaas <helgaas@...nel.org>, 
    Lorenzo Pieralisi <lorenzo.pieralisi@....com>, 
    Rob Herring <robh@...nel.org>, Krzysztof Wilczyński <kw@...ux.com>, 
    Alexandru Gagniuc <mr.nuke.me@...il.com>, 
    Krishna chaitanya chundru <quic_krichai@...cinc.com>, 
    Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>, 
    "Rafael J . Wysocki" <rafael@...nel.org>, linux-pm@...r.kernel.org, 
    Bjorn Helgaas <bhelgaas@...gle.com>, LKML <linux-kernel@...r.kernel.org>, 
    Alex Deucher <alexdeucher@...il.com>, 
    Daniel Lezcano <daniel.lezcano@...aro.org>, 
    Amit Kucheria <amitk@...nel.org>, Zhang Rui <rui.zhang@...el.com>
Subject: Re: [PATCH v3 07/10] PCI/LINK: Re-add BW notification portdrv as
 PCIe BW controller

On Sat, 30 Dec 2023, Lukas Wunner wrote:

> On Fri, Sep 29, 2023 at 02:57:20PM +0300, Ilpo Järvinen wrote:
> > This mostly reverts b4c7d2076b4e ("PCI/LINK: Remove bandwidth
> > notification"), however, there are small tweaks:
> > 
> > 1) Call it PCIe bwctrl (bandwidth controller) instead of just
> >    bandwidth notifications.
> > 2) Don't print the notifications into kernel log, just keep the current
> >    link speed updated.
> > 3) Use concurrency safe LNKCTL RMW operations.
> > 4) Read link speed after enabling the notification to ensure the
> >    current link speed is correct from the start.
> > 5) Add local variable in probe for srv->port.
> > 6) Handle link speed read and LBMS write race in
> >    pcie_bw_notification_irq().
> > 
> > The reason for 1) is to indicate the increased scope of the driver. A
> > subsequent commit extends the driver to allow controlling PCIe
> > bandwidths from user space upon crossing thermal thresholds.
> > 
> > While 2) is somewhat unfortunate, the log spam was the source of
> > complaints that eventually led to the removal of the bandwidth
> > notifications driver (see the links below for further information).
> > After re-adding this driver, userspace can, if it wishes to,
> > observe the link speed changes using the current bus speed files under
> > sysfs.
> 
> Good commit message.
> 

> > --- /dev/null
> > +++ b/drivers/pci/pcie/bwctrl.c
> 
> > +static void pcie_enable_link_bandwidth_notification(struct pci_dev *dev)
> > +{
> > +	u16 link_status;
> > +	int ret;
> > +
> > +	pcie_capability_write_word(dev, PCI_EXP_LNKSTA, PCI_EXP_LNKSTA_LBMS);
> > +	pcie_capability_set_word(dev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_LBMIE);
> 
> I'm wondering why we're not enabling LABIE as well?
> (And clear LABS.)
> 
> Can't it happen that we miss bandwidth changes unless we enable that
> as well?

Thanks. Reading the spec, it sounds like both are necessary to not miss 
changes.
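
A minimal userspace sketch of what enabling both notifications could look
like. The bit values match include/uapi/linux/pci_regs.h; struct mock_port
and the mock_* helpers are hypothetical stand-ins for the real
pcie_capability_write_word() / pcie_capability_set_word() accessors, and
this is not the code from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_LNKCTL_LBMIE 0x0400 /* Link Bandwidth Management Interrupt Enable */
#define PCI_EXP_LNKCTL_LABIE 0x0800 /* Link Autonomous Bandwidth Interrupt Enable */
#define PCI_EXP_LNKSTA_LBMS  0x4000 /* Link Bandwidth Management Status */
#define PCI_EXP_LNKSTA_LABS  0x8000 /* Link Autonomous Bandwidth Status */

/* Hypothetical stand-in for a port's Link Control / Link Status registers. */
struct mock_port {
	uint16_t lnkctl;
	uint16_t lnksta;
};

/* LBMS and LABS are RW1C: writing 1 clears the bit, writing 0 leaves it. */
static void mock_write_lnksta(struct mock_port *p, uint16_t val)
{
	p->lnksta &= (uint16_t)~(val & (PCI_EXP_LNKSTA_LBMS | PCI_EXP_LNKSTA_LABS));
}

/* Models a set-bits read-modify-write on Link Control. */
static void mock_set_lnkctl(struct mock_port *p, uint16_t bits)
{
	p->lnkctl |= bits;
}

/* Clear any stale status first, then enable both interrupt sources. */
static void enable_bw_notification(struct mock_port *p)
{
	mock_write_lnksta(p, PCI_EXP_LNKSTA_LBMS | PCI_EXP_LNKSTA_LABS);
	mock_set_lnkctl(p, PCI_EXP_LNKCTL_LBMIE | PCI_EXP_LNKCTL_LABIE);
}

/* Usage: status bits end up clear, both enables set, other bits untouched. */
static void enable_bw_notification_demo(void)
{
	struct mock_port p = { .lnkctl = 0x0040, .lnksta = 0xC012 };

	enable_bw_notification(&p);
	assert(p.lnksta == 0x0012);
	assert(p.lnkctl == 0x0C40);
}
```

The IRQ handler would then presumably have to test and clear both LBMS
and LABS as well.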

> > +static int pcie_bandwidth_notification_probe(struct pcie_device *srv)
> > +{
> > +	struct pci_dev *port = srv->port;
> > +	int ret;
> > +
> > +	/* Single-width or single-speed ports do not have to support this. */
> > +	if (!pcie_link_bandwidth_notification_supported(port))
> > +		return -ENODEV;
> 
> I'm wondering if this should be checked in get_port_device_capability()
> instead?

I can move the check there.
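
Roughly, the idea would be to advertise the service only when the port
reports the capability. A userspace sketch, assuming
pcie_link_bandwidth_notification_supported() boils down to testing the
LBNC bit in Link Capabilities; MOCK_SERVICE_BWCTRL and
mock_port_services() are hypothetical names, not portdrv's:

```c
#include <assert.h>
#include <stdint.h>

/* Link Bandwidth Notification Capability, from include/uapi/linux/pci_regs.h. */
#define PCI_EXP_LNKCAP_LBNC 0x00200000

/* Hypothetical service bit; the real portdrv value may differ. */
#define MOCK_SERVICE_BWCTRL (1u << 4)

/*
 * Models the capability scan: add the bandwidth controller service to the
 * mask only if the Link Capabilities register advertises LBNC.
 */
static unsigned int mock_port_services(uint32_t lnkcap, unsigned int services)
{
	if (lnkcap & PCI_EXP_LNKCAP_LBNC)
		services |= MOCK_SERVICE_BWCTRL;
	return services;
}

/* Usage: the service bit appears only for LBNC-capable ports. */
static void mock_port_services_demo(void)
{
	assert(mock_port_services(PCI_EXP_LNKCAP_LBNC, 0x1u) == 0x11u);
	assert(mock_port_services(0, 0x1u) == 0x1u);
}
```

With the check done at that point, probe would no longer need its own
-ENODEV exit for ports lacking the capability.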

> > +	ret = request_irq(srv->irq, pcie_bw_notification_irq,
> > +			  IRQF_SHARED, "PCIe BW ctrl", srv);
> 
> Is there a reason to run the IRQ handler in hardirq context
> or would it work to run it in an IRQ thread?  Usually on systems
> that enable PREEMPT_RT, a threaded IRQ handler is preferred,
> so unless hardirq context is necessary, I'd recommend using
> an IRQ thread.

Can I somehow defer the decision between IRQ_NONE / IRQ_HANDLED
to the thread_fn? One LNKSTA read is necessary to decide 
that.

I suppose the other write + reread of LNKSTA could be moved into
thread_fn even if the first read would not be movable.
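
A rough userspace model of that split, as it might be wired up with
request_threaded_irq(): the primary handler does the single LNKSTA read
and returns either IRQ_NONE or IRQ_WAKE_THREAD, and the thread function
does the LBMS clear plus the re-read. The mock_* names, the enum and
struct mock_port are hypothetical stand-ins for the kernel types; the
register bit values are from include/uapi/linux/pci_regs.h:

```c
#include <assert.h>
#include <stdint.h>

#define PCI_EXP_LNKSTA_CLS  0x000f /* Current Link Speed */
#define PCI_EXP_LNKSTA_LBMS 0x4000 /* Link Bandwidth Management Status */

/* Local mirror of the kernel's irqreturn values, for illustration only. */
enum mock_irqreturn { MOCK_IRQ_NONE, MOCK_IRQ_HANDLED, MOCK_IRQ_WAKE_THREAD };

struct mock_port {
	uint16_t lnksta;    /* simulated Link Status register */
	uint16_t cur_speed; /* cached Current Link Speed field */
};

/* Primary handler: one LNKSTA read decides whether the IRQ is ours. */
static enum mock_irqreturn bw_irq_primary(struct mock_port *p)
{
	if (!(p->lnksta & PCI_EXP_LNKSTA_LBMS))
		return MOCK_IRQ_NONE;
	return MOCK_IRQ_WAKE_THREAD;
}

/* Thread function: clear LBMS (RW1C), then re-read LNKSTA for the speed. */
static enum mock_irqreturn bw_irq_thread(struct mock_port *p)
{
	p->lnksta &= (uint16_t)~PCI_EXP_LNKSTA_LBMS;
	p->cur_speed = p->lnksta & PCI_EXP_LNKSTA_CLS;
	return MOCK_IRQ_HANDLED;
}

/* Models how the IRQ core chains the two callbacks. */
static enum mock_irqreturn mock_dispatch(struct mock_port *p)
{
	enum mock_irqreturn ret = bw_irq_primary(p);

	if (ret == MOCK_IRQ_WAKE_THREAD)
		ret = bw_irq_thread(p);
	return ret;
}

/* Usage: a foreign interrupt is rejected, ours updates the cached speed. */
static void mock_dispatch_demo(void)
{
	struct mock_port other = { .lnksta = 0x0003, .cur_speed = 0 };
	struct mock_port ours = { .lnksta = 0x4003, .cur_speed = 0 };

	assert(mock_dispatch(&other) == MOCK_IRQ_NONE);
	assert(mock_dispatch(&ours) == MOCK_IRQ_HANDLED);
	assert(ours.cur_speed == 3);
}
```

One caveat with this shape: on a shared, level-triggered line the source
would likely still need to be quiesced before the primary handler
returns, so how much can really move into thread_fn remains an open
question.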


-- 
 i.
