lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <79ada41b-5d10-e063-1d15-3ed5e78f3720@linux.intel.com>
Date: Thu, 17 Oct 2024 16:16:43 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Jonathan Cameron <Jonathan.Cameron@...wei.com>
cc: linux-pci@...r.kernel.org, Bjorn Helgaas <bhelgaas@...gle.com>, 
    Lorenzo Pieralisi <lorenzo.pieralisi@....com>, 
    Rob Herring <robh@...nel.org>, 
    Krzysztof Wilczyński <kw@...ux.com>, 
    "Maciej W. Rozycki" <macro@...am.me.uk>, Lukas Wunner <lukas@...ner.de>, 
    Alexandru Gagniuc <mr.nuke.me@...il.com>, 
    Krishna chaitanya chundru <quic_krichai@...cinc.com>, 
    Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>, 
    "Rafael J. Wysocki" <rafael@...nel.org>, linux-pm@...r.kernel.org, 
    Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>, 
    LKML <linux-kernel@...r.kernel.org>, 
    Daniel Lezcano <daniel.lezcano@...aro.org>, 
    Amit Kucheria <amitk@...nel.org>, Zhang Rui <rui.zhang@...el.com>, 
    Christophe JAILLET <christophe.jaillet@...adoo.fr>
Subject: Re: [PATCH v8 6/8] PCI/bwctrl: Add API to set PCIe Link Speed

On Thu, 17 Oct 2024, Jonathan Cameron wrote:

> On Wed,  9 Oct 2024 12:52:21 +0300
> Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com> wrote:
> 
> > Currently, PCIe Link Speeds are adjusted by custom code rather than in
> > a common function provided in PCI core. PCIe bandwidth controller
> > (bwctrl) introduces an in-kernel API to set PCIe Link Speed.
> > 
> > Convert Target Speed quirk to use the new API. The Target Speed quirk
> > runs very early when bwctrl is not yet probed for a Port and can also
> > run later when bwctrl is already setup for the Port, which requires the
> > per port mutex (set_speed_mutex) to be only taken if the bwctrl setup
> > is already complete.
> > 
> > The new API is also intended to be used in an upcoming commit that adds
> > a thermal cooling device to throttle PCIe bandwidth when thermal
> > thresholds are reached.
> > 
> > The PCIe bandwidth control procedure is as follows. The highest speed
> > supported by the Port and the PCIe device which is not higher than the
> > requested speed is selected and written into the Target Link Speed in
> > the Link Control 2 Register. Then bandwidth controller retrains the
> > PCIe Link.
> > 
> > Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus
> > to keep track PCIe Link Speed changes. While Bandwidth Notifications
> > should also be generated when bandwidth controller alters the PCIe Link
> > Speed, a few platforms do not deliver LMBS interrupt after Link
> > Training as expected. Thus, after changing the Link Speed, bandwidth
> > controller makes additional read for the Link Status Register to ensure
> > cur_bus_speed is consistent with the new PCIe Link Speed.
> > 
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
> 
> Trivial stuff inline.  The mutex_destroy discussion is a just a consistency
> thing given that call is rarely bothered with but here it might help with
> debug.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> 
> Jonathan
> 
> > ---
> >  drivers/pci/pci.h         |  20 +++++
> >  drivers/pci/pcie/bwctrl.c | 161 +++++++++++++++++++++++++++++++++++++-
> >  drivers/pci/quirks.c      |  17 +---
> >  include/linux/pci.h       |  10 +++
> >  4 files changed, 193 insertions(+), 15 deletions(-)
> 
> 
> > diff --git a/drivers/pci/pcie/bwctrl.c b/drivers/pci/pcie/bwctrl.c
> > index 1b11b5da79d4..1d3680ea8e06 100644
> > --- a/drivers/pci/pcie/bwctrl.c
> > +++ b/drivers/pci/pcie/bwctrl.c
> > @@ -7,6 +7,11 @@
> 
> >  static void pcie_bwnotif_enable(struct pcie_device *srv)
> >  {
> >  	struct pcie_bwctrl_data *data = get_service_data(srv);
> > @@ -135,6 +288,7 @@ static int pcie_bwnotif_probe(struct pcie_device *srv)
> >  	if (!data)
> >  		return -ENOMEM;
> >  
> > +	mutex_init(&data->set_speed_mutex);
> >  	set_service_data(srv, data);
> >  
> >  	ret = request_threaded_irq(srv->irq, NULL, pcie_bwnotif_irq_thread,
> > @@ -142,8 +296,10 @@ static int pcie_bwnotif_probe(struct pcie_device *srv)
> >  	if (ret)
> >  		return ret;
> >  
> > -	port->link_bwctrl = no_free_ptr(data);
> > -	pcie_bwnotif_enable(srv);
> > +	scoped_guard(rwsem_write, &pcie_bwctrl_remove_rwsem) {
> 
> Calling it remove_rwsem and using it to protect against not yet
> present seems odd. Maybe rename, pcie_bwctrl_bound_rswem or something like that?

Okay but there's one related issue I need to address:

I came across recursive locking splat from pcie_bwctrl_remove_rwsem
(seems I had lockdep off for some odd reason in my config so it didn't
trigger during my testing until now). The splat is because of:

pcie_set_target_speed()
   scoped_guard(rwsem_read, &pcie_bwctrl_remove_rwsem)  
   -> pcie_bwctrl_change_speed
      -> pcie_retrain_link()
         -> pcie_reset_lbms_count()
            guard(rwsem_read)(&pcie_bwctrl_remove_rwsem);

I wouldn't want to special case pcie_retrain_link() just for BW 
control's sake, I plan to solve that by splitting pcie_bwctrl_remove_rwsem 
to have one for LBMS API and another for set speed API so split it into:

- pcie_bwctrl_setspeed_rwsem
- pcie_bwctrl_lbms_rwsem

It requires taking both in .probe and .remove which is not as elegant I'd 
want but definitely seems more elegant compared with differentiating 
pcie_retrain_link().

If you, for some reason would instead prefer I differentiated 
pcie_retrain_link() for cases when holding pcie_bwctrl_remove_rwsem 
already (or pcie_bwctrl_bound_rwsem as suggested by you), please let me 
know.

> > +		port->link_bwctrl = no_free_ptr(data);
> > +		pcie_bwnotif_enable(srv);
> > +	}
> >  
> >  	pci_dbg(port, "enabled with IRQ %d\n", srv->irq);
> >  
> > @@ -159,6 +315,7 @@ static void pcie_bwnotif_remove(struct pcie_device *srv)
> >  		srv->port->link_bwctrl = NULL;
> >  
> >  	free_irq(srv->irq, srv);
> > +	mutex_destroy(&data->set_speed_mutex);
>
> Probably not worth doing.  Also you don't do error handling for this above.

I'm not sure what error handling you're referring to?

> Ideal is use devm_ for data and then devm_mutex_init()

I've now converted alloc, irq, and mutex to devm. I had them at one point 
but then Lukas said it's the wrong direction consider what lies in future 
but since the future seeming has now changed I've no problem of making our
lives easier with devm. :-)

I also got rid of the service data use for your convinience.

-- 
 i.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ