Message-ID: <CBA48B605613B640864883107238931C0E5789@nice.asicdesigners.com>
Date: Thu, 23 Jan 2014 02:17:39 +0000
From: Dimitrios Michailidis <dm@...lsio.com>
To: Gavin Shan <shangw@...ux.vnet.ibm.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH 2/2] net/cxgb4: Don't retrieve stats during recovery
On Wed, Jan 22, 2014 at 5:32 PM, Gavin Shan wrote:
> >On Mon, Jan 20, 2014 at 02:05:18PM -0800, Dimitris Michailidis wrote:
> >>On 01/19/2014 07:05 PM, Gavin Shan wrote:
> >>>We may retrieve the adapter's statistics during EEH recovery,
> >>>which should be disallowed. Otherwise, it could trigger another
> >>>EEH error, and EEH recovery would eventually fail. The patch
> >>>checks whether the PCI device is off-line before retrieving
> >>>statistics.
> >>
> >>The net_devices are detached during EEH so I think
> >>netif_device_present is a better check than pci_channel_offline. I
> >>am not sure such a test should be left to each driver though. If you
> >>do end up putting it in the driver it needs better synchronization
> >>with the EEH handlers as Ben mentioned.
> >>
> >
> >Ok. I agree that netif_device_present() is better since the statistics
> >are net_device specific (rather than pci_dev specific). And it's more
> >accurate to use netif_device_present() based on what we have:
> >
> >  pci_channel_offline()     ----------+
> >  eeh_err_detected()                  |
> >  !netif_device_present()   ----------+-----+
> >  <EEH recovery>                      |     |
> >  !pci_channel_offline()    ----------+     |
> >  eeh_slot_reset()                          |
> >  eeh_resume()                              |
> >  netif_device_present()    ----------------+
> >
> >For the synchronization, I think we can just reuse the "adap->stats_lock".
> >Something like this:
> >
> >static pci_ers_result_t eeh_err_detected(struct pci_dev *pdev,
> >                                         pci_channel_state_t state)
> >{
> >        :
> >        spin_lock(&adap->stats_lock);
> >        for_each_port(adap, i) {
> >                struct net_device *dev = adap->port[i];
> >
> >                netif_device_detach(dev);
> >                netif_carrier_off(dev);
> >        }
> >        spin_unlock(&adap->stats_lock);
> >        :
> >}
> >
> >static void eeh_resume(struct pci_dev *pdev)
> >{
> >        :
> >        spin_lock(&adap->stats_lock);
> >        for_each_port(adap, i) {
> >                struct net_device *dev = adap->port[i];
> >
> >                if (netif_running(dev)) {
> >                        link_start(dev);
> >                        cxgb_set_rxmode(dev);
> >                }
> >                netif_device_attach(dev);
> >        }
> >        spin_unlock(&adap->stats_lock);
> >        :
> >}
Both link_start and cxgb_set_rxmode here issue blocking commands to the FW,
so these two cannot be called under a spinlock. In fact I don't think you
need locking here at all. The devices can be attached asynchronously
relative to the stats code; we don't care if it races. On detach it
matters, but not here.
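
The locking asymmetry described above (lock on detach, no lock on attach)
can be illustrated with a minimal user-space sketch. This is not driver
code: every model_* name is hypothetical, a pthread mutex stands in for
adap->stats_lock, and an atomic flag stands in for what
netif_device_present() reports.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; these are not the cxgb4 driver's types. */
struct model_adapter {
        pthread_mutex_t stats_lock;  /* stands in for adap->stats_lock */
        atomic_bool present;         /* stands in for netif_device_present() */
        uint64_t hw_rx_packets;      /* stands in for a hardware counter */
        unsigned int hw_reads;       /* counts accesses to the "hardware" */
};

static void model_init(struct model_adapter *a)
{
        pthread_mutex_init(&a->stats_lock, NULL);
        atomic_init(&a->present, true);
        a->hw_rx_packets = 0;
        a->hw_reads = 0;
}

/*
 * Error detected: clear the flag while holding the lock, so that no
 * stats read can still be touching the hardware once this returns.
 */
static void model_err_detected(struct model_adapter *a)
{
        pthread_mutex_lock(&a->stats_lock);
        atomic_store(&a->present, false);
        pthread_mutex_unlock(&a->stats_lock);
}

/*
 * Resume: blocking work (the analogue of link_start and cxgb_set_rxmode)
 * would run here with no lock held. Setting the flag afterwards may race
 * with a stats read; the worst case is one more skipped read.
 */
static void model_resume(struct model_adapter *a)
{
        atomic_store(&a->present, true);
}

/* Fills *out and returns true only if the device is marked present. */
static bool model_get_stats(struct model_adapter *a, uint64_t *out)
{
        bool ok = false;

        pthread_mutex_lock(&a->stats_lock);
        if (atomic_load(&a->present)) {
                a->hw_reads++;
                *out = a->hw_rx_packets;
                ok = true;
        }
        pthread_mutex_unlock(&a->stats_lock);
        return ok;
}
```

In this model a stats read that begins after model_err_detected() returns
never reaches the counter, while model_resume() needs no lock because a
reader that still sees the old flag value simply skips one read.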
> >static struct rtnl_link_stats64 *cxgb_get_stats(struct net_device *dev,
> >                                                struct rtnl_link_stats64 *ns)
> >{
> >        :
> >        spin_lock(&adapter->stats_lock);
> >        if (!netif_device_present(dev)) {
> >                spin_unlock(&adapter->stats_lock);
> >                return ns;
> >        }
> >        t4_get_port_stats(adapter, p->tx_chan, &stats);
> >        spin_unlock(&adapter->stats_lock);
> >        :
> >}
> >
>
> Dimitris, Any more comments on this? :-)
Just the above. Thanks.
> If you think it's fine, I'm going to change it like this and send
> out "v2".
>
> Thanks,
> Gavin
>
> >>>
> >>>Signed-off-by: Gavin Shan <shangw@...ux.vnet.ibm.com>
> >>>---
> >>> drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 11 +++++++++++
> >>> 1 file changed, 11 insertions(+)
> >>>
> >>>diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> >>>index c8eafbf..b0e72fb 100644
> >>>--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> >>>+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> >>>@@ -4288,6 +4288,17 @@ static struct rtnl_link_stats64 *cxgb_get_stats(struct net_device *dev,
> >>>         struct port_info *p = netdev_priv(dev);
> >>>         struct adapter *adapter = p->adapter;
> >>>
> >>>+        /*
> >>>+         * We possibly retrieve the statistics while the PCI
> >>>+         * device is off-line. That would cause the recovery
> >>>+         * on off-lined PCI device going to fail. So it's
> >>>+         * reasonable to block it during the recovery period.
> >>>+         */
> >>>+        if (pci_channel_offline(adapter->pdev)) {
> >>>+                memset(ns, 0, sizeof(*ns));
> >>>+                return ns;
> >>>+        }
> >>>+
> >>>         spin_lock(&adapter->stats_lock);
> >>>         t4_get_port_stats(adapter, p->tx_chan, &stats);
> >>>         spin_unlock(&adapter->stats_lock);
> >>>