Date:   Tue, 21 Aug 2018 23:19:04 +0200
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Jian-Hong Pan <jian-hong@...lessm.com>
Cc:     Steve Dodd <steved424@...il.com>, Lou Reed <gogen@...root.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Experimental fix for MSI-X issue on r8169

On 20.08.2018 05:47, Jian-Hong Pan wrote:
> 2018-08-20 4:34 GMT+08:00 Heiner Kallweit <hkallweit1@...il.com>:
>> The three of you reported an MSI-X-related error when the system
>> resumes from suspend. This has been fixed for now by disabling MSI-X
>> on certain chip versions. However more versions may be affected.
>>
>> I checked with Realtek and they confirmed that on certain chip
>> versions a MSIX-related value in PCI config space is reset when
>> resuming from S3.
>>
>> I would appreciate if you could test the following experimental patch
>> and whether warning "MSIX address lost, re-configuring" appears in
>> your dmesg output after resume from suspend.
>>
>> Thanks a lot for your efforts.
> 
> Tested with the experiment patch on ASUS X441UAR.
> 
> This is the information before suspend:
> 
> dev@...less:~$ dmesg | grep r8169
> [   10.279565] libphy: r8169: probed
> [   10.279947] r8169 0000:02:00.0 eth0: RTL8106e, 0c:9d:92:32:67:b4,
> XID 44900000, IRQ 127
> [   10.445952] r8169 0000:02:00.0 enp2s0: renamed from eth0
> [   15.676229] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
> [   17.455392] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
> flow control off
> 
> dev@...less:~$ ip addr show enp2s0
> 4: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
> state UP group default qlen 1000
>     link/ether 0c:9d:92:32:67:b4 brd ff:ff:ff:ff:ff:ff
>     inet 10.100.13.152/24 brd 10.100.13.255 scope global noprefixroute
> dynamic enp2s0
>        valid_lft 86347sec preferred_lft 86347sec
>     inet6 fe80::2873:a2a9:6ca1:c79d/64 scope link noprefixroute
>        valid_lft forever preferred_lft forever
> 
> This is the information after resume:
> 
> dev@...less:~$ dmesg | grep r8169
> [   10.279565] libphy: r8169: probed
> [   10.279947] r8169 0000:02:00.0 eth0: RTL8106e, 0c:9d:92:32:67:b4,
> XID 44900000, IRQ 127
> [   10.445952] r8169 0000:02:00.0 enp2s0: renamed from eth0
> [   15.676229] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
> [   17.455392] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
> flow control off
> [   95.594265] r8169 0000:02:00.0 enp2s0: Link is Down
> [   96.242074] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
> 
> dev@...less:~$ ip addr show enp2s0
> 4: enp2s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc
> pfifo_fast state DOWN group default qlen 1000
>     link/ether 0c:9d:92:32:67:b4 brd ff:ff:ff:ff:ff:ff
> 
> There is no "MSIX address lost, re-configuring" in dmesg.
> The ethernet interface is still down after resume.
> 

Thanks a lot for testing. Unfortunately I don't have test hardware
affected by this MSI-X issue, so maybe you can help me understand
the issue a little better.

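Below is a patch that prints the MSI-X table entry in three contexts
(probe, suspend, resume). It's not supposed to fix anything. Note that
it also removes the current MSI-X workaround for RTL_GIGA_MAC_VER_40,
so MSI-X is used again on that chip version.
Could you please let me know what the output is on your system?
I want to get an idea of whether the issue clears the complete entry
or just corrupts certain parts of it.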
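
To make the "cleared vs. partly corrupted" question concrete, here is a
minimal userspace sketch that compares two snapshots of the four words
printed by the debug patch. All names in it (msix_entry_compare, the
enums) are hypothetical and not part of r8169 or the PCI core. It ignores
the vector control word for the verdict, on the assumption that the mask
bit may change in normal operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word order matches the data[] array printed by the debug patch. */
enum { ADDR_LO, ADDR_HI, MSG_DATA, VECTOR_CTRL, MSIX_WORDS };

/* Hypothetical classification, for illustration only. */
enum msix_entry_state {
	MSIX_ENTRY_INTACT,	/* address and data unchanged */
	MSIX_ENTRY_CLEARED,	/* address and data now read back as zero */
	MSIX_ENTRY_CHANGED,	/* address or data differ, but are not all zero */
};

/*
 * Compare two snapshots of one MSI-X table entry.
 * If changed_mask is non-NULL, bit i is set when word i differs.
 */
static enum msix_entry_state
msix_entry_compare(const uint32_t before[MSIX_WORDS],
		   const uint32_t after[MSIX_WORDS],
		   unsigned int *changed_mask)
{
	unsigned int mask = 0;
	size_t i;

	assert(before != NULL && after != NULL);

	for (i = 0; i < MSIX_WORDS; i++)
		if (before[i] != after[i])
			mask |= 1u << i;

	if (changed_mask)
		*changed_mask = mask;

	/* Assumption: a vector control change alone is not a fault. */
	mask &= ~(1u << VECTOR_CTRL);
	if (mask == 0)
		return MSIX_ENTRY_INTACT;

	if (after[ADDR_LO] == 0 && after[ADDR_HI] == 0 && after[MSG_DATA] == 0)
		return MSIX_ENTRY_CLEARED;

	return MSIX_ENTRY_CHANGED;
}
```

A difference alone does not prove corruption, since address and data can
be rewritten legitimately between two prints. An entry that reads back
as all zeros would be the clearer sign that it was reset.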

This is what I get on my system (RTL8168E-VL). The four values are
lower address, upper address, data, and vector control. In your case
you'll only get as far as the first suspend.

[    3.743404] r8169 0000:03:00.0: MSI-X entry: context probe: fee01004 0 40ef 1
[   29.539250] r8169 0000:03:00.0: MSI-X entry: context suspend: fee02004 0 4028 0
[   29.837457] r8169 0000:03:00.0: MSI-X entry: context resume: fee01004 0 402b 0
[   36.921370] r8169 0000:03:00.0: MSI-X entry: context suspend: fee01004 0 402b 0
[   37.239407] r8169 0000:03:00.0: MSI-X entry: context resume: fee01004 0 402b 0
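
For reading these values, the following userspace sketch decodes the
fields of one printed entry. It assumes the x86 MSI message layout
(address window 0xFEExxxxx, destination ID in address bits 19:12, vector
in data bits 7:0, delivery mode in data bits 10:8) and the MSI-X
per-vector mask in bit 0 of vector control. The struct and function
names are hypothetical; other architectures encode address and data
differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of one MSI-X table entry (x86 layout assumed). */
struct msix_fields {
	int in_x86_msi_window;		/* address bits 31:20 == 0xFEE */
	unsigned int dest_id;		/* address bits 19:12 */
	unsigned int vector;		/* data bits 7:0 */
	unsigned int delivery_mode;	/* data bits 10:8 */
	int masked;			/* vector control bit 0 */
};

/*
 * entry[] holds the four words in the order the debug patch prints
 * them: lower address, upper address, data, vector control.
 */
static struct msix_fields msix_decode(const uint32_t entry[4])
{
	struct msix_fields f;

	assert(entry != NULL);

	f.in_x86_msi_window = (entry[0] >> 20) == 0xfee;
	f.dest_id = (entry[0] >> 12) & 0xff;
	f.vector = entry[2] & 0xff;
	f.delivery_mode = (entry[2] >> 8) & 0x7;
	f.masked = (entry[3] & 0x1) != 0;

	return f;
}
```

Applied to the probe line above (fee01004 0 40ef 1), this gives
destination ID 1, vector 0xef, and the mask bit set. The first suspend
line (fee02004 0 4028 0) gives destination ID 2, vector 0x28, unmasked.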

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 54f53c8c0..f32645119 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -11,6 +11,7 @@
 #include <linux/module.h>
 #include <linux/moduleparam.h>
 #include <linux/pci.h>
+#include <linux/msi.h>
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
 #include <linux/delay.h>
@@ -6822,6 +6823,20 @@ rtl8169_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	pm_runtime_put_noidle(&pdev->dev);
 }
 
+static void rtl_print_msix_entry(struct rtl8169_private *tp, const char *context)
+{
+	struct msi_desc *desc = first_pci_msi_entry(tp->pci_dev);
+	u32 data[4];
+
+	data[0] = readl(desc->mask_base + PCI_MSIX_ENTRY_LOWER_ADDR);
+	data[1] = readl(desc->mask_base + PCI_MSIX_ENTRY_UPPER_ADDR);
+	data[2] = readl(desc->mask_base + PCI_MSIX_ENTRY_DATA);
+	data[3] = readl(desc->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL);
+
+	dev_info(tp_to_dev(tp), "MSI-X entry: context %s: %x %x %x %x\n",
+		 context, data[0], data[1], data[2], data[3]);
+}
+
 static void rtl8169_net_suspend(struct net_device *dev)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
@@ -6846,9 +6861,12 @@ static int rtl8169_suspend(struct device *device)
 {
 	struct pci_dev *pdev = to_pci_dev(device);
 	struct net_device *dev = pci_get_drvdata(pdev);
+	struct rtl8169_private *tp = netdev_priv(dev);
 
 	rtl8169_net_suspend(dev);
 
+	rtl_print_msix_entry(tp, "suspend");
+
 	return 0;
 }
 
@@ -6875,6 +6893,9 @@ static int rtl8169_resume(struct device *device)
 {
 	struct pci_dev *pdev = to_pci_dev(device);
 	struct net_device *dev = pci_get_drvdata(pdev);
+	struct rtl8169_private *tp = netdev_priv(dev);
+
+	rtl_print_msix_entry(tp, "resume");
 
 	if (netif_running(dev))
 		__rtl8169_resume(dev);
@@ -7075,11 +7096,6 @@ static int rtl_alloc_irq(struct rtl8169_private *tp)
 		RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~MSIEnable);
 		RTL_W8(tp, Cfg9346, Cfg9346_Lock);
 		flags = PCI_IRQ_LEGACY;
-	} else if (tp->mac_version == RTL_GIGA_MAC_VER_40) {
-		/* This version was reported to have issues with resume
-		 * from suspend when using MSI-X
-		 */
-		flags = PCI_IRQ_LEGACY | PCI_IRQ_MSI;
 	} else {
 		flags = PCI_IRQ_ALL_TYPES;
 	}
@@ -7354,6 +7370,8 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		return rc;
 	}
 
+	rtl_print_msix_entry(tp, "probe");
+
 	tp->saved_wolopts = __rtl8169_get_wol(tp);
 
 	mutex_init(&tp->wk.mutex);
-- 
2.18.0
