lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAB4CAwfD2iFN4j8-r_kEgShPxQ98U9gdntWk+3Y3aRCOJhj5mw@mail.gmail.com>
Date:   Wed, 19 Dec 2018 22:37:00 +0800
From:   Chris Chiu <chiu@...lessm.com>
To:     Heiner Kallweit <hkallweit1@...il.com>
Cc:     nic_swsd <nic_swsd@...ltek.com>, davem@...emloft.net,
        netdev@...r.kernel.org,
        Linux Kernel <linux-kernel@...r.kernel.org>,
        Linux Upstreaming Team <linux@...lessm.com>
Subject: Re: A weird problem of Realtek r8168 after resume from S3

On Wed, Dec 19, 2018 at 2:22 AM Heiner Kallweit <hkallweit1@...il.com> wrote:
>
> On 18.12.2018 14:25, Chris Chiu wrote:
> > On Tue, Dec 18, 2018 at 3:08 AM Heiner Kallweit <hkallweit1@...il.com> wrote:
> >>
> >> On 17.12.2018 14:25, Chris Chiu wrote:
> >>> On Fri, Dec 14, 2018 at 3:37 PM Heiner Kallweit <hkallweit1@...il.com> wrote:
> >>>>
> >>>> On 14.12.2018 04:33, Chris Chiu wrote:
> >>>>> On Thu, Dec 13, 2018 at 10:20 AM Chris Chiu <chiu@...lessm.com> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>     We got an acer laptop which has a problem with ethernet networking after
> >>>>>> resuming from S3. The ethernet is popular realtek r8168. The lspci shows as
> >>>>>> follows.
> >>>>>> 02:00.1 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
> >>>>>> RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 12)
> >>>>>>
> >>>> Helpful would be a "dmesg | grep r8169", especially chip name + XID.
> >>>>
> >>> [   22.362774] r8169 0000:02:00.1 (unnamed net_device)
> >>> (uninitialized): mac_version = 0x2b
> >>> [   22.365580] libphy: r8169: probed
> >>> [   22.365958] r8169 0000:02:00.1 eth0: RTL8411, 00:e0:b8:1f:cb:83,
> >>> XID 5c800800, IRQ 38
> >>> [   22.365961] r8169 0000:02:00.1 eth0: jumbo features [frames: 9200
> >>> bytes, tx checksumming: ko]
> >>>
> >> Thanks for the info.
> >>
> >>>>>>     The problem is the ethernet is not accessible after resume. Pinging via
> >>>>>> ethernet always shows the response `Destination Host Unreachable`. However,
> >>>>>> the interesting part is, when I run tcpdump to monitor the problematic ethernet
> >>>>>> interface, the networking is back to alive. But it's dead again after
> >>>>>> I stop tcpdump.
> >>>>>> One more thing, if I ping the problematic machine from others, it achieves the
> >>>>>> same effect as above tcpdump. Maybe it's about the register setting for RX path?
> >>>>>>
> >>>> You could compare the register dumps (ethtool -d) before and after S3 sleep
> >>>> to find out whether there's a difference.
> >>>>
> >>>
> >>> Actually, I just found I lead the wrong direction. The S3 suspend does
> >>> help to reproduce,
> >>> but it's not necessary. All I need to do is ping around 5 mins and the
> >>> network connection
> >>> fails.  And I also find one thing interesting, disabling the  MSI-X
> >>> interrupt like commit
> >>> [d49c88d7677ba737e9d2759a87db0402d5ab2607] can fix this problem.
> >>> Although I don't
> >>> understand the root cause. Anything I can do to help?
> >>>
> >> This is indeed very, very weird. You say switching from MSI-X to MSI fixes
> >> the issue, but also pinging the machine from outside brings back the network.
> >> Both actions affect totally different corners.
> >>
> >> The commit and related issue you mention was a workaround in the driver,
> >> the root cause was a MSI-X-related  issue with certain Intel chipsets deep
> >> in the PCI core. After this was fixed we removed the workaround again.
> >> This shouldn't be related to your issue.
> >>
> >> Hard to say for now is whether the issue is:
> >> - a driver issue
> >> - a hardware issue in the RTL8411
> >> - an issue with the chipset on your mainboard
> >>
> >> According to your description it doesn't take a special scenario to trigger
> >> the issue, so most likely also other users of Acer notebooks with RTL8411
> >> should be affected (after briefly checking this should be at least Aspire
> >> F15, V15, V7). Therefore I wonder why there aren't more reports.
> >>
> >> This commit added MSI-X support: 6c6aa15fdea5 ("r8169: improve interrupt handling")
> >> So you could test this revision and the one before.
> >>
> >> Eventually, if the issue really should be caused by a side effect of using
> >> MSI-X, then the question is whether we need to disable MSI-X for RTL8411
> >> in general or just for RTL8411 and a certain subsystem id.
> >>
> >
> > I tried the kernel with the head on 6c6aa15fdea5 ("r8169: improve
> > interrupt handling"),
> > the problem still there. Then I revert to the previous revision, the
> > problem goes away.
> > So I think it's pretty much the side effect of MSI-X. However, as you
> > mentioned that
> > you didn't hit this problem, I'll ask the vendor to verify if this
> > problem also happens on
> > other machines with the same chip. Then we can determine to disable for specific
> > mac version or just a certain subsystem id.
> >
> Thanks a lot for testing. OK, I have one more idea.
> AFAICS RTL8411 also has an integrated card reader controller which is driven
> by module rtsx_pci. Maybe if both components (card reader controller + ethernet)
> use different interrupt types, RTL8411 can't properly handle this.
> In case module rtsx_pci is loaded on your system, can you check whether not
> loading it (e.g. by blacklisting) or removing it makes a difference?
>

I boot my kernel with rtsx_pci_ms/rtsx_pci on blacklist, but it
doesn't change anything.

> Can you provide the "lspci -v" output for the card reader part of RTL8411?

02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd.
RTL8411B PCI Express Card Reader (rev 01)
        Subsystem: Acer Incorporated [ALI] RTL8411B PCI Express Card Reader
        Flags: bus master, fast devsel, latency 0, IRQ 34
        Memory at f0b05000 (32-bit, non-prefetchable) [size=4K]
        Expansion ROM at f0b10000 [disabled] [size=64K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
        Capabilities: [d0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Virtual Channel
        Capabilities: [160] Device Serial Number 00-00-00-00-00-00-00-00
        Capabilities: [170] Latency Tolerance Reporting
        Capabilities: [178] L1 PM Substates
        Kernel driver in use: rtsx_pci
        Kernel modules: rtsx_pci

>
> >>>>>>     I tried the latest 4.20 rc version but the problem still there. I
> >>>>>> also tried some
> >>>>>> hw_reset or init thing in the resume path but no effect. Any
> >>>>>> suggestion for this?
> >>>>>> Thanks
> >>>>>>
> >>>> Did previous kernel versions work? If it's a regression, a bisect would be
> >>>> appreciated, because with the chip versions I've got I can't reproduce the issue.
> >>>>
> >>>>>> Chris
> >>>>>
> >>>>> Gentle ping. Any additional information required?
> >>>>>
> >>>>> Chris
> >>>>>
> >>>> Heiner
> >>>
> >>
> >
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ