Date:   Fri, 24 Feb 2023 21:21:32 +0100
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     fk1xdcio@...k.com, netdev@...r.kernel.org
Subject: Re: 4-port ASMedia/RealTek RTL8125 2.5Gbps NIC freezes whole system

On 24.02.2023 15:37, fk1xdcio@...k.com wrote:
> I hope this is the correct place to ask this(?). I'm not sure if my large attachments will come through; this is my first attempt.
> 
> I'm having problems getting this 4-port 2.5Gbps NIC to be stable. I have tried on multiple different physical systems both with Xeon server and i7 workstation chipsets and it behaves the same way on everything. Testing with latest Arch Linux and kernels 6.1, 6.2, and 5.15. I'm using the kernel default r8169 driver.
> 
> The higher the load on the NIC the more likely the whole system freezes hard. Everything freezes including my serial console, SysRq doesn't work, even the motherboard hardware reset switch doesn't work(!). I have to cut power to the system to reset it.
> 
> Disabling IOMMU is more stable but doesn't fix the issue. ASPM doesn't work correctly on this card either despite the ASMedia 1812 supposedly supporting it (lots of corrected PCIe errors). Enabling or disabling ASPM makes no difference.
> 
> "SSU-TECH" (generic/counterfeit?) 4-port 2.5Gbps PCIe x4 card
>   ASMedia ASM1812 PCIe switch (driver: pcieport)
>   RTL8125BG x4 (driver: r8169)
> 
> I have tested with a normal network configuration consisting of multiple machines and also with loopback cables plugging the card's ports into each other.
> 
> I have attached the scripts I use with the loopback cables (crashsys.sh), lspci, and dmesg.
> 
> System freezes almost immediately with:
>   3,1266,4284361895,-;pcieport 0000:04:02.0: Unable to change power state from D3hot to D0, device inaccessible
>    SUBSYSTEM=pci
>    DEVICE=+pci:0000:04:02.0
> 
> If I set permanent D0 mode (power/control=on) then the error is different when the system freezes:
>   r8169 0000:0d:00.0 enp13s0: rtl_chipcmd_cond == 1 (loop: 100, delay: 100).
> 
> Is there anything I can do to get more debugging information? The system locks so hard that I haven't gotten much so far. It's unclear if the problem is happening in the pcieport driver, r8169, or somewhere else.

A network driver shouldn't be able to freeze the whole system. You can test whether the vendor driver r8125 makes a difference.
That should give us an idea of whether the root cause is at a lower level.
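
For reading the quoted log line, it may help to split it into its fields. The record above looks like the /dev/kmsg format: a "priority,sequence,timestamp,flags;message" header followed by indented KEY=value lines. The following is only a minimal sketch under that assumption; the names KmsgRecord and parse_kmsg_record are made up for illustration and are not part of any kernel tool.

```python
from dataclasses import dataclass, field

# Sample record copied from the report above (quote markers removed).
SAMPLE = """3,1266,4284361895,-;pcieport 0000:04:02.0: Unable to change power state from D3hot to D0, device inaccessible
 SUBSYSTEM=pci
 DEVICE=+pci:0000:04:02.0"""


@dataclass
class KmsgRecord:
    """Hypothetical container for one /dev/kmsg-style record."""
    level: int          # syslog level (low 3 bits of the priority field)
    facility: int       # syslog facility (remaining high bits)
    seq: int            # sequence number of the record
    timestamp_us: int   # timestamp in microseconds
    flags: str          # flags field, "-" when none are set
    message: str        # human-readable message text
    properties: dict = field(default_factory=dict)  # KEY=value lines


def parse_kmsg_record(text: str) -> KmsgRecord:
    """Parse one record: a header line plus optional indented KEY=value lines."""
    lines = text.splitlines()
    header, _, message = lines[0].strip().partition(";")
    prio, seq, timestamp, flags = header.split(",")[:4]
    priority = int(prio)
    properties = {}
    for line in lines[1:]:
        key, sep, value = line.strip().partition("=")
        if sep:  # keep only lines that really are KEY=value pairs
            properties[key] = value
    return KmsgRecord(
        level=priority & 7,
        facility=priority >> 3,
        seq=int(seq),
        timestamp_us=int(timestamp),
        flags=flags,
        message=message,
        properties=properties,
    )


record = parse_kmsg_record(SAMPLE)
print(record.level, record.seq, record.properties.get("DEVICE"))
```

Read this way, the leading 3 is the syslog level (error), and the DEVICE line names the PCI device the message is about, 0000:04:02.0, which is bound to pcieport rather than to r8169.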
