lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 24 May 2019 19:55:59 +0200
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Paul Menzel <pmenzel@...gen.mpg.de>,
        Realtek linux nic maintainers <nic_swsd@...ltek.com>
Cc:     netdev@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: r8169: Link only up after 16 s (A link change request failed with
 some changes committed already. Interface enp3s0 may have been left with an
 inconsistent configuration, please check.)

On 24.05.2019 17:14, Paul Menzel wrote:
> Dear Heiner,
> 
> 
> Thank you for the quick reply.
> 
> On 05/23/19 19:44, Heiner Kallweit wrote:
>> On 23.05.2019 13:00, Paul Menzel wrote:
> 
>>> I optimized the Linux kernel configuration on my ASRock E350M1, and it now
>>> boots really fast.
>>>
>>> Unfortunately, that seems to cause the network driver to hit some corner
>>> case, so that the link is supposedly down, although it should be up. The
>>> cable is plugged in the whole time.
>>>
>>> ```
>>> [    2.990757] libphy: r8169: probed
>>> [    2.992661] r8169 0000:03:00.0 eth0: RTL8168e/8111e, bc:5f:f4:c8:d3:98, XID 2c2, IRQ 28
>>> [    2.992669] r8169 0000:03:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
>>> [    3.294484] usb 5-2: new low-speed USB device number 2 using ohci-pci
>>> [    3.458711] usb 5-2: New USB device found, idVendor=1241, idProduct=1122, bcdDevice= 1.00
>>> [    3.458718] usb 5-2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
>>> [    3.485065] input: HID 1241:1122 as /devices/pci0000:00/0000:00:12.0/usb5/5-2/5-2:1.0/0003:1241:1122.0001/input/input14
>>> [    3.485320] hid-generic 0003:1241:1122.0001: input,hidraw0: USB HID v1.00 Mouse [HID 1241:1122] on usb-0000:00:12.0-2/input0
>>> [    3.967622] random: crng init done
>>> [    3.967628] random: 7 urandom warning(s) missed due to ratelimiting
>>> [    4.323449] r8169 0000:03:00.0 enp3s0: renamed from eth0
>>> [    4.363774] RTL8211DN Gigabit Ethernet r8169-300:00: attached PHY driver [RTL8211DN Gigabit Ethernet] (mii_bus:phy_addr=r8169-300:00, irq=IGNORE)
>>> [    4.576887] r8169 0000:03:00.0 enp3s0: Link is Down
>>> [    4.577167] A link change request failed with some changes committed already. Interface enp3s0 may have been left with an inconsistent configuration, please check.
>>> [   16.377520] r8169 0000:03:00.0 enp3s0: Link is Up - 100Mbps/Full - flow control rx/tx
>>> [   16.377553] IPv6: ADDRCONF(NETDEV_CHANGE): enp3s0: link becomes ready
>>> ```
>>>
>>> It happens with all Linux kernels I tried. Please find all Linux
>>> messages attached.
>>>
>>> Could you please tell me, how this can be debugged and solved?
>>>
>> This warning is triggered by errors in do_setlink() in net/core/rtnetlink.c
>> I'd say:
>> 1. Which kernel config options did you change as part of the optimization?
>>    (If I understand you correctly the warning didn't pop up before.)
> 
> Sorry for being unclear. The same problem happens with Debian
> Sid/unstable’s default Linux kernel 4.19.0-5 (4.19.37) [1]. With the fast
> boot (six seconds from pressing power button to Weston) I just noticed, that
> the network was not set up when wanting to use it.
> 
>> 2. Try to find out which call in do_setlink() fails and which errno
>> is returned.
> 
> Yeah, that’s where I need help. ;-)
> 
> I applied the simple change below to `net/core/rtnetlink.c`.
> 
>                 if (err < 0)
> -                       net_warn_ratelimited("A link change request failed with some changes committed already. Interface %s may have been left with an inconsistent configuration, please check.\n",
> -                                            dev->name);
> +                       net_warn_ratelimited("A link change request failed with some changes committed already (err = %i). Interface %s may have been left with an inconsistent configuration, please check.\n",
> +                                            dev->name, err);
> 
> I get different results each time.
> 
> -304123904
> -332128256
> 
> Any idea, how that can happen?
> 
Instead of %i you should use %d, and the order of arguments needs to be reversed.
But this won't help you to find out which call in do_setlink() failed.
There are several occurrences of code like this:

err = fct();
if (err < 0)
	goto errout;

Best change each such occurrence to the following to find out which call failed.

err = fct();
if (err < 0) {
	pr_err("do_setlink: <fct>: err %d\n", err)
	goto errout;
}

> Is there a better way to debug `do_setlink()` in `rtnetlink.c` than
> printf debugging?
> 
Not really
> 
> Kind regards,
> 
> Paul
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ