Message-ID: <0e456c4d-aa22-4e7f-9b2c-3059fe840cb9@linux.alibaba.com>
Date:   Tue, 15 Feb 2022 14:38:51 +0800
From:   Heyi Guo <guoheyi@...ux.alibaba.com>
To:     "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Joel Stanley <joel@....id.au>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Dylan Hung <dylan_hung@...eedtech.com>, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: [Issue report] drivers/ftgmac100: DHCP occasionally fails during boot
 up or link down/up

Hi,

We are using the Aspeed 2600 and have found that DHCP occasionally fails
during boot-up or after a link down/up. The DHCP client is
systemd-networkd (systemd 247.6). Our network device is the 2600's MAC4,
connected to an RGMII PHY module.

Our investigation so far shows that the first DHCP discover packet sent
by systemd-networkd may be corrupted. systemd-networkd then keeps sending
DHCP discover packets with the same XID, but no other packets, since no
IP address has been obtained yet. However, the server side does not
respond to this series of DHCP requests until it receives some other
packet. The situation can be recovered by another link down/up, or by a
"ping -I eth0 xxx.xxx.xxx.xxx" command, which inserts some other TX
packets.

Reading the driver code in ftgmac100.c, I have some questions about the
workflow from link down to link up. I think the flow is as follows:

1. ftgmac100_open() enables the net interface with ftgmac100_init_all()
and then calls phy_start().

2. When the PHY link comes up, the PHY layer calls netif_carrier_on() and
then the adjust_link callback, which is ftgmac100_adjust_link() for
ftgmac100.

3. ftgmac100_adjust_link() schedules the reset work
(ftgmac100_reset_task).

4. ftgmac100_reset_task() then resets the MAC.

I found that networkd starts sending the DHCP request immediately after
netif_carrier_on() is called in step 2, but step 4 then resets the MAC,
which may corrupt the packet being sent.
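
To make the suspected ordering problem concrete, here is a small toy model
in Python. It is only a sketch of the sequence described in steps 1-4, not
driver code: the ToyMac class, its methods, and the packet name are all
hypothetical, and it assumes that a MAC reset simply discards whatever is
still queued for transmission (the real hardware may corrupt the frame
rather than drop it).

```python
from dataclasses import dataclass, field


@dataclass
class ToyMac:
    """Toy model of a MAC with a TX ring. Not the real ftgmac100 driver."""
    carrier: bool = False
    reset_pending: bool = False
    tx_ring: list = field(default_factory=list)
    sent: list = field(default_factory=list)
    lost: list = field(default_factory=list)

    def link_up(self):
        # Steps 2-3: carrier is reported first, then the reset work is
        # scheduled to run later.
        self.carrier = True
        self.reset_pending = True

    def xmit(self, pkt):
        # User space may transmit as soon as the carrier is reported.
        if not self.carrier:
            raise RuntimeError("no carrier")
        self.tx_ring.append(pkt)

    def run_reset_work(self):
        # Step 4: the deferred reset. Assumption: anything still queued
        # is discarded by the reset.
        if self.reset_pending:
            self.lost.extend(self.tx_ring)
            self.tx_ring.clear()
            self.reset_pending = False

    def hw_tick(self):
        # The hardware drains the TX ring onto the wire.
        self.sent.extend(self.tx_ring)
        self.tx_ring.clear()


def racy_sequence():
    """Client transmits between carrier-on and the deferred reset."""
    mac = ToyMac()
    mac.link_up()
    mac.xmit("DHCPDISCOVER")   # networkd reacts to the carrier at once
    mac.run_reset_work()       # reset lands on the queued packet
    mac.hw_tick()
    return mac


def ordered_sequence():
    """Reset completes before the first packet is queued."""
    mac = ToyMac()
    mac.link_up()
    mac.run_reset_work()
    mac.xmit("DHCPDISCOVER")
    mac.hw_tick()
    return mac


if __name__ == "__main__":
    print("racy:    sent", racy_sequence().sent, "lost", racy_sequence().lost)
    print("ordered: sent", ordered_sequence().sent, "lost", ordered_sequence().lost)
```

In this model the only difference between the two runs is whether the
first packet is queued before or after the deferred reset, which is the
window I suspect in the real driver.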

Is there anything wrong in this flow? Or am I missing something?

Thanks,

Heyi
