Message-ID: <1233c766-0260-497d-8700-87f0f76d2bcd@lunn.ch>
Date: Mon, 19 Aug 2024 18:25:39 +0200
From: Andrew Lunn <andrew@...n.ch>
To: Shane Francis <bigbeeshane@...il.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
	pabeni@...hat.com, mcoquelin.stm32@...il.com,
	linux-arm-kernel@...ts.infradead.org, netdev@...r.kernel.org
Subject: Re: [BUG] net: stmmac: crash within stmmac_rx()

On Mon, Aug 19, 2024 at 01:26:37PM +0100, Shane Francis wrote:
> Summary of the problem:
> ===================
> Crash observed within stmmac_rx when under high RX demand
> 
> Hardware : Rockchip RK3588 platform with an RTL8211F NIC
> 
> the issue seems identical to the one described here :
> https://lore.kernel.org/netdev/20210514214927.GC1969@qmqm.qmqm.pl/T/
> 
> Full description of the problem/report:
> =============================
> I have observed that when under high upload scenarios the stmmac
> driver will crash due to what I think is an overflow error, after some
> debugging I found that stmmac_rx_buf2_len() is returning an
> unexpectedly high value and assigning to buf2_len here
> https://github.com/torvalds/linux/blob/v6.6/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c#L5466
> 
> an example value set that i have observed to causes the crash :
>     buf1_len = 0
>     buf2_len = 4294966330
> 
> from within the stmmac_rx_buf2_len function
>     plen = 2106
>     len = 3072
> 
> the return value would be plen-len or -966 (4294966330 as a uint32
> that matches the buf2_len)
> 
> I am unsure on how to debug this further, would clamping
> stmmac_rx_buf2_len function to return the dma_buf_sz if the return
> value would have otherwise exceeded it ?

Clamping will just paper over the problem, not fix it. You need to
keep debugging to really understand what the issue is.

Clearly len > plen is a problem, so you could add a BUG_ON(len > plen)
which will give you a stack trace. But I doubt that is very
interesting. You probably want to get into stmmac_get_rx_frame_len()
and see how it calculates plen. stmmac obfuscation makes it hard to
say which of:

dwmac4_descs.c:	.get_rx_frame_len = dwmac4_wrback_get_rx_frame_len,
dwxgmac2_descs.c:	.get_rx_frame_len = dwxgmac2_get_rx_frame_len,
enh_desc.c:	.get_rx_frame_len = enh_desc_get_rx_frame_len,
norm_desc.c:	.get_rx_frame_len = ndesc_get_rx_frame_len,

is being used. But they all look pretty similar.
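
To make the arithmetic concrete, here is a minimal user-space sketch of
the wraparound using the values from the report. It is not the driver
code: the helper names below are hypothetical, and the only thing taken
from the report is that the result is plen - len stored in an unsigned
32-bit value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, NOT the stmmac code: mirrors the reported
 * "plen - len" computation using unsigned 32-bit arithmetic. */
uint32_t remaining_len_unchecked(uint32_t plen, uint32_t len)
{
	return plen - len;	/* wraps modulo 2^32 when len > plen */
}

/* Hypothetical checked variant: reports the inconsistency to the
 * caller instead of returning a wrapped (or clamped) length. */
bool remaining_len_checked(uint32_t plen, uint32_t len, uint32_t *out)
{
	if (len > plen)
		return false;	/* the condition BUG_ON(len > plen) would trap */
	*out = plen - len;
	return true;
}

/* Reproduces the numbers from the report: plen = 2106, len = 3072. */
void reproduce_report(void)
{
	uint32_t out = 0;

	/* 2106 - 3072 = -966, which is 4294966330 as an unsigned 32-bit value */
	assert(remaining_len_unchecked(2106u, 3072u) == 4294966330u);
	assert(!remaining_len_checked(2106u, 3072u, &out));
}
```

Calling reproduce_report() from a small main() confirms the reported
buf2_len. The checked variant only detects the bad state; it does not
explain why len ended up larger than plen, which is still the thing to
track down.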

What I find interesting is that both are greater than 1512, a typical
Ethernet frame size. Are you using jumbo packets? Is the hardware
doing some sort of GRO?

      Andrew
