netdev - RE: Issue with AMD Xilinx AXI Ethernet (xilinx_axienet) on MicroBlaze: Packets only received after some buffer is full

Message-ID:
 <MN0PR12MB59537EB05F07459513A2301EB7AE2@MN0PR12MB5953.namprd12.prod.outlook.com>
Date: Thu, 3 Apr 2025 05:54:05 +0000
From: "Pandey, Radhey Shyam" <radhey.shyam.pandey@....com>
To: Álvaro G. M. <alvaro.gamez@...ent.com>, Jakub Kicinski
	<kuba@...nel.org>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>, "Katakam, Harini"
	<harini.katakam@....com>, "Gupta, Suraj" <Suraj.Gupta2@....com>
Subject: RE: Issue with AMD Xilinx AXI Ethernet (xilinx_axienet) on
 MicroBlaze: Packets only received after some buffer is full

[AMD Official Use Only - AMD Internal Distribution Only]

> -----Original Message-----
> From: Álvaro G. M. <alvaro.gamez@...ent.com>
> Sent: Thursday, April 3, 2025 11:15 AM
> To: Jakub Kicinski <kuba@...nel.org>
> Cc: netdev@...r.kernel.org; Pandey, Radhey Shyam
> <radhey.shyam.pandey@....com>
> Subject: Re: Issue with AMD Xilinx AXI Ethernet (xilinx_axienet) on MicroBlaze:
> Packets only received after some buffer is full
>
> Hi
>
>
> On Wed, 2025-04-02 at 10:00 -0700, Jakub Kicinski wrote:
> > +CC Radhey, maintainer of axienet
>
> Thanks, I don't know why I didn't think of that.
>
> So, I can provide a little more information and I definitely believe now there are some
> issues with this driver.
>
> > On Tue, 01 Apr 2025 12:52:15 +0200 Álvaro "G. M." wrote:
> > > I guess I may have made some mistake in upgrading the DTS to the new
> > > format, although I've tried the two available methods (either setting
> > > node "dmas" or using "axistream-connected" property) and both methods
> > > result in the same boot messages and behavior.
>
> This turned out not to be true, I'm sorry for the confusion. Using node "dmas"
> enables use_dmaengine and produces the effect I explained: data is only received
> after a 2^17-byte buffer is filled.
>
> If I remove the "dmas" entry and provide an "axistream-connected" one, things get a
> little better (but see the end for some DTS notes). In this mode, which uses the
> legacy DMA code inside axienet itself rather than dmaengine, tcpdump -vv shows
> packets incoming at a normal rate. However, the system is not answering ARP requests:
>
> 00:02:37.800814 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.188.140.2 tell 10.188.139.1, length 46
> 00:02:38.801974 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.188.140.2 tell 10.188.139.1, length 46
> 00:02:39.804137 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.188.140.2 tell 10.188.139.1, length 46
> 00:02:40.806434 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.188.140.2 tell 10.188.139.1, length 46
> 00:02:41.808084 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.188.140.2 tell 10.188.139.1, length 46
> 00:02:42.810592 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.188.140.2 tell 10.188.139.1, length 46
> 00:02:43.813155 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.188.140.2 tell 10.188.139.1, length 46
>
> Here's the normal answer for a second device running old 4.4.43 kernel connected to
> the same switch:
>
> 00:21:12.057326 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.188.140.1 tell 10.188.139.1, length 46
> 00:21:12.057905 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.188.140.1 is-at 06:00:0a:bc:8c:01 (oui Unknown), length 28
> 00:21:13.059460 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.188.140.1 tell 10.188.139.1, length 46
> 00:21:13.060031 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.188.140.1 is-at 06:00:0a:bc:8c:01 (oui Unknown), length 28
> 00:21:14.060502 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.188.140.1 tell 10.188.139.1, length 46
> 00:21:14.061051 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.188.140.1 is-at 06:00:0a:bc:8c:01 (oui Unknown), length 28
>
> The funny thing is that once I manually add arp entries in both my computer and the
> embedded one, I can establish full TCP communication and iperf3 shows a relatively
> nice speed, similar to the throughput I get with old 4.4.43 kernel.
>
> # arp -s 10.188.139.1 f4:4d:ad:02:11:29
> # iperf3 -c 10.188.139.1
> Connecting to host 10.188.139.1, port 5201
> [  5] local 10.188.140.2 port 55480 connected to 10.188.139.1 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.01   sec  3.63 MBytes  30.1 Mbits/sec    0    130 KBytes
> [  5]   1.01-2.01   sec  3.75 MBytes  31.5 Mbits/sec    0    130 KBytes
> [  5]   2.01-3.01   sec  3.63 MBytes  30.4 Mbits/sec    0    130 KBytes
> [  5]   3.01-4.01   sec  3.75 MBytes  31.4 Mbits/sec    0    130 KBytes
> [  5]   4.01-5.01   sec  3.75 MBytes  31.4 Mbits/sec    0    130 KBytes
> [  5]   5.01-6.01   sec  3.75 MBytes  31.5 Mbits/sec    0    130 KBytes
> [  5]   6.01-7.01   sec  3.75 MBytes  31.6 Mbits/sec    0    130 KBytes
> [  5]   7.01-7.75   sec  2.63 MBytes  29.5 Mbits/sec    0    130 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-7.75   sec  28.6 MBytes  31.0 Mbits/sec    0            sender
> [  5]   0.00-7.75   sec  0.00 Bytes  0.00 bits/sec                  receiver
> iperf3: interrupt - the client has terminated
> # iperf3 -c 10.188.139.1 -R
> Connecting to host 10.188.139.1, port 5201
> Reverse mode, remote host 10.188.139.1 is sending
> [  5] local 10.188.140.2 port 45582 connected to 10.188.139.1 port 5201
> [ ID] Interval           Transfer     Bitrate
> [  5]   0.00-1.03   sec  5.13 MBytes  41.9 Mbits/sec
> [  5]   1.03-2.03   sec  5.38 MBytes  44.8 Mbits/sec
> [  5]   2.03-3.02   sec  5.38 MBytes  45.6 Mbits/sec
> [  5]   3.02-4.02   sec  5.38 MBytes  45.2 Mbits/sec
> [  5]   4.02-5.01   sec  5.38 MBytes  45.4 Mbits/sec
> [  5]   5.01-5.30   sec  1.50 MBytes  43.2 Mbits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate
> [  5]   0.00-5.30   sec  0.00 Bytes  0.00 bits/sec                  sender
> [  5]   0.00-5.30   sec  28.1 MBytes  44.5 Mbits/sec                  receiver
> iperf3: interrupt - the client has terminated
>
> I had never seen a device able to fully establish communication except for replying to
> MAC requests, so I'm not sure what's happening here.
>
>
> On the other hand, and since I don't know how to debug this ARP issue, I went back
> to see if I could diagnose what's happening in DMA Engine mode, so I peeked at the
> code and I saw an asymmetry between RX and TX, which sounded good given that
> in dmaengine mode TX works perfectly (or so it seems) and RX is heavily buffered.
> This asymmetry lies precisely in the number of SG blocks and the number of skb
> buffers.
>
> Both bd_nums are defined in the same way:
>         lp->rx_bd_num = RX_BD_NUM_DEFAULT; // = 1024
>         lp->tx_bd_num = TX_BD_NUM_DEFAULT; // = 128
>
>
> But the skb ring size is defined in a different fashion:
>         lp->tx_skb_ring = kcalloc(TX_BD_NUM_MAX, sizeof(*lp->tx_skb_ring), // = 4096
>                                   GFP_KERNEL);
>       ...
>         lp->rx_skb_ring = kcalloc(RX_BUF_NUM_DEFAULT, sizeof(*lp->rx_skb_ring), // = 128
>                                   GFP_KERNEL);
>
> So, for TX we allocate space for up to 4096 buffers but by default use 128.
> For RX we allocate space for 128 buffers but somehow are setting 1024 as the
> default bd number.
>
> The fact that RX_BD_NUM_DEFAULT is used nowhere else is also a signal that
> there was some mistake here, so I went and replaced all RX_BUF_NUM_DEFAULT
> occurrences with RX_BD_NUM_DEFAULT, so that both TX and RX skb rings are
> declared and operated using the same strategy:
>
>   sed -i '/^#define/!s#RX_BUF_NUM_DEFAULT#RX_BD_NUM_MAX#g' xilinx_axienet_main.c
>
> Doing this solved the buffering problem, although the system still doesn't reply to
> ARP requests, and when I tried to run an iperf3 test after manually adding ARP entries,
> the kernel segfaulted (so I probably shouldn't have blindly 'sed' like that :)
>
> # iperf3 -c 10.188.139.1
> Connecting to host 10.188.139.1, port 5201
> [  5] local 10.188.140.2 port 46356 connected to 10.188.139.1 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.01   sec   640 KBytes  5.18 Mbits/sec    3   84.8 KBykernel task_size
> exceed
> Oops: Exception in kernel mode, sig: 11
> CPU: 0 UID: 0 PID: 147 Comm: iperf3 Not tainted 6.13.8 #13
> Registers dump: mode=8269B900
> r1=00000000, r2=00000000, r3=00000000, r4=00000010
> r5=00000000, r6=000005F2, r7=FFFF7FFF, r8=00000000
> r9=00000000, r10=00000000, r11=00000000, r12=CF5FF24C
> r13=00000000, r14=C241AB70, r15=C0383EB8, r16=00000000
> r17=C0383EC0, r18=000005F0, r19=C10124A0, r20=480F8520
> r21=4831F960, r22=00000000, r23=00000000, r24=FFFFFFEA
> r25=C12BE0A8, r26=C12BE03C, r27=C12BE020, r28=00000122
> r29=00000100, r30=000065A2, r31=C120F780, rPC=C0383EC0
> msr=000046A2, ear=FFFFFFFA, esr=00000312, fsr=00000000
> Kernel panic - not syncing: Aiee, killing interrupt handler!
> ---[ end Kernel panic - not syncing: Aiee, killing interrupt handler! ]---
> tes
>
> I couldn't see what was wrong with the new code, so I just reduced the
> RX_BD_NUM_DEFAULT value from 1024 to 128, so it's now the same size as
> its TX counterpart, but the kernel segfaulted again when trying to measure
> throughput. Sadly, my kernel debugging abilities are not much stronger than this, so
> I'm stuck at this point but firmly believe there's something wrong here, although I
> can't see what it is.
>
> Any help will be greatly appreciated.
>
>
> DTS NOTES:
> Using the old DMA code inside xilinx_axienet_main.c requires removing the "dmas"
> entry and adding a reference to the DMA device, either via axistream-connected or by
> adding resources manually to the node. Referring to the node linked by
> axistream-connected requires a DMA node to exist, but its compatible string can't be
> xlnx,axi-dma-1.00.a, because then the AXI DMA driver will bind to it and axienet will
> complain about the device being busy. So my solution for this is to use a compatible
> string that does not match. As such, with the following DTS I can establish TCP
> connections as long as ARP tables are manually entered:
>
>
> axi_ethernet_0_dma: dma@...00000 {
>       /* NOTE THE NOT */
>       compatible = "notxlnx,axi-dma-1.00.a";
>       #dma-cells = <1>;
>       reg = <0x41e00000 0x10000>;
>       interrupt-parent = <&microblaze_0_axi_intc>;
>       interrupts = <7 1 8 1>;
>       xlnx,addrwidth = <32>;  // Address width in bits
>       xlnx,datawidth = <32>;
>       xlnx,include-sg;
>       xlnx,sg-length-width = <16>;
>       xlnx,include-dre = <1>;
>       xlnx,axistream-connected = <1>;
>       xlnx,irq-delay = <1>;
>       dma-channels = <2>;
>       clock-names = "s_axi_lite_aclk", "m_axi_mm2s_aclk", "m_axi_s2mm_aclk",
>                     "m_axi_sg_aclk";
>       clocks = <&clk_bus_0>, <&clk_bus_0>, <&clk_bus_0>, <&clk_bus_0>;
>       dma-channel@...00000 {
>               compatible = "xlnx,axi-dma-mm2s-channel";
>               xlnx,include-dre = <1>;
>               interrupts = <7 1>;
>               xlnx,datawidth = <32>;
>       };
>       dma-channel@...00030 {
>               compatible = "xlnx,axi-dma-s2mm-channel";
>               xlnx,include-dre = <1>;
>               interrupts = <8 1>;
>               xlnx,datawidth = <32>;
>       };
> };
> axi_ethernet_eth: ethernet@...00000 {
>       compatible = "xlnx,axi-ethernet-1.00.a";
>       reg = <0x40c00000 0x40000>;
>       phy-handle = <&phy1>;
>       interrupt-parent = <&microblaze_0_axi_intc>;
>       interrupts = <3 0>;
>       xlnx,rxmem = <0x1000>;
>       max-speed = <100000>;
>       phy-mode = "mii";
>       xlnx,txcsum = <0x2>;
>       xlnx,rxcsum = <0x2>;
>       clock-names = "s_axi_lite_clk", "axis_clk", "ref_clk", "mgt_clk";
>       clocks = <&clk_bus_0>, <&clk_bus_0>, <&clk_bus_0>, <&clk_bus_0>;
>       axistream-connected = <&axi_ethernet_0_dma>;
>       dma-names = "tx_chan0", "rx_chan0";
>       mdio {
>               #address-cells = <1>;
>               #size-cells = <0>;
>               phy1: ethernet-phy@1 {
>                       device_type = "ethernet-phy";
>                       reg = <1>;
>               };
>       };
> };
>
> So this mode of working definitely does NOT need the AXI DMA driver, and this hack
> with the compatible string would not be needed if the dependency on AXI DMA were
> removed.
>
> Best regards,

I'm going through the details and will get back to you. Just to confirm: there is no
Vivado design update, and only the Linux kernel is being updated to the latest version?

Thanks,
Radhey
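
For reference, the ring-size asymmetry described in the quoted message can be
written down as a small standalone C sketch. The four constants are copied from
the quoted text and have not been re-checked against any kernel tree; struct
ring_cfg and ring_headroom() are hypothetical names used only for illustration
and do not exist in the driver. The sketch only compares the numbers. Whether a
negative result is an actual bug depends on how the driver indexes the skb ring
in dmaengine mode, which the thread leaves open.

```c
#include <assert.h>
#include <stddef.h>

/* Values as quoted in the message above; not re-checked against a kernel tree. */
#define TX_BD_NUM_DEFAULT   128
#define RX_BD_NUM_DEFAULT   1024
#define TX_BD_NUM_MAX       4096
#define RX_BUF_NUM_DEFAULT  128

/* Hypothetical illustration type; the driver has no such struct. */
struct ring_cfg {
	unsigned int bd_num;     /* default number of buffer descriptors */
	unsigned int skb_slots;  /* entries passed to kcalloc() for the skb ring */
};

static const struct ring_cfg tx_cfg = { TX_BD_NUM_DEFAULT, TX_BD_NUM_MAX };
static const struct ring_cfg rx_cfg = { RX_BD_NUM_DEFAULT, RX_BUF_NUM_DEFAULT };

/* Signed difference: skb ring slots minus default descriptor count. */
static long ring_headroom(const struct ring_cfg *c)
{
	assert(c != NULL);
	return (long)c->skb_slots - (long)c->bd_num;
}
```

With the quoted values, ring_headroom(&tx_cfg) is 3968 (4096 slots for 128
descriptors), while ring_headroom(&rx_cfg) is -896 (128 slots for 1024
descriptors).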