Message-ID:
<BL3PR12MB6571A601D219953BE13E1CC7C9B52@BL3PR12MB6571.namprd12.prod.outlook.com>
Date: Tue, 8 Apr 2025 13:51:43 +0000
From: "Gupta, Suraj" <Suraj.Gupta2@....com>
To: Álvaro G. M. <alvaro.gamez@...ent.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>, Jakub Kicinski
<kuba@...nel.org>, "Pandey, Radhey Shyam" <radhey.shyam.pandey@....com>,
"Katakam, Harini" <harini.katakam@....com>
Subject: RE: Re: Issue with AMD Xilinx AXI Ethernet (xilinx_axienet) on
MicroBlaze: Packets only received after some buffer is full
> -----Original Message-----
> From: Álvaro G. M. <alvaro.gamez@...ent.com>
> Sent: Friday, April 4, 2025 11:08 AM
> To: netdev@...r.kernel.org; Jakub Kicinski <kuba@...nel.org>; Pandey, Radhey
> Shyam <radhey.shyam.pandey@....com>
> Subject: Fwd: Re: Issue with AMD Xilinx AXI Ethernet (xilinx_axienet) on
> MicroBlaze: Packets only received after some buffer is full
>
> Sorry, I'm resending this email in text-only mode; I sent the HTML version and it
> was rejected by the list.
>
> Hi Suraj,
>
> On Thu, 2025-04-03 at 13:58 +0000, Gupta, Suraj wrote:
> > >
> > > If I remove the "dmas" entry and provide an "axistream-connected" one,
> > > things get a little better (but see the DTS notes at the end).
> > > In this mode, dmaengine is not used and the legacy DMA code inside axienet
> > > itself takes over; tcpdump -vv shows packets arriving at a normal rate.
> > > However, the system does not answer ARP requests:
> > >
> > Could you please check ifconfig for any packet drops or errors?
>
> In all three cases (using the old DMA style, dmaengine with default values, and
> dmaengine with the buffer count set to 128) the behavior is the same:
>
> After a few UDP packets:
>
> eth0 Link encap:Ethernet HWaddr 06:00:0A:BC:8C:02
> inet addr:10.188.140.2 Bcast:10.188.143.255 Mask:255.255.248.0
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:213 errors:0 dropped:81 overruns:0 frame:0
> TX packets:17 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:95233 (93.0 KiB) TX bytes:738 (738.0 B)
>
> After manually adding ARP entries and a short iperf3 run:
>
> # iperf3 -c 10.188.139.1
> Connecting to host 10.188.139.1, port 5201
> [  5] local 10.188.140.2 port 54004 connected to 10.188.139.1 port 5201
> [ ID] Interval Transfer Bitrate Retr Cwnd
> [ 5] 0.00-1.01 sec 3.38 MBytes 28.2 Mbits/sec 0 133 KBytes
> [ 5] 1.01-2.00 sec 3.75 MBytes 31.5 Mbits/sec 0 133 KBytes
> [ 5] 2.00-3.01 sec 3.75 MBytes 31.4 Mbits/sec 0 133 KBytes
> [ 5] 3.01-4.01 sec 3.63 MBytes 30.4 Mbits/sec 0 133 KBytes
> [ 5] 4.01-5.00 sec 3.75 MBytes 31.6 Mbits/sec 0 133 KBytes
> [ 5] 5.00-6.00 sec 3.63 MBytes 30.4 Mbits/sec 0 133 KBytes
> [ 5] 6.00-7.00 sec 3.75 MBytes 31.5 Mbits/sec 0 133 KBytes
> [ 5] 7.00-8.01 sec 3.63 MBytes 30.2 Mbits/sec 0 133 KBytes
> [ 5] 8.01-9.01 sec 3.63 MBytes 30.4 Mbits/sec 0 133 KBytes
>
> ^C[ 5] 9.01-46.69 sec 4.50 MBytes 1.00 Mbits/sec 0 133 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval Transfer Bitrate Retr
> [ 5] 0.00-46.69 sec 37.5 MBytes 6.74 Mbits/sec 0 sender
> [ 5] 0.00-46.69 sec 0.00 Bytes 0.00 bits/sec receiver
> iperf3: interrupt - the client has terminated
>
> # ifconfig
> eth0 Link encap:Ethernet HWaddr 06:00:0A:BC:8C:02
> inet addr:10.188.140.2 Bcast:10.188.143.255 Mask:255.255.248.0
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:14121 errors:0 dropped:106 overruns:0 frame:0
> TX packets:27360 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:1015380 (991.5 KiB) TX bytes:41127297 (39.2 MiB)
>
> The ratio of dropped to received packets (81 of 213, roughly 38%, vs 106 of 14121,
> roughly 0.75%) doesn't seem proportional to the number of received packets.
>
> I've been able to gather that the dropped-packet counter increases by 1 with each
> arping that I send to the MicroBlaze, but *only* if tcpdump is *not* running.
>
> So I can run tcpdump -vv, send quite a lot of arpings, see them all on screen, and
> the dropped-packet count does not increase.
> Once I stop tcpdump, each single arping I send increases the tx dropped packet
> count by 1, every time.
>
Please check the MAC register dump to see whether packets are being dropped there.
>
> > > On the other hand, since I don't know how to debug this ARP issue, I went back
> > > to see whether I could diagnose what's happening in DMA Engine mode. I peeked
> > > at the code and saw an asymmetry between RX and TX, which sounded promising
> > > given that in dmaengine mode TX works perfectly (or so it seems) while RX is
> > > heavily buffered.
> > > This asymmetry lies precisely in the number of SG blocks and the number of skb
> > > buffers.
> > >
>
> > > I couldn't see what was wrong with the new code, so I simply replaced the
> > > RX_BD_NUM_DEFAULT value of 1024 with 128, so that it is now the same size as
> > > its TX counterpart, but the kernel crashed again when I tried to measure
> > > throughput. Sadly, my kernel-debugging abilities don't go much beyond this, so
> > > I'm stuck at this point, but I firmly believe there's something wrong here,
> > > although I can't see what it is.
> > >
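> > > For reference, the change I experimented with boils down to the ring-size
> > > defaults in drivers/net/ethernet/xilinx/xilinx_axienet.h; shown here only as a
> > > sketch of the experiment described above, not as a proposed patch:
> > >
> > > /* Descriptor ring defaults; RX_BD_NUM_DEFAULT lowered from 1024 to 128 so
> > >  * that it matches the TX default for this experiment.
> > >  */
> > > #define TX_BD_NUM_DEFAULT   128
> > > #define RX_BD_NUM_DEFAULT   128   /* was 1024 */
> > >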
> > > Any help will be greatly appreciated.
> > >
> > This doesn't look like the reason, since the driver doesn't use lp->rx_bd_num and
> > lp->tx_bd_num to traverse the skb ring in the DMAengine flow. It uses
> > axienet_get_rx_desc() and axienet_get_tx_desc() respectively, which wrap over the
> > same size that was allocated.
> > The only difference between working and non-working that I can see is the
> > increased Rx skb ring size. But you later mentioned that you tried to bring it
> > down to 128; could you please confirm that small transfers still work?
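> > To illustrate what I mean by "same size as allocated": the DMAengine flow wraps
> > its running ring index over the ring it allocated itself, roughly along these
> > lines (a simplified, self-contained sketch with made-up names, not the actual
> > driver code):
> >
> > #define RX_SKB_RING_SIZE 128                /* hypothetical allocated ring size */
> >
> > struct rx_slot { void *skb; };              /* stand-in for the real ring entry */
> > static struct rx_slot rx_skb_ring[RX_SKB_RING_SIZE];
> >
> > /* The index always wraps over the size the ring was allocated with, so
> >  * lp->rx_bd_num (and hence RX_BD_NUM_DEFAULT) does not affect this path.
> >  */
> > static struct rx_slot *get_rx_slot(unsigned int i)
> > {
> >         return &rx_skb_ring[i % RX_SKB_RING_SIZE];
> > }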
>
> Setting this number to a low value seems to solve the buffering issue, but not the
> missed ARP packets.
>
>
> > FYI, basic ping and iperf both work for us in the DMAengine flow on AXI Ethernet
> > 1G designs. We tested in full-duplex mode, but I can see half duplex in your
> > case; could you please confirm whether that is expected and correct?
>
> Our connection is via a fiber SFP (I should have mentioned that earlier, sorry) or
> a cabled SFP (which I am using right now), through a DP83620 PHY, which in this
> mode does not support autonegotiation. As far as I know it should still support
> full duplex; I'll check it out and get back to you.
>
> Now that you've pointed this out, I can tell you that kernel 4.4.43 reported full
> duplex, and kernel 6.13 reports full duplex only in dmaengine mode; in the old DMA
> mode it reports half duplex.
>
> Old kernel:
> # ethtool eth0
> Settings for eth0:
> Supported ports: [ TP MII ]
> Supported link modes: 10baseT/Half 10baseT/Full
> 100baseT/Half 100baseT/Full
> Supported pause frame use: No
> Supports auto-negotiation: Yes
> Advertised link modes: 100baseT/Full
> Advertised pause frame use: No
> Advertised auto-negotiation: No
> Speed: 100Mb/s
> Duplex: Full
> Port: MII
> PHYAD: 1
> Transceiver: external
> Auto-negotiation: off
> Link detected: yes
>
>
> Configuring network:
> xilinx_axienet 40c00000.ethernet eth0: PHY [axienet-40c00000:01] driver [TI DP83620 10/100 Mbps PHY] (irq=POLL)
> xilinx_axienet 40c00000.ethernet eth0: configuring for phy/mii link mode
> xilinx_axienet 40c00000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off
>
Strange, the duplex mode should be the same irrespective of the DMAengine or legacy flow.
Could you please confirm whether ethtool also shows half duplex with the legacy flow?
>
> Thanks, best regards,
>
> --
> Álvaro G. M.
>