Date: Thu, 27 Jun 2024 16:20:09 -0700
From: "Nelson, Shannon" <shannon.nelson@....com>
To: Николай Рыбалов
 <dairinin@...il.com>, netdev@...r.kernel.org
Subject: Re: mlx5_core xdp redirect errors

On 6/27/2024 2:07 AM, Николай Рыбалов wrote:
>
> Hello,
> 
> I have a setup with 32 CPUs and two mlx5 NICs, both running XDP
> programs, one of which redirects via a devmap to the other. This works
> fine until the following happens:
> 
> 1. Limit the number of queues on both NICs to 4 (< number of CPUs)
> 2. Pin the incoming interrupt to a CPU > 4 via irq_affinity
> 3. Observe redirect errors in the trace:
>            <idle>-0       [001] ..s1.  2010.232028: xdp_redirect:
> prog_id=58 action=REDIRECT ifindex=5 to_ifindex=4 err=0 map_id=44
> map_index=0
>            <idle>-0       [001] ..s1.  2010.232033: xdp_devmap_xmit:
> ndo_xdp_xmit from_ifindex=5 to_ifindex=4 action=REDIRECT sent=1
> drops=0 err=0
>            <idle>-0       [005] ..s1.  2010.232253: xdp_redirect:
> prog_id=56 action=REDIRECT ifindex=4 to_ifindex=5 err=0 map_id=44
> map_index=1
>            <idle>-0       [005] ..s1.  2010.232257: xdp_devmap_xmit:
> ndo_xdp_xmit from_ifindex=4 to_ifindex=5 action=REDIRECT sent=0
> drops=1 err=-6
> 
> This narrows down to the code in mlx5_xdp_xmit that selects the
> output queue by SMP CPU id: it fails on CPU 5 and succeeds on CPU 1.
> The scenario is not very exotic to me; at the least, there is a need
> to avoid running NIC interrupts on all the CPUs in the system, and
> not to be bound to the first N of them.
> Can this issue be solved in the driver, or should I start looking
> for a workaround on the userland side?
> 
> Best regards
> 

This is one of the weaknesses in the XDP_REDIRECT model: how to get a
packet sent from one interface to another within a safe context.
Assuming a queue is tied to a specific interrupt, an interrupt is tied
to a CPU, and both interfaces have the same queue/interrupt/CPU
mapping, the originating napi context can assume it is safe to drop
the packet into the target interface's queue without taking on any
time-consuming locking.  This is the model many of the drivers follow,
and if the queues/interrupts/CPUs don't match, the packet gets
dropped.  Other drivers try some combination of locking and queue
mapping / modulus operations to accept the packet into a non-matching
queue id, with varying degrees of success.
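
For illustration, the failing check has roughly the following shape (a
simplified sketch with placeholder names, not the verbatim mlx5 code).
Note that err=-6 in the trace above is -ENXIO, which is what this
style of check returns:

#include <linux/netdevice.h>	/* netdev_priv(), struct net_device */
#include <linux/smp.h>		/* smp_processor_id() */
#include <net/xdp.h>		/* struct xdp_frame */

/* Placeholder private struct -- the real driver keeps its queue state
 * elsewhere; this exists only to make the sketch hang together. */
struct sketch_priv {
	int num_xdp_queues;	/* e.g. 4 in the report above */
};

static int sketch_xdp_xmit(struct net_device *dev, int n,
			   struct xdp_frame **frames, u32 flags)
{
	struct sketch_priv *priv = netdev_priv(dev);
	int sq_num = smp_processor_id();

	/* Strict per-cpu model: each cpu owns at most one XDP TX queue,
	 * so the fast path needs no lock -- but a cpu with no queue of
	 * its own has nowhere safe to put the frame, hence the drop
	 * when the napi poll runs on cpu 5 with only 4 queues.
	 */
	if (sq_num >= priv->num_xdp_queues)
		return -ENXIO;

	/* ... post all n frames to queue sq_num, lock-free ... */
	return n;
}

/* The alternative some drivers use: fold the cpu id onto an existing
 * queue with a modulus and serialize with a per-queue lock, trading
 * the lockless fast path for coverage of all cpus:
 *
 *	sq_num = smp_processor_id() % priv->num_xdp_queues;
 *	spin_lock(&priv->xdp_sq[sq_num].lock);
 *	...
 */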

The best-case scenario seems to be making sure both interfaces are on
the same CPU/PCI complex, have the same number of queues, have those
queues mapped to the same interrupts/CPUs, and have irqbalance told to
leave them alone.
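
For concreteness, that setup might look something like this
(illustrative only: the interface names, IRQ numbers, and CPU masks
are assumptions and will differ per system; find the real IRQ numbers
in /proc/interrupts):

	# same queue count on both NICs
	ethtool -L eth0 combined 4
	ethtool -L eth1 combined 4

	# keep irqbalance from moving things around
	systemctl stop irqbalance

	# pin each NIC's queue interrupts to matching CPUs 0-3
	echo 1 > /proc/irq/120/smp_affinity   # eth0 q0 -> cpu 0
	echo 1 > /proc/irq/130/smp_affinity   # eth1 q0 -> cpu 0
	echo 2 > /proc/irq/121/smp_affinity   # eth0 q1 -> cpu 1
	echo 2 > /proc/irq/131/smp_affinity   # eth1 q1 -> cpu 1
	# ... masks 4 and 8 for queues 2 and 3

With that in place, any CPU that runs a receive napi poll also has a
matching XDP TX queue on the peer device, so the strict per-CPU check
sketched above never trips.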

I'm hoping there are others with more experience configuring these who
will have better answers, but this is what I've encountered in my
short history of playing with XDP.

Cheers,
sln

