Date:   Thu, 20 Oct 2022 10:23:57 +0800
From:   Heng Qi <hengqi@...ux.alibaba.com>
To:     Toke Høiland-Jørgensen <toke@...hat.com>,
        Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org
Cc:     "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: Re: [PATCH net] veth: Avoid drop packets when xdp_redirect performs



On 2022/9/29 8:08 PM, Toke Høiland-Jørgensen wrote:
> Heng Qi <hengqi@...ux.alibaba.com> writes:
>
>>>> As I said above, in the real-world case the user's concern is not why XDP
>>>> performance becomes bad, but why the packets are not received at all.
>>> Well, that arguably tells the end-user there is something wrong in
>>> their setup. On the flip side, having a functionally working setup with
>>> horrible performances would likely lead the users (perhaps not yours,
>>> surely others) in very wrong directions (from "XDP is slow" to "the
>>> problem is in the application")...
>>>
>>>> The default number of veth queues is not num_possible_cpus(). When GRO is enabled
>>>> by default, if there is only one veth queue but multiple CPUs read and write it at the
>>>> same time, NAPI efficiency is actually very low because of producer locking and races.
>>>> By contrast, in the default veth_xmit() path each CPU has its own queue, and that way
>>>> of sending and receiving packets is also efficient.
>>>>
>>> This patch adds a bit of complexity and it looks completely avoidable
>>> with some configuration - you could enable GRO and set the number of
>>> queues to num_possible_cpus().
>>>
>>> I agree with Toke: you should explain to the end-users that their
>>> expectations are wrong, and guide them towards a better setup.
>>>
>>> Thanks!
>> Well, one thing I want to know is: in the following scenario,
>>
>> NIC   ->   veth0----veth1
>>    |           |        |
>> (XDP)      (XDP)    (no XDP)
>>
>> xdp_redirect is triggered, and both the NIC and veth0 have XDP programs attached.
>> Why is our default behavior to drop the packets that should be delivered to veth1,
>> rather than enabling veth1's NAPI ring by default when veth0 attaches an XDP
>> program? Why must we instead configure a trivial XDP program on veth1?
> As I said in my initial reply, you don't actually need to load an XDP
> program (anymore), it's enough to enable GRO through ethtool on both
> peers. You can easily do this on setup if you know XDP is going to be
> used in your environment.
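
For reference, a minimal sketch of the setup Toke describes, assuming a pair
named veth0/veth1 (the names are illustrative) and root privileges:

```shell
# Create the veth pair and bring both ends up.
ip link add veth0 type veth peer name veth1
ip link set veth0 up
ip link set veth1 up

# Enable GRO on both peers so frames redirected into the pair are
# processed via the NAPI path instead of being dropped -- no XDP
# program needs to be attached to the peer for this to work.
ethtool -K veth0 gro on
ethtool -K veth1 gro on
```

This can be done once at environment setup time if XDP redirection into the
pair is anticipated.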

This does serve our purpose, but in practice users of a veth pair do not necessarily
understand how it works. Users unfamiliar with veth may not know that they need to
enable GRO on, or load an XDP program onto, the peer veth so that redirected packets
are received smoothly. Faced with this confusing problem, they may spend time reading
the source code, or even ask someone else to solve it, yet we could avoid all of this
with a simple modification (not necessarily the one made by this patch, which rolls
back to the backlog instead of the NAPI ring). For example, perhaps we should consider
a simpler approach: when an XDP program is loaded on a veth, automatically enable the
NAPI ring of the peer veth. That seems to have no performance or functional impact on
the veth pair, and it no longer requires users to do anything extra for the peer
(after all, they may be unaware of the peer's extra requirements). Do you think this
is feasible?

Thanks.

> -Toke
