Message-ID: <3927bf4a-0034-a985-6899-d50b7eb8a8a5@iogearbox.net>
Date: Fri, 7 Jan 2022 14:56:11 +0100
From: Daniel Borkmann <daniel@...earbox.net>
To: Toshiaki Makita <toshiaki.makita1@...il.com>
Cc: netdev@...r.kernel.org, laurent.bernaille@...adoghq.com,
	maciej.fijalkowski@...el.com, eric.dumazet@...il.com,
	pabeni@...hat.com, john.fastabend@...il.com, willemb@...gle.com,
	davem@...emloft.net, kuba@...nel.org, magnus.karlsson@...il.com
Subject: Re: [PATCH net-next] veth: Do not record rx queue hint in veth_xmit

On 1/6/22 1:57 PM, Toshiaki Makita wrote:
> On 2022/01/06 9:46, Daniel Borkmann wrote:
>
> Thank you for the fix.
>
>> (unless configured in non-default manner).
>
> Just curious, is there any special configuration to record rx queue
> index on veth after your patch?

Right now skb->queue_mapping represents the tx queue number. Assuming
dev->real_num_tx_queues == dev->real_num_rx_queues for the veth creation,
veth_xmit() picks the rx queue via rcv_priv->rq[rxq] based on the tx queue
number. By default veth is created with dev->real_num_tx_queues ==
dev->real_num_rx_queues == 1, in which case queue_mapping stays 0. Checking
with e.g. ...

  ip link add foo numtxqueues 64 numrxqueues 64 type veth peer numtxqueues 64 numrxqueues 64

... then the stack inside the netns picks a TX queue on the veth dev via
netdev_core_pick_tx(). Due to the skb_record_rx_queue() / skb_get_rx_queue()
pair, the queue index is off by one in generic XDP [in hostns] at
bpf_prog_run_generic_xdp() for the case where the veth has more than a
single queue. (In the single-queue case, netif_get_rxqueue() sees that
skb_rx_queue_recorded() is false and just picks the first queue, so at
least there it's correct.)

I'm not sure there is a good way to detect inside bpf_prog_run_generic_xdp()
that the skb was looped to the host from a netns and then fix up the offset.
Another option could be an extra device parameter so that
skb_record_rx_queue() is only done when the parameter is enabled, making it
an explicit opt-in where the admin needs to be aware of the potential
implications.

My worry is that if the skb ends up being pushed out the phys dev, then:

1) if the veth was provisioned with >1 TX queues, the admin must also take
   into account the TX queues on phys devices like ena, so that we're not
   under-utilizing them (or have something like BPF clear the queue mapping
   before the skb is forwarded), and

2) if we do skb_record_rx_queue() and, say, the TX queue numbers of the veth
   and the phys dev are provisioned to be the same, then we jump +1 on the
   phys dev due to skb_record_rx_queue(), which may be unintentional.

Hence maybe an explicit opt-in would be the better option?

Thanks,
Daniel
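
For illustration, a rough userspace sketch of the peer rx queue selection
described above, modeled loosely on drivers/net/veth.c; the types and the
pick_peer_rxq() helper here are simplified stand-ins, not the real kernel
code, and the fallback to queue 0 is a simplification for this sketch:

  #include <stdint.h>
  #include <stdio.h>

  struct veth_rq {
  	unsigned int id;	/* placeholder for per-queue state */
  };

  struct veth_priv {
  	struct veth_rq *rq;	/* peer's rx queue array */
  	unsigned int real_num_rx_queues;
  };

  /* Hypothetical helper: the sender's tx queue number indexes directly
   * into the peer's rq[] array, which is why Daniel assumes
   * real_num_tx_queues == real_num_rx_queues for the mapping to hold. */
  static struct veth_rq *pick_peer_rxq(struct veth_priv *rcv_priv, uint16_t txq)
  {
  	unsigned int rxq = txq < rcv_priv->real_num_rx_queues ? txq : 0;

  	return &rcv_priv->rq[rxq];
  }

  int main(void)
  {
  	struct veth_rq queues[64];
  	struct veth_priv peer = { .rq = queues, .real_num_rx_queues = 64 };

  	for (unsigned int i = 0; i < 64; i++)
  		queues[i].id = i;

  	/* tx queue 5 on one end lands in rx queue 5 on the peer */
  	printf("peer rxq: %u\n", pick_peer_rxq(&peer, 5)->id);
  	return 0;
  }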
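
The off-by-one mentioned above stems from the +1 encoding used by the skb
rx-queue helpers, where queue_mapping == 0 is reserved to mean "not
recorded". A minimal standalone sketch of that encoding: the three helpers
mirror include/linux/skbuff.h, while the sk_buff struct is a reduced
stand-in for this sketch only:

  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  struct sk_buff {
  	uint16_t queue_mapping;	/* 0 means "no rx queue recorded" */
  };

  static void skb_record_rx_queue(struct sk_buff *skb, uint16_t rx_queue)
  {
  	skb->queue_mapping = rx_queue + 1;	/* shift by one to reserve 0 */
  }

  static bool skb_rx_queue_recorded(const struct sk_buff *skb)
  {
  	return skb->queue_mapping != 0;
  }

  static uint16_t skb_get_rx_queue(const struct sk_buff *skb)
  {
  	return skb->queue_mapping - 1;	/* undo the +1 shift */
  }

  int main(void)
  {
  	struct sk_buff skb = { .queue_mapping = 0 };

  	/* Default single-queue veth: nothing recorded, so consumers like
  	 * netif_get_rxqueue() fall back to queue 0, which is correct. */
  	printf("recorded: %d\n", skb_rx_queue_recorded(&skb));

  	/* Multi-queue veth: the old veth_xmit() recorded the tx queue
  	 * number (say 3) as the rx queue, leaving queue_mapping at 4.
  	 * Any consumer reading queue_mapping as a plain tx queue number
  	 * from here on is off by one. */
  	skb_record_rx_queue(&skb, 3);
  	printf("queue_mapping=%u, skb_get_rx_queue()=%u\n",
  	       (unsigned int)skb.queue_mapping,
  	       (unsigned int)skb_get_rx_queue(&skb));
  	return 0;
  }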