Date: Mon, 10 Jan 2022 15:52:41 +0900
From: Toshiaki Makita <toshiaki.makita1@...il.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: netdev@...r.kernel.org, laurent.bernaille@...adoghq.com,
    maciej.fijalkowski@...el.com, eric.dumazet@...il.com, pabeni@...hat.com,
    john.fastabend@...il.com, willemb@...gle.com, davem@...emloft.net,
    kuba@...nel.org, magnus.karlsson@...il.com
Subject: Re: [PATCH net-next] veth: Do not record rx queue hint in veth_xmit

On 2022/01/07 22:56, Daniel Borkmann wrote:
> On 1/6/22 1:57 PM, Toshiaki Makita wrote:
>> On 2022/01/06 9:46, Daniel Borkmann wrote:
>>
>> Thank you for the fix.
>>
>>> (unless configured in non-default manner).
>>
>> Just curious: is there any special configuration to record the rx queue
>> index on veth after your patch?
>
> Right now skb->queue_mapping represents the tx queue number. So assuming
> dev->real_num_tx_queues == dev->real_num_rx_queues for the veth creation,
> veth_xmit() picks the rx queue via rcv_priv->rq[rxq] based on the tx queue
> number. By default, veth is created with dev->real_num_tx_queues ==
> dev->real_num_rx_queues == 1, in which case queue_mapping stays 0.
> Checking with e.g. ...
>
>   ip link add foo numtxqueues 64 numrxqueues 64 type veth peer numtxqueues 64 numrxqueues 64
>
> ... the stack inside the netns then picks a TX queue on the veth dev via
> netdev_core_pick_tx(). Due to the skb_record_rx_queue() / skb_get_rx_queue()
> pair, it is off by one in generic XDP [in hostns] at bpf_prog_run_generic_xdp()
> when the veth has more than a single queue. (In the single-queue case,
> netif_get_rxqueue() sees that skb_rx_queue_recorded() is false and just picks
> the first queue, so at least there it's correct.)
>
> I'm not sure there is a good way to detect inside bpf_prog_run_generic_xdp()
> that the skb was looped to the host from a netns and then fix up the offset.
> The other option could potentially be an extra device parameter: only when it
> is enabled would we do the skb_record_rx_queue(), so it's an explicit opt-in
> where the admin needs to be aware of the potential implications.
>
> My worry is that if the skb ends up being pushed out the phys dev, then 1)
> if veth was provisioned with >1 TX queues, the admin must also take into
> account the TX queues e.g. on phys devices like ena, so that we're not under-
> utilizing them (or have something like BPF clear the queue mapping before the
> skb is forwarded). And 2) if we do skb_record_rx_queue() and, say, the TX
> queue numbers of veth and the phys dev are provisioned to be the same, then
> we jump +1 on the phys dev because of skb_record_rx_queue(), which may be
> unintentional. Hence maybe an explicit opt-in would be the better option?

Thank you for the detailed explanation!

I agree with you that an opt-in would be better.
I thought you might mean there was already some way to do that, given the
phrase "configured in non-default manner", so I asked.
Now I understand we need an additional knob if we really want to extract the
veth rx queue number from bpf.

Toshiaki Makita
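As a side note for readers of this thread: the shift Daniel refers to with
skb_record_rx_queue() / skb_get_rx_queue() is that the recorded rx queue is
stored as queue_mapping = rx_queue + 1, so that 0 can mean "no rx queue
recorded", and readers subtract one again. The short userspace sketch below
only illustrates that convention and the queue-0 fallback used when nothing is
recorded; fake_skb, record_rx_queue, get_rx_queue and pick_rx_queue are
made-up names for illustration, not the kernel's actual structures or the
veth / generic-XDP code paths.

/* Minimal userspace sketch (not kernel code) of the queue_mapping
 * convention behind skb_record_rx_queue()/skb_get_rx_queue(): the
 * recorded rx queue is stored shifted by one so that 0 can mean
 * "not recorded", and readers undo the shift. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct fake_skb {
        uint16_t queue_mapping;         /* 0 means "not recorded" */
};

static void record_rx_queue(struct fake_skb *skb, uint16_t rx_queue)
{
        skb->queue_mapping = rx_queue + 1;
}

static bool rx_queue_recorded(const struct fake_skb *skb)
{
        return skb->queue_mapping != 0;
}

static uint16_t get_rx_queue(const struct fake_skb *skb)
{
        return skb->queue_mapping - 1;
}

/* Mimics the fallback in the generic-XDP rx queue lookup: with no
 * recorded queue (or an out-of-range one), queue 0 is used, which
 * is why the single-queue veth case stays correct. */
static uint16_t pick_rx_queue(const struct fake_skb *skb,
                              uint16_t real_num_rx_queues)
{
        if (rx_queue_recorded(skb)) {
                uint16_t index = get_rx_queue(skb);

                if (index < real_num_rx_queues)
                        return index;
        }
        return 0;
}

int main(void)
{
        struct fake_skb skb = { 0 };

        record_rx_queue(&skb, 3);       /* record rx queue 3 */
        printf("raw queue_mapping: %u\n", (unsigned)skb.queue_mapping);
        printf("rx queue read back: %u\n", (unsigned)pick_rx_queue(&skb, 64));

        skb.queue_mapping = 0;          /* nothing recorded */
        printf("fallback rx queue: %u\n", (unsigned)pick_rx_queue(&skb, 64));
        return 0;
}

Compiled with a plain C compiler, this prints a raw queue_mapping of 4 for a
recorded rx queue of 3: that one-off shift is what a consumer reading
queue_mapping directly as a tx queue number would observe.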