Message-ID: <66b62df442a85_3bec1229461@willemb.c.googlers.com.notmuch>
Date: Fri, 09 Aug 2024 10:55:48 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: ayaka <ayaka@...lik.info>, 
 Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: Jason Wang <jasowang@...hat.com>, 
 netdev@...r.kernel.org, 
 davem@...emloft.net, 
 edumazet@...gle.com, 
 kuba@...nel.org, 
 pabeni@...hat.com, 
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net: tuntap: add ioctl() TUNGETQUEUEINDX to fetch queue
 index

ayaka wrote:
> 
> Sent from my iPad

Try to avoid ^^^
 
> > On Aug 9, 2024, at 2:49 AM, Willem de Bruijn <willemdebruijn.kernel@...il.com> wrote:
> > 
> > 
> >> 
> >>> So I guess an application that owns all the queues could keep track of
> >>> the queue-id to FD mapping. But it is not trivial, nor defined ABI
> >>> behavior.
> >>> 
> >>> Querying the queue_id as in the proposed patch might not solve the
> >>> challenge, though. Since an FD's queue-id may change simply because
> >> Yes, when I asked about the eBPF approach, I thought I didn't need the queue id in the eBPF program. That turned out to be a misunderstanding.
> >> Do we all agree that no matter which filter or steering method we use here, we need a way to query the queue index assigned to an fd?
> > 
> > That depends on how you intend to use it, and in particular on how you
> > work around the issue of IDs not being stable. Without solving that, it
> > seems like an impractical and even dangerous (because easy to misuse)
> > interface.
> > 
> First of all, I need to figure out when the steering action happens.
> When I use the multiq qdisc with skbedit, does it happen after net_device_ops->ndo_select_queue()?
> If it does, that will still compute an unused rxhash and txhash and do unused flow tracking. That sounds like a big overhead.
> Is it the same path for the tc-bpf solution?

TC egress is called early in __dev_queue_xmit, the main entry point for
transmission, in sch_handle_egress.

A few lines below, netdev_core_pick_tx selects the txq by setting
skb->queue_mapping, either through netdev_pick_tx or through the
device-specific callback ndo_select_queue if it exists.

For tun, tun_select_queue implements that callback. If a program has
been configured with TUNSETSTEERINGEBPF, then that BPF program is
called. Otherwise it uses its own rxhash based approach in
tun_automq_select_queue.
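
(As an illustration of the userspace side of that, here is a minimal,
untested sketch: open one multi-queue tun queue and install a steering
program through TUNSETSTEERINGEBPF. The trivial program below just
returns 0, so everything lands on queue 0; a real one would compute a
queue index per packet. The interface name is made up, error handling
is omitted, and IIRC the expected program type is SOCKET_FILTER.)

  #include <fcntl.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <sys/syscall.h>
  #include <linux/bpf.h>
  #include <linux/if.h>
  #include <linux/if_tun.h>

  /* Load a trivial steering program: "return 0", i.e. always queue 0. */
  static int load_steering_prog(void)
  {
          struct bpf_insn insns[] = {
                  { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0 },
                  { .code = BPF_JMP | BPF_EXIT },
          };
          union bpf_attr attr;

          memset(&attr, 0, sizeof(attr));
          attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
          attr.insns = (unsigned long)insns;
          attr.insn_cnt = sizeof(insns) / sizeof(insns[0]);
          attr.license = (unsigned long)"GPL";

          return syscall(SYS_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
  }

  int main(void)
  {
          struct ifreq ifr;
          int fd, prog_fd;

          fd = open("/dev/net/tun", O_RDWR);

          memset(&ifr, 0, sizeof(ifr));
          ifr.ifr_flags = IFF_TUN | IFF_NO_PI | IFF_MULTI_QUEUE;
          strncpy(ifr.ifr_name, "tun0", IFNAMSIZ - 1);  /* made-up name */
          ioctl(fd, TUNSETIFF, &ifr);                   /* attach one queue */

          prog_fd = load_steering_prog();
          ioctl(fd, TUNSETSTEERINGEBPF, &prog_fd);      /* install steering prog */

          pause();                                      /* keep the queue alive */
          return 0;
  }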

There is a special case in between. If TC egress ran skbedit, then
this sets current->net_xmit.skip_txqueue, which makes the txq be read
from the skb->queue_mapping set by skbedit, skipping netdev_pick_tx.

That seems more roundabout than I had expected. I thought the code
would just check whether skb->queue_mapping is set and if so skip
netdev_pick_tx.

I wonder if this means that setting queue_mapping from any TC action
other than skbedit now gets ignored, notably cls_bpf or act_bpf.

> I will reply to your last question with my concern about the unstable IDs.
> >>> another queue was detached. So this would have to be queried on each
> >>> detach.
> >>> 
> >> Thank you Jason. That is why I mentioned I may need to submit another patch to bind the queue index to a flow.
> >> 
> >> I think this is a good chance to discuss it.
> >> From the design, the number of queues is a fixed number in hardware devices? The same goes for remote-processor type wireless devices (I think those are the modem devices).
> >> Computing a hash for every packet could consume a lot of CPU time, and it is not necessary to track every packet.
> > 
> > rxhash based steering is common. There needs to be a strong(er) reason
> > to implement an alternative.
> > 
> I have a few questions about this hash steering, which does not involve invoking any further filter:
> 1. If a flow happens before it is written to the tun, how to filter it?

What do you mean?

> 2. Does such a hash operation happen to every packet passing through?

For packets with a local socket, the computation is cached in the
socket.

For these tunnel packets, see tun_automq_select_queue. Specifically,
the call to __skb_get_hash_symmetric.

I'm actually not entirely sure why tun has this, rather than deferring
to netdev_pick_tx, which calls skb_tx_hash.
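
As an aside, the "symmetric" part just means that both directions of a
flow hash to the same value, so hash based steering keeps a flow (in
both directions) on one queue. A toy userspace illustration of that
idea, not the kernel's algorithm:

  #include <stdint.h>
  #include <stdio.h>

  /* XOR of the endpoints is order independent, so (src,dst) and
   * (dst,src) produce the same hash. */
  static uint32_t toy_symmetric_hash(uint32_t saddr, uint32_t daddr,
                                     uint16_t sport, uint16_t dport)
  {
          uint32_t h = (saddr ^ daddr) ^ (uint32_t)(sport ^ dport);

          return h * 2654435761u;         /* simple multiplicative mix */
  }

  int main(void)
  {
          unsigned int nqueues = 4;

          /* Both directions of the same flow land on the same queue. */
          printf("%u\n", toy_symmetric_hash(0x0a000001, 0x0a000002, 40000, 443) % nqueues);
          printf("%u\n", toy_symmetric_hash(0x0a000002, 0x0a000001, 443, 40000) % nqueues);
          return 0;
  }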

> 3. Is rxhash based on the flow tracking records in the tun driver?
> That CPU overhead may cancel out the benefit of the multiple queues and filters in the kernel solution.

Keyword is "may". Avoid premature optimization in favor of data.

> Also, the flow tracking table is limited to 4096 or 1024 entries. For an IPv4 /24 subnet, if everyone opens 16 websites, do we run out of memory before some entries expire?
> 
> I want to find out whether there is a modern way to implement a VPN in Linux, after so many features have been introduced to Linux. So far, I have not found a proper way to gain any advantage here over other platforms.
> >> Could I add another property to struct tun_file and have the steering program return the wanted value? Then it is the application's job to keep this new property unique.
> > 
> > I don't entirely follow this suggestion?
> > 
> >>> I suppose one underlying question is how important is the mapping of
> >>> flows to specific queue-id's? Is it a problem if the destination queue
> >>> for a flow changes mid-stream?
> >> Yes, it matters; that is why I want to use this feature. Of all the open source VPNs I know, none enables this multiqueue feature or creates more than one queue.
> >> And virtual machines use the tap side most of the time (they want to emulate a real nic).
> >> So basically this multiple queue feature has been kind of useless for VPN usage.
> >> If the filter cannot work atomically here, that would lead to unwanted packets being transmitted to the wrong thread.
> > 
> > What exactly is the issue if a flow migrates from one queue to
> > another? There may be some OOO arrival. But these configuration
> > changes are rare events.
> I don’t know what the OOO means here.

Out of order.

> If a flow migrates from its supposed queue to another, that defeats the point of using multiple queues here.
> A queue represents a VPN node here, so it would leak one node's data to another.
> Also, that data could be just garbage fragments, costing bandwidth to send to a peer that cannot handle it.

MultiQ is normally just a scalability optimization. It does not matter
for correctness, bar the possibility of brief packet reordering when a
flow switches queues.

I now get that what you are trying to do is set up a 1:1 relationship
between VPN connections and multi queue tun FDs. What would be reading
these FDs? If a single process, then it can definitely handle flow
migration.
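
For what it is worth, the queue side of such a 1:1 setup is simple:
each additional open of /dev/net/tun with TUNSETIFF on the same name
and IFF_MULTI_QUEUE attaches another queue, and each fd then only sees
the packets steered to its queue. A rough, untested sketch (made-up
name, no error handling):

  #include <fcntl.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/if.h>
  #include <linux/if_tun.h>

  #define NQUEUES 4

  int main(void)
  {
          struct ifreq ifr;
          int fds[NQUEUES];
          int i;

          memset(&ifr, 0, sizeof(ifr));
          ifr.ifr_flags = IFF_TUN | IFF_NO_PI | IFF_MULTI_QUEUE;
          strncpy(ifr.ifr_name, "tun0", IFNAMSIZ - 1);  /* made-up name */

          /* Each TUNSETIFF on the same name attaches one more queue. */
          for (i = 0; i < NQUEUES; i++) {
                  fds[i] = open("/dev/net/tun", O_RDWR);
                  ioctl(fds[i], TUNSETIFF, &ifr);
          }

          /* One reader per fd then handles the flows steered to that
           * queue; TUNSETQUEUE with IFF_DETACH_QUEUE/IFF_ATTACH_QUEUE
           * can park and resume individual queues later. */
          pause();
          return 0;
  }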

