Message-ID: <328c71e7-17c7-40f4-83b3-f0b8b40f4730@soulik.info>
Date: Fri, 2 Aug 2024 03:52:57 +0800
From: Randy Li <ayaka@...lik.info>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: netdev@...r.kernel.org, jasowang@...hat.com, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net: tuntap: add ioctl() TUNGETQUEUEINDX to fetch queue
index
On 2024/8/1 22:17, Willem de Bruijn wrote:
> Randy Li wrote:
>> On 2024/8/1 21:04, Willem de Bruijn wrote:
>>> Randy Li wrote:
>>>> On 2024/8/1 05:57, Willem de Bruijn wrote:
>>>>> nits:
>>>>>
>>>>> - INDX->INDEX. It's correct in the code
>>>>> - prefix networking patches with the target tree: PATCH net-next
>>>> I see.
>>>>> Randy Li wrote:
>>>>>> On 2024/7/31 22:12, Willem de Bruijn wrote:
>>>>>>> Randy Li wrote:
>>>>>>>> We need the queue index in qdisc mapping rule. There is no way to
>>>>>>>> fetch that.
>>>>>>> In which command exactly?
>>>>>> That is for sch_multiq, here is an example
>>>>>>
>>>>>> tc qdisc add dev tun0 root handle 1: multiq
>>>>>>
>>>>>> tc filter add dev tun0 parent 1: protocol ip prio 1 u32 match ip dst
>>>>>> 172.16.10.1 action skbedit queue_mapping 0
>>>>>> tc filter add dev tun0 parent 1: protocol ip prio 1 u32 match ip dst
>>>>>> 172.16.10.20 action skbedit queue_mapping 1
>>>>>>
>>>>>> tc filter add dev tun0 parent 1: protocol ip prio 1 u32 match ip dst
>>>>>> 172.16.10.10 action skbedit queue_mapping 2
>>>>> If using an IFF_MULTI_QUEUE tun device, packets are automatically
>>>>> load balanced across the multiple queues, in tun_select_queue.
>>>>>
>>>>> If you want more explicit queue selection than by rxhash, tun
>>>>> supports TUNSETSTEERINGEBPF.
>>>> I know about this eBPF option, but I am a newbie to eBPF and I haven't
>>>> figured out how to configure eBPF dynamically.
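
For reference, a minimal sketch of what a TUNSETSTEERINGEBPF steering
program could look like, assuming libbpf conventions and a layer-3 tun
device (so the frame starts at the IP header); the addresses and queue
numbers are placeholders mirroring the tc example above, not anything
taken from the patch:

/* steer.bpf.c - illustrative sketch only.
 * tun runs this as a BPF_PROG_TYPE_SOCKET_FILTER program; the return
 * value is used as the queue index (modulo the number of queues). */
#include <stddef.h>
#include <linux/bpf.h>
#include <linux/ip.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("socket")
int tun_steer(struct __sk_buff *skb)
{
	__u32 daddr = 0;

	/* layer-3 tun device: offset 0 is assumed to be the IP header */
	if (bpf_skb_load_bytes(skb, offsetof(struct iphdr, daddr),
			       &daddr, sizeof(daddr)))
		return 0;

	if (daddr == bpf_htonl(0xAC100A01))	/* 172.16.10.1  */
		return 0;
	if (daddr == bpf_htonl(0xAC100A14))	/* 172.16.10.20 */
		return 1;
	return 2;				/* everything else */
}

char _license[] SEC("license") = "GPL";

After loading the object, the program fd is attached with
ioctl(tun_fd, TUNSETSTEERINGEBPF, &prog_fd); a new program can be
attached the same way at runtime (or detached by passing -1), which is
one way to reconfigure the steering dynamically.
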
>>> Lack of experience with an existing interface is insufficient reason
>>> to introduce another interface, of course.
>> tc(8) is an old interface, but it doesn't have the information it needs
>> here to complete this work.
> tc is maintained.
>
>> I think eBPF doesn't work on all platforms? JIT doesn't sound like a
>> good solution for embedded platforms.
>>
>> Another problem here is that some VPS providers don't offer a kernel
>> new enough to support eBPF; it is far easier to just patch an old
>> kernel with this.
> We don't add duplicative features because they are easier to
> cherry-pick to old kernels.
I was trying to say that a tc(8) or netlink based solution sounds more
suitable for general deployment.
>> Anyway, I will look into it, though I will still send out a v2 of this
>> patch. I will figure out whether eBPF can solve all the problems here.
> Most importantly, why do you need a fixed mapping of IP address to
> queue? Can you explain why relying on the standard rx_hash based
> mapping is not sufficient for your workload?
Server
|
|------ tun subnet (e.g. 172.16.10.0/24) ------- peer A (172.16.10.1)
|------ peer B (172.16.10.3)
|------ peer C (172.16.10.20)
I am not even sure rx_hash could work here: the server acts as a router
or gateway, and I don't know how to filter the connections coming from
the external interface based on rx_hash. Besides, the VPN application
doesn't operate on the socket() itself.
I think the question here is really why I do the filtering in the kernel
rather than in userspace.

It is much easier to do the dispatch work in the kernel: I only need to
watch the established peers with the help of epoll(), and the kernel can
drop all the unwanted packets. Besides, if I did the filter/dispatch work
in userspace, the packet's data would have to be copied to userspace
first, even if its fate is decided by reading just a few bytes at the
beginning of the packet. I think we can avoid that cost.
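
A rough sketch of that userspace flow, assuming an IFF_MULTI_QUEUE tun
named tun0 and three peers each pinned to its own queue (error handling
omitted, purely illustrative):

/* Illustrative only: one queue fd per peer; epoll watches only the
 * queues of established peers, so other traffic never reaches us. */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/ioctl.h>
#include <linux/if.h>
#include <linux/if_tun.h>

static int open_tun_queue(const char *name)
{
	struct ifreq ifr;
	int fd = open("/dev/net/tun", O_RDWR);

	memset(&ifr, 0, sizeof(ifr));
	ifr.ifr_flags = IFF_TUN | IFF_NO_PI | IFF_MULTI_QUEUE;
	strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);
	ioctl(fd, TUNSETIFF, &ifr);	/* each call adds one queue */
	return fd;
}

int main(void)
{
	int epfd = epoll_create1(0);
	int q[3];

	for (int i = 0; i < 3; i++) {
		struct epoll_event ev = { .events = EPOLLIN };

		q[i] = open_tun_queue("tun0");	/* queue i <-> peer i */
		ev.data.fd = q[i];
		/* in practice, add the fd only once the peer is established */
		epoll_ctl(epfd, EPOLL_CTL_ADD, q[i], &ev);
	}
	/* ... epoll_wait() loop: read() only from the ready queues ... */
	return 0;
}
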