Message-ID: <24a12884-415c-43ce-8353-cc92af1e7aa1@linux.alibaba.com>
Date: Mon, 3 Jun 2024 23:07:49 +0800
From: "D. Wythe" <alibuda@...ux.alibaba.com>
To: Niklas Schnelle <schnelle@...ux.ibm.com>, kgraul@...ux.ibm.com,
wenjia@...ux.ibm.com, jaka@...ux.ibm.com, wintera@...ux.ibm.com,
guwen@...ux.alibaba.com
Cc: kuba@...nel.org, davem@...emloft.net, netdev@...r.kernel.org,
linux-s390@...r.kernel.org, linux-rdma@...r.kernel.org,
tonylu@...ux.alibaba.com, pabeni@...hat.com, edumazet@...gle.com
Subject: Re: [PATCH net-next v5 0/3] Introduce IPPROTO_SMC
On 6/3/24 3:48 PM, Niklas Schnelle wrote:
> On Thu, 2024-05-30 at 18:14 +0800, D. Wythe wrote:
>> On 5/30/24 5:30 PM, D. Wythe wrote:
>>> From: "D. Wythe" <alibuda@...ux.alibaba.com>
>>>
>>> This patch allows creating an SMC socket via AF_INET,
>>> as in the following code:
>>>
>>> /* create v4 smc sock */
>>> v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);
>>>
>>> /* create v6 smc sock */
>>> v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);
>> Everyone is welcome to try out the eBPF-based version of smc_run during
>> testing. I have added a separate command called smc_run.bpf;
>> it is equivalent to normal smc_run but uses IPPROTO_SMC via eBPF.
>>
>> You can obtain the code and more info from:
>> https://github.com/D-Wythe/smc-tools
>>
>> Usage:
>>
>> smc_run.bpf
>> An eBPF implemented smc_run based on IPPROTO_SMC:
>>
>> 1. Supports transparent replacement based on command (just like smc_run).
>> 2. Supports transparent replacement based on pid configuration, with
>> this capability inherited between parent and child
>> processes.
>> 3. Supports transparent replacement based on per-netns configuration.
>>
>> smc_run.bpf COMMAND
>>
>> 1. Equivalent to smc_run but with IPPROTO_SMC via eBPF
>>
>> smc_run.bpf -p pid
>>
>> 1. Add the process with target pid to the map. Afterward, all socket()
>> calls of the process and its descendant processes will be replaced from
>> IPPROTO_TCP to IPPROTO_SMC.
>> 2. Mapping will be automatically deleted when process exits.
>> 3. Specifically, COMMAND mode actually works like the following:
>>
>> smc_run.bpf -p $$
>> COMMAND
>> exit
>>
>> smc_run.bpf -n 1
>>
>> 1. Make all socket() calls in the current netns use IPPROTO_SMC
>> instead of IPPROTO_TCP.
>> 2. Turn it off with smc_run.bpf -n 0
>>
>>
> Hi D. Wythe,
>
> I gave this series plus your smc_run.bpf and SMC_LO based SMC-D a test
> run on my Ryzen 3900X workstation and I have to say I'm quite
> impressed. I first tried the SMC_LO feature as merged in v6.10-rc1 with
> the classic LD_PRELOAD based smc_run and iperf3, and qperf …
> tcp_bw/tcp_lat both with normal localhost and between docker
> containers. For this to work I of course had to initially set my UEID,
> as x86_64, unlike s390x, doesn't get an SEID set. I used the following
> script for this.
>
>
> #!/usr/bin/sh
> machine_id_upper=$(cat /etc/machine-id | tr '[:lower:]' '[:upper:]')
> machine_id_suffix=$(echo "$machine_id_upper" | head -c 27)
> ueid="MID-$machine_id_suffix"
> smcd ueid add "$ueid"
>
>
> The performance is pretty impressive:
> * iperf3 with 12 parallel connections (matching core count) results in
> ~152 Gbit/s on normal loopback and ~312 Gbit/s with SMC_LO.
> * qperf … tcp_bw (single thread) results in ~46 Gbit/s on normal loopback
> and ~58 Gbit/s with SMC_LO
> * qperf … tcp_lat latency test results in 5-9 us with normal loopback
> and around 3-4 us with SMC_LO
>
> Then I applied this series on top of v6.10-rc1 and tried it with your
> smc_run.bpf. The performance is of course in-line with the above but
> thanks to being able to enable SMC on a per-netns basis I was able to
> try a few more things. First I tried just enabling it in my default
> netns and verified that after restarting sshd new ssh connections to
> localhost used SMC-D through SMC_LO. Then I started Chrome and
> confirmed that its TCP connections also registered with SMC and
> successfully fell back to TCP mode. I had no trouble with normal
> browsing though I guess especially Google stuff often uses HTTP/3 so
> isn't affected. Still nice to see I didn't get breakage.
>
> Secondly I tried smc_run.bpf with docker containers using the following
> trick:
>
> docker inspect --format '{{.State.Pid}}' <my_container_name>
> 34651
> nsenter -t 34651 -n smc_run.bpf -n 1
>
> Sadly this only works for commands started in the container after
> loading the BPF. So I wonder if you know of a good way to either
> automatically execute smc_run.bpf on container start or maybe use it on
> the docker daemon such that all namespaces created by docker get the
> IPPROTO_SMC override. I'd then definitely consider using SMC-D with
> SMC_LO between my home lab containers even if just for bragging rights
> ;-)
>
> Feel free to add for the IPPROTO_SMC series:
>
> Tested-by: Niklas Schnelle <schnelle@...ux.ibm.com>
>
> Thanks,
> Niklas

Hi Niklas,

Thanks very much for your testing.

Regarding your question, have you tried starting the container
using 'smc_run.bpf docker'?
smc_run.bpf allows the replacement capability to be inherited by
descendant processes, which might meet your needs.
Note, however, that the scope would then no longer be limited
to the netns.
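
To illustrate what the inheritance means, here is a minimal sketch of COMMAND mode written out by hand, following the "smc_run.bpf -p $$; COMMAND; exit" description quoted above. The function name smc_run_inherited is hypothetical and only for illustration; it assumes smc_run.bpf is on PATH.

```shell
# Run a command so that it and its descendants get the IPPROTO_SMC override.
# Usage: smc_run_inherited COMMAND [ARGS...]
# (smc_run_inherited is a hypothetical helper name, not part of smc-tools.)
smc_run_inherited() {
    # Start a fresh shell, register that shell's own PID ($$ is expanded
    # inside the child because of the single quotes), then exec the command
    # so it keeps the registered PID.
    sh -c 'smc_run.bpf -p "$$" && exec "$@"' sh "$@"
}
```

Something like "smc_run_inherited docker ..." would then correspond to 'smc_run.bpf docker ...'. Whether the container processes actually end up as descendants of that command depends on how docker starts them, so please treat this as something to try rather than a guarantee.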

If you don't want to replace the docker command and would like to keep
it per netns, there are indeed some tricky ways. For example,
we could check the current process name when a new netns is created to
decide whether to add it to the eBPF map,
but I don't think it's appropriate to include this in smc_run.bpf.

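
If you stay with the per-netns route, the manual steps from your mail (docker inspect, then nsenter) can at least be wrapped in a small helper. This is only a sketch under assumptions: the function name smc_enable_netns is hypothetical, and it needs the same privileges as running nsenter by hand.

```shell
# Enable the IPPROTO_SMC override inside a running container's netns.
# Usage: smc_enable_netns <container_name>
# (smc_enable_netns is a hypothetical helper name, not part of smc-tools.)
smc_enable_netns() {
    container="$1"
    # PID of the container's main process, as reported by docker
    pid=$(docker inspect --format '{{.State.Pid}}' "$container") || return 1
    # A container that is not running reports PID 0: no netns to enter
    if [ -z "$pid" ] || [ "$pid" -eq 0 ]; then
        echo "container '$container' is not running" >&2
        return 1
    fi
    # Run smc_run.bpf inside that process's network namespace
    nsenter -t "$pid" -n smc_run.bpf -n 1
}
```

As you observed, this still only affects commands started in the container after the BPF program is loaded, so it does not remove the need for a hook at container start.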
Best wishes,
D. Wythe