Message-ID: <20241101105253.GG101007@linux.alibaba.com>
Date: Fri, 1 Nov 2024 18:52:53 +0800
From: Dust Li <dust.li@...ux.alibaba.com>
To: liqiang <liqiang64@...wei.com>, wenjia@...ux.ibm.com,
	jaka@...ux.ibm.com, alibuda@...ux.alibaba.com,
	tonylu@...ux.alibaba.com, guwen@...ux.alibaba.com
Cc: linux-s390@...r.kernel.org, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, luanjianhai@...wei.com,
	zhangxuzhou4@...wei.com, dengguangxing@...wei.com,
	gaochao24@...wei.com, kuba@...nel.org
Subject: Re: [PATCH net-next] net/smc: Optimize the search method of reused
 buf_desc

On 2024-11-01 16:23:42, liqiang wrote:
>We create a lock-less linked list for the currently
>idle, reusable smc_buf_desc.
>
>When the 'used' field is set to 0, the buf_desc is
>added to the lock-less linked list.
>
>When a new connection is established, a suitable
>element is obtained directly from the list, which
>eliminates the traversal and search and does not
>require taking a lock.
>
>A lock-less linked list is a linked list that uses 
>atomic operations to optimize the producer-consumer model.
>
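For reference, a minimal sketch of what such a lock-less free list
could look like, assuming the kernel's llist API; the free_node
member and the helper names below are illustrative, not taken from
the patch:

```
#include <linux/llist.h>

/* lock-less list of idle, reusable buffer descriptors */
static LLIST_HEAD(smc_free_bufs);

struct smc_buf_desc {
	u32			used;		/* existing field: 1 while in use */
	struct llist_node	free_node;	/* illustrative: link into free list */
	/* ... */
};

/* producer side: a connection releases its buffer */
static void smc_buf_put_free(struct smc_buf_desc *desc)
{
	WRITE_ONCE(desc->used, 0);
	llist_add(&desc->free_node, &smc_free_bufs);	/* atomic push */
}

/* consumer side: a new connection grabs an idle buffer in O(1) */
static struct smc_buf_desc *smc_buf_get_free(void)
{
	/*
	 * llist_del_first() is safe against concurrent llist_add()
	 * producers, but multiple consumers calling it concurrently
	 * still need external serialization.
	 */
	struct llist_node *node = llist_del_first(&smc_free_bufs);

	if (!node)
		return NULL;	/* no idle buf_desc, allocate a new one */
	return llist_entry(node, struct smc_buf_desc, free_node);
}
```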
>I didn't find a suitable public benchmark, so I compared
>the time spent in this function under multiple connections
>using redis-benchmark (tested in SMC loopback-ism mode):

I think you can run a wrk/nginx test with short-lived connections.
For example:

```
# client
wrk -H "Connection: close" http://$serverIp

# server
nginx
```
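
With "Connection: close", every request sets up and tears down its
own connection, so the buf_desc lookup/allocation path is exercised
per request instead of once per keep-alive connection.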

>
>    1. On the current version:
>        [x.832733] smc_buf_get_slot cost:602 ns, walk 10 buf_descs
>        [x.832860] smc_buf_get_slot cost:329 ns, walk 12 buf_descs
>        [x.832999] smc_buf_get_slot cost:479 ns, walk 17 buf_descs
>        [x.833157] smc_buf_get_slot cost:679 ns, walk 13 buf_descs
>        ...
>        [x.045240] smc_buf_get_slot cost:5528 ns, walk 196 buf_descs
>        [x.045389] smc_buf_get_slot cost:4721 ns, walk 197 buf_descs
>        [x.045537] smc_buf_get_slot cost:4075 ns, walk 198 buf_descs
>        [x.046010] smc_buf_get_slot cost:6476 ns, walk 199 buf_descs
>
>    2. Apply this patch:
>        [x.180857] smc_buf_get_slot_free cost:75 ns
>        [x.181001] smc_buf_get_slot_free cost:147 ns
>        [x.181128] smc_buf_get_slot_free cost:97 ns
>        [x.181282] smc_buf_get_slot_free cost:132 ns
>        [x.181451] smc_buf_get_slot_free cost:74 ns
>
>It can be seen from the data that traversing ~200 buf_descs
>takes about 5~6 us, while the lock-less list algorithm is O(1).
>
>And my test process is only single-threaded. If multiple threads
>establish SMC connections in parallel, locks will also become a
>bottleneck, and the lock-less list can solve this problem well.
>
>So I guess this patch should be beneficial in scenarios where a
>large number of short-lived connections run in parallel?

Based on your data, I'm afraid the short-lived connection
test won't show much benefit, since the time to complete an
SMC-R connection setup should be several orders of magnitude
larger than 100 ns.

Best regards,
Dust

