netdev - Re: [RFC net-next 0/2] Optimize the parallelism of SMC-R connections


Message-ID: <c0ba8e0b-f2b2-b65b-e21a-54c3d920ba72@linux.ibm.com>
Date: Thu, 21 Sep 2023 14:36:36 +0200
From: Alexandra Winter <wintera@...ux.ibm.com>
To: "D. Wythe" <alibuda@...ux.alibaba.com>, kgraul@...ux.ibm.com,
        wenjia@...ux.ibm.com, jaka@...ux.ibm.com
Cc: kuba@...nel.org, davem@...emloft.net, netdev@...r.kernel.org,
        linux-s390@...r.kernel.org, linux-rdma@...r.kernel.org
Subject: Re: [RFC net-next 0/2] Optimize the parallelism of SMC-R connections



On 18.09.23 05:58, D. Wythe wrote:
> Hi Alexandra,
> 
> Sorry for the late reply. I have been thinking about the question you mentioned for a while, and this is a great opportunity to discuss this issue.
> My point is that the purpose of the locks is to minimize the expansion of the number of link groups as much as possible.
> 
> As we all know, the SMC-R protocol has the following specifications:
> 
>  * A SMC-R connection MUST be mapped into one link group.
>  * A link group is usually created by a connection, which is also known
>    as "First Contact."
> 
> If we start from scratch, we can design the connection process as follows:
> 
> 1. Check if there are any available link groups. If so, map the
>    connection into it and go to step 3.
> 2. Mark this connection as "First Contact," create a link group, and
>    mark the new link group as unavailable.
> 3. Finish connection establishment.
> 4. If the connection is "First Contact," mark the new link group as
>    available and map the connection into it.
> 
> I think there is no logical problem with this process, but there is a practical issue where burst traffic can result in burst link groups.
> 
> For example, if there are 10,000 incoming connections, based on the above logic, the most extreme scenario would be to create 10,000 link groups.
> This can cause significant memory pressure and even be used for security attacks.
> 
> To address this issue, the simplest way is to make each connection process mutually exclusive, using the following process:
> 
> 1. Block other incoming connections.
> 2. Check if there are any available link groups. If so, map the
>    connection into it and go to step 4.
> 3. Mark this connection as "First Contact," create a link group, and
>    mark it as unavailable.
> 4. Finish connection establishment.
> 5. If the connection is "First Contact," mark the new link group as
>    available and map the connection into it.
> 6. Allow other connections to come in.
> 
> And this is our current process now!
> 
> Regarding the purpose of the locks: it is to minimize the expansion of the number of link groups. If we agree on this point, we can observe that
> when phase 2 goes directly to phase 4, the process never creates a new link group. Obviously, the lock is not needed here.

Well, you still have the issue of a link group going away: thread 1 is deleting the last connection from a link group and shutting it down, while thread 2 is adding a 'second' connection (from its point of view) to the same link group.
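
One common way to close this kind of race without a global lock is a "get unless zero" reference count: once the count has dropped to zero, nobody can join the group anymore and the late thread has to go the First Contact route. The following is only a userspace sketch with C11 atomics to illustrate the idea. struct lgr, lgr_get_unless_zero() and lgr_put() are hypothetical names, not the existing smc code; in the kernel, refcount_inc_not_zero() provides the same semantics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a link group; not the real struct smc_link_group. */
struct lgr {
	atomic_int conns;	/* number of connections mapped to this group */
};

/* Try to add a connection. Fails once the count has reached zero,
 * because another thread may already be shutting the group down. */
static bool lgr_get_unless_zero(struct lgr *g)
{
	int old = atomic_load(&g->conns);

	while (old > 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&g->conns, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a connection. Returns true for the caller that removed the last
 * one; only that caller may tear the group down. */
static bool lgr_put(struct lgr *g)
{
	int old = atomic_fetch_sub(&g->conns, 1);

	assert(old > 0);
	return old == 1;
}
```

With this, thread 2 either gets a valid reference before thread 1 drops the last one, or it sees the failure and creates a new link group. It never attaches to a group that is already being shut down.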

> 
> Then the last question: why is the lock needed until after smc_clc_send_confirm in the new-LGR case? We can try to move phase 6 ahead as follows:
> 
> 1. Block other incoming connections.
> 2. Check if there are any available link groups. If so, map the
>    connection into it and go to step 4.
> 3. Mark this connection as "First Contact," create a link group, and
>    mark it as unavailable.
> 4. Allow other connections to come in.
> 5. Finish connection establishment.
> 6. If the connection is "First Contact," mark the new link group as
>    available and map the connection into it.
> 
> There is also no problem with this process! However, note that this logic does not address burst issues.
> Burst traffic will still result in burst link groups because a new link group can only be marked as available when the "First Contact" is completed,
> which is after sending the CLC Confirm.
> 
> Hope my point is helpful to you. If you have any questions, please let me know. Thanks.
> 
> Best wishes,
> D. Wythe

You are asking exactly the right questions here. Creation of new connections is on the critical path,
and if the design can be optimized for parallelism, that will increase performance, while insufficient
locking will create nasty bugs.
Many programmers have dealt with these issues before us. I would recommend consulting existing proven
patterns, e.g. the ones listed in Paul McKenney's book
(https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/perfbook/),
e.g. 'Chapter 10.3 Read-Mostly Data Structures', and of course the kernel documentation folder.
Improving an existing codebase like smc without breaking it is not trivial. Obviously, a step-by-step approach
works best. So try to identify actions that can be done under a smaller (as in more granular) lock
instead of under a global lock, or change a mutex into an R/W lock or RCU.
Smaller changes are easier to review (and bisect in case of regressions).