netdev - Re: [PATCH net-next v6 05/12] net: homa: create homa_rpc.h and homa

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <9209dfbb-ca3a-4fb7-a2fb-0567394f8cda@redhat.com>
Date: Tue, 28 Jan 2025 09:19:54 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: John Ousterhout <ouster@...stanford.edu>
Cc: netdev@...r.kernel.org, edumazet@...gle.com, horms@...nel.org,
 kuba@...nel.org
Subject: Re: [PATCH net-next v6 05/12] net: homa: create homa_rpc.h and
 homa_rpc.c

On 1/27/25 7:03 PM, John Ousterhout wrote:
> On Mon, Jan 27, 2025 at 2:02 AM Paolo Abeni <pabeni@...hat.com> wrote:
>> On 1/27/25 6:22 AM, John Ousterhout wrote:
>>> On Thu, Jan 23, 2025 at 6:30 AM Paolo Abeni <pabeni@...hat.com> wrote:
>>>> ...
>>>> How many RPCs should concurrently exist in a real server? with 1024
>>>> buckets there could be a lot of them on each/some list and linear search
>>>> could be very expansive. And this happens with BH disabled.
>>>
>>> Server RPCs tend to be short-lived, so my best guess is that the
>>> number of concurrent server RPCs will be relatively small (maybe a few
>>> hundred?). But this is just a guess: I won't know for sure until I can
>>> measure Homa in production use. If the number of concurrent RPCs turns
>>> out to be huge then we'll have to find a different solution.
>>>
>>>>> +
>>>>> +     /* Initialize fields that don't require the socket lock. */
>>>>> +     srpc = kmalloc(sizeof(*srpc), GFP_ATOMIC);
>>>>
>>>> You could do the allocation outside the bucket lock, too and avoid the
>>>> ATOMIC flag.
>>>
>>> In many cases this function will return an existing RPC so there won't
>>> be any need to allocate; I wouldn't want to pay the allocation
>>> overhead in that case. I could conceivably check the offset in the
>>> packet and pre-allocate if the offset is zero (in this case it's
>>> highly unlikely that there will be an existing RPC).
>>
>> If you use RCU properly here, you could do a lockless lookup. If such
>> lookup fail, you could do the allocation still outside the lock and
>> avoiding it in most of cases.
> 
> I think that might work, but it would suffer from the slow reclamation
> problem I mentioned with RCU. It would also create more complexity in
> the code (e.g. the allocation might still turn out to be redundant, so
> there would need to be additional code to check for that: the lookup
> would essentially have to be done twice in the case of creating a new
> RPC). I'd rather not incur this complexity until there's evidence that
> GFP_ATOMIC is causing problems.

Have a look at tcp established socket lookup and the
SLAB_TYPESAFE_BY_RCU flag usage for slab-based allocations. A combo of
such flag for RPC allocation (using a dedicated kmem_cache) and RCU
lookup should improve consistently the performances, with a consolidate
code layout and no unmanageable problems with large number of objects
waiting for the grace period.

>>> Homa needs to handle a very high rate of RPCs, so this would result in
>>> too much accumulated memory  (in particular, skbs don't get reclaimed
>>> until the RPC is reclaimed).
>>
>> For the RPC struct, that above is a fair point, but why skbs need to be
>> freed together with the RCP struct? if you have skbs i.e. sitting in a
>> RX queue, you can flush such queue when the RPC goes out of scope,
>> without any additional delay.
> 
> Reclaiming the skbs inline would be too expensive; 

Why? For other protocols the main skb free cost is due to memory
accounting, that homa is currently not implementing, so I don't see why
it should be critically expansive at this point (note that homa should
performat least rmem/wmem accounting, but let put this aside for a
moment). Could you please elaborate on this topic?

>>> The caller must have a lock on the homa_rpc anyway, so RCU wouldn't
>>> save the overhead of acquiring a lock. The reason for putting the lock
>>> in the hash table instead of the homa_rpc is that this makes RPC
>>> creation/deletion atomic with respect to lookups. The lock was
>>> initially in the homa_rpc, but that led to complex races with hash
>>> table insertion/deletion. This is explained in sync.txt, but of course
>>> you don't have that (yet).
>>
>> The per bucket RPC lock is prone to contention, a per RPC lock will
>> avoid such problem.
> 
> There are a lot of buckets (1024); this was done intentionally to
> reduce the likelihood of contention between different RPCs  trying to
> acquire the same bucket lock. 

1024 does not look to big for internet standard, but I must admit the
usage pattern is not 110% clear to me.

[...]
> Note that the bucket locks would be needed even with RCU usage, in
> order to permit concurrent RPC creation in different buckets. Thus
> Homa's locking scheme doesn't introduce additional locks; it
> eliminates locks that would otherwise be needed on individual RPCs and
> uses the bucket locks for 2 purposes.

It depends on the relative frequency of RPC lookup vs RPC
insertion/deletion. i.e. for TCP connections the lookup frequency is
expected to be significantly higher than the socket creation and
destruction.

I understand the expected patter in quite different with homa RPC? If so
you should at least consider a dedicated kmem_cache for such structs.

/P