Message-ID: <2ace650b-5697-4fc4-91f9-4857fa64feea@redhat.com>
Date: Mon, 27 Jan 2025 11:01:55 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: John Ousterhout <ouster@...stanford.edu>
Cc: netdev@...r.kernel.org, edumazet@...gle.com, horms@...nel.org,
kuba@...nel.org
Subject: Re: [PATCH net-next v6 05/12] net: homa: create homa_rpc.h and
homa_rpc.c
On 1/27/25 6:22 AM, John Ousterhout wrote:
> On Thu, Jan 23, 2025 at 6:30 AM Paolo Abeni <pabeni@...hat.com> wrote:
>> ...
>> How many RPCs are expected to exist concurrently on a real server? With
>> 1024 buckets there could be a lot of them on each/some list, and a linear
>> search could be very expensive. And this happens with BH disabled.
>
> Server RPCs tend to be short-lived, so my best guess is that the
> number of concurrent server RPCs will be relatively small (maybe a few
> hundred?). But this is just a guess: I won't know for sure until I can
> measure Homa in production use. If the number of concurrent RPCs turns
> out to be huge then we'll have to find a different solution.
>
>>> +
>>> + /* Initialize fields that don't require the socket lock. */
>>> + srpc = kmalloc(sizeof(*srpc), GFP_ATOMIC);
>>
>> You could do the allocation outside the bucket lock too, and avoid the
>> ATOMIC flag.
>
> In many cases this function will return an existing RPC so there won't
> be any need to allocate; I wouldn't want to pay the allocation
> overhead in that case. I could conceivably check the offset in the
> packet and pre-allocate if the offset is zero (in this case it's
> highly unlikely that there will be an existing RPC).
If you use RCU properly here, you could do a lockless lookup. If such a
lookup fails, you could still do the allocation outside the bucket lock,
and avoid it entirely in most cases.
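Something along these lines (just a rough sketch; the struct and field
names below are made up, I'm skipping the extra initialization the real
code does, and if this path still runs in softirq context the allocation
would still need GFP_ATOMIC - the point is that it now happens only on a
lookup miss and outside the bucket lock):

static struct homa_rpc *homa_server_rpc_get(struct homa_rpc_bucket *bucket,
                                            __u64 id)
{
        struct homa_rpc *rpc, *new_rpc;

        /* Fast path: lockless lookup under RCU, no bucket lock at all. */
        rcu_read_lock();
        hlist_for_each_entry_rcu(rpc, &bucket->rpcs, hash_links) {
                if (rpc->id == id && refcount_inc_not_zero(&rpc->refs)) {
                        rcu_read_unlock();
                        return rpc;
                }
        }
        rcu_read_unlock();

        /* Miss: allocate outside the bucket lock. */
        new_rpc = kzalloc(sizeof(*new_rpc), GFP_ATOMIC);
        if (!new_rpc)
                return NULL;
        new_rpc->id = id;
        refcount_set(&new_rpc->refs, 1);

        /* Recheck under the bucket lock in case another CPU raced with us. */
        spin_lock_bh(&bucket->lock);
        hlist_for_each_entry(rpc, &bucket->rpcs, hash_links) {
                if (rpc->id == id && refcount_inc_not_zero(&rpc->refs)) {
                        spin_unlock_bh(&bucket->lock);
                        kfree(new_rpc);
                        return rpc;
                }
        }
        hlist_add_head_rcu(&new_rpc->hash_links, &bucket->rpcs);
        spin_unlock_bh(&bucket->lock);
        return new_rpc;
}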
>>> +/**
>>> + * homa_find_client_rpc() - Locate client-side information about the RPC that
>>> + * a packet belongs to, if there is any. Thread-safe without socket lock.
>>> + * @hsk: Socket via which packet was received.
>>> + * @id: Unique identifier for the RPC.
>>> + *
>>> + * Return: A pointer to the homa_rpc for this id, or NULL if none.
>>> + * The RPC will be locked; the caller must eventually unlock it
>>> + * by invoking homa_rpc_unlock.
>>
>> Why are you using this locking scheme? It looks like it adds quite a
>> bit of complexity. The usual way of handling this kind of hash lookup
>> is to do the lookup locklessly, under RCU, and possibly add a refcnt
>> to the looked-up entity - homa_rpc - to ensure it will not go away
>> under the hood after the lookup.
>
> I considered using RCU for this, but the time period for RCU
> reclamation is too long (10's - 100's of ms, if I recall correctly).
The RCU grace period usually extends over a kernel jiffy (1-10 ms
depending on your kernel build options).
> Homa needs to handle a very high rate of RPCs, so this would result in
> too much accumulated memory (in particular, skbs don't get reclaimed
> until the RPC is reclaimed).
For the RPC struct, the above is a fair point, but why do the skbs need
to be freed together with the RPC struct? If you have skbs sitting in
e.g. an RX queue, you can flush that queue when the RPC goes out of
scope, without any additional delay.
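E.g. something like this (just a sketch, with invented field names),
where the queued skbs go away immediately and only the small homa_rpc
struct itself waits for the grace period:

static void homa_rpc_destroy(struct homa_rpc *rpc)
{
        /* Unlink first so no new lookup can find the RPC. */
        spin_lock_bh(&rpc->bucket->lock);
        hlist_del_rcu(&rpc->hash_links);
        spin_unlock_bh(&rpc->bucket->lock);

        /* Free the queued packets right away, no RCU delay involved. */
        skb_queue_purge(&rpc->incoming);

        /* Only the struct itself is deferred until readers are done. */
        kfree_rcu(rpc, rcu);
}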
> The caller must have a lock on the homa_rpc anyway, so RCU wouldn't
> save the overhead of acquiring a lock. The reason for putting the lock
> in the hash table instead of the homa_rpc is that this makes RPC
> creation/deletion atomic with respect to lookups. The lock was
> initially in the homa_rpc, but that led to complex races with hash
> table insertion/deletion. This is explained in sync.txt, but of course
> you don't have that (yet).
The per-bucket RPC lock is prone to contention; a per-RPC lock would
avoid that problem.
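I mean something like the following for homa_find_client_rpc() (again
just a sketch; homa_client_rpc_bucket() and the field names stand in for
whatever the real code uses). The bucket lock is then only needed for
insert/delete, so two lookups hitting different RPCs in the same bucket
don't contend:

static struct homa_rpc *homa_find_client_rpc(struct homa_sock *hsk, __u64 id)
{
        struct homa_rpc_bucket *bucket = homa_client_rpc_bucket(hsk, id);
        struct homa_rpc *rpc;

        rcu_read_lock();
        hlist_for_each_entry_rcu(rpc, &bucket->rpcs, hash_links) {
                if (rpc->id != id)
                        continue;
                if (!refcount_inc_not_zero(&rpc->refs))
                        break;          /* being destroyed, treat as missing */
                rcu_read_unlock();
                spin_lock_bh(&rpc->lock);       /* per-RPC lock */
                return rpc;     /* caller unlocks and drops the reference */
        }
        rcu_read_unlock();
        return NULL;
}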
> This approach is unusual, but it has worked out really well. Before
> implementing this approach I had what seemed like a never-ending
> stream of synchronization problems over the socket hash tables; each
> "fix" introduced new problems. Once I implemented this, all the
> problems went away and the code has been very stable ever since
> (several years now).
Have you tried running a fuzzer on this code? I bet syzkaller would turn
up a lot of interesting results if you teach it about the Homa APIs.
/P