[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a9a50b48-d85f-4465-a7b0-dec8b3f49281@blackwall.org>
Date: Thu, 30 May 2024 20:27:20 +0300
From: Nikolay Aleksandrov <razor@...ckwall.org>
To: Petr Machata <petrm@...dia.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org,
Ido Schimmel <idosch@...dia.com>, David Ahern <dsahern@...nel.org>
Subject: Re: [PATCH net-next 0/4] Allow configuration of multipath hash seed
On 5/30/24 18:25, Petr Machata wrote:
>
> Nikolay Aleksandrov <razor@...ckwall.org> writes:
>
>> I think that using memory management for such simple task is an
>> overkill. Would it be simpler to define 2 x 4 byte seed variables
>> in netns_ipv4 (e.g. user_seed, mp_seed). One is set only by the
>> user through the sysctl, which would also set mp_seed. Then you
>> can use mp_seed in the fast-path to construct that siphash key.
>> If the user_seed is set to 0 then you reset to some static init
>> hash value that is generated on init_net's creation. The idea
>
> Currently the flow dissector siphash key is initialized lazily so that
> the pool of random bytes is full when the initialization is done:
>
> https://lore.kernel.org/all/20131023180527.GC2403@order.stressinduktion.org
> https://lore.kernel.org/all/20131023111219.GA31531@order.stressinduktion.org
>
> I'm not sure how important that is -- the mailing does not really
> discuss much in the way of rationale, and admits it's not critical. But
> initializing the seed during net init would undo that. At the same time,
> initializing it lazily would be a bit of a mess, as we would have to
> possibly retroactively update mp_hash as well, which would be racy vs.
> user_hash update unless locked. So dunno.
If you want to keep the late init, untested:
init_mp_seed_once() -> DO_ONCE(func (net_get_random_once(&init_mp_seed),
memcpy(&init_net.mp_seed, &init_mp_seed))
>
> If you are OK with giving up on the siphash key "quality", I'm fine with
> this.
>
IMO that's fine, early init of the seed wouldn't be a problem. net's
hash_mix is already initialized early.
> Alternatively I can keep the dispatch in like it currently is. I.e.:
>
> if (user_seed) {
> sip_hash = construct(user_seed)
> return flow_hash_from_keys_seed(sip_hash)
> } else {
> return flow_hash_from_keys()
> }
>
> I wanted to have the flow dispatcher hash init early as well, as it made
> the code branch-free like you note below, but then Ido dug out that
+1
> there are $reasons for how it's currently done.
> >> is to avoid leaking that initial seed, to have the same seed
>> for all netns (known behaviour), be able to recognize when a
>> seed was set and if the user sets a seed then overwrite it for
>> that ns, but to be able to reset it as well.
>> Since 32 bits are enough I don't see why we should be using
>> the flow hash seed, note that init_net's initialization already
>
> No deep reason in using the dissector hash as far as I'm concerned.
> I just didn't want to change things arbitrarily, so kept the current
> behavior except where I needed it to change.
>
>> uses get_random_bytes() for hashes. This seems like a simpler
>> scheme that doesn't require memory management for a 32 bit seed.
>> Also it has the benefit that it will remove the test when generating
>> a hash because in the initial/non-user-set case we just have the
>> initial seed in mp_seed which is used to generate the siphash key,
>> i.e. we always use that internal seed for the hash, regardless if
>> it was set by the user or it's the initial seed.
>>
>> That's just one suggestion, if you decide to use more memory you
>> can keep the whole key in netns_ipv4 instead, the point is I don't
>> think we need memory management for this value.
>
> I kept the RCU stuff in because it makes it easy to precompute the
> siphash key while allowing readers to access it lock-free. I could
> inline it and guard with a seqlock instead, but that's a bit messier
> code-wise. Or indeed construct in-situ, it's an atomic access plus like
> four instructions or something like that.
You can READ/WRITE_ONCE() the full 8 bytes every time so it's lock-free
and consistent view of both values for observers. For fast-path it'll
only be accessing one of the two values, so it's fine either way. You
can use barriers to ensure latest value is seen by interested readers,
but for most eventual consistency would be enough.
Powered by blists - more mailing lists