[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9A3D7F61-23CC-4A2F-9022-EAA17ECFA690@amazon.com>
Date: Sat, 22 Jul 2023 03:17:53 +0000
From: "Smith, Stewart" <trawets@...zon.com>
To: "Iwashima, Kuniyuki" <kuniyu@...zon.co.jp>
CC: "aksecurity@...il.com" <aksecurity@...il.com>, "Herrenschmidt, Benjamin"
<benh@...zon.com>, "David S . Miller" <davem@...emloft.net>, David Ahern
<dsahern@...nel.org>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
<kuba@...nel.org>, "kuni1840@...il.com" <kuni1840@...il.com>, "Iwashima,
Kuniyuki" <kuniyu@...zon.co.jp>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>, Paolo Abeni <pabeni@...hat.com>, "Mendoza-Jonas,
Samuel" <samjonas@...zon.com>
Subject: Re: [PATCH v2 net] tcp: Reduce chance of collisions in
inet6_hashfn().
On Jul 21, 2023, at 6:39 PM, Iwashima, Kuniyuki <kuniyu@...zon.co.jp> wrote:
>
> From: Amit Klein <aksecurity@...il.com>
> Date: Sat, 22 Jul 2023 03:07:49 +0300
>> Resending because some recipients require plaintext email. Sorry.
>> Original message:
>>
>> This is certainly better than the original code.
>
> Thanks for reviewing!
>
>
>>
>> Two remarks:
>>
>> 1. In general, using SipHash is more secure, even if only for its
>> longer key (128 bits, cf. jhash's 32 bits), which renders simple
>> enumeration attacks infeasible. I understand that in a different
>> context, switching from jhash to siphash incurred some overhead, but
>> maybe here it won't.
>
> I see. Stewart tested hsiphash and observed more overhead as
> noted in the changelog, but let me give another shot to SipHash
> and HSiphash.
When originally looking at what to do for the collisions, it did seem like siphash would be the better hash, but I did have some concern around what the real-world performance impact could be, and wanted to have something that would alleviate the issue at hand that nobody could even *possibly* remotely contemplate complaining was a problem to apply the patch to their systems because of “performance”.. which was why keeping jhash but with modifications was where we started.
Two things about siphash/hsiphash that I had open questions about:
- is the extra overhead of the hash calculation going to be noticeable at all in any regular workload
- all the jhash stuff gets inlined nicely and the compiler does wonderful things, but it’s possible that siphash will end up always with a jump, which would add more to the extra overhead (and that llvm-mca doesn’t really model well, so it was harder to prove in sim). Again, not quite sure the real world impact to real world workloads.
>
> I'll report back here next week.
>
>
>>
>> 2. Taking a more holistic approach to __ipv6_addr_jhash(), I surveyed
>> where and how it's used. In most cases, it is used for hashing
>> together the IPv6 local address, foreign address and optionally some
>> more data (e.g. layer 4 protocol number, layer 4 ports).
>> Security-wise, it makes more sense to hash all data together once, and
>> not piecewise as it's done today (i.e. today it's
>> jhash(....,jhash(faddr),...), which cases the faddr into 32 bits,
>> whereas the recommended way is to hash all items in their entirety,
>> i.e. jhash(...,faddr,...)).
>
> Agree.
When looking at this originally, I was trying to work out what was intentional for some reason I wasn’t smart enough to understand and what had just grown over the past (eep.. nearly 3 decades I think) and could do with a bit of a rethink.
>> This requires scrutinizing all 6
>> invocations of __ipv6_addr_jhash() one by one and modifying the
>> calling code accordingly.
>
> At a glance, only rds_conn_bucket() seems a little bit tricky
> as it uses v4 hash function later.
>
> But I'll take a deeper look.
Powered by blists - more mailing lists