[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <d0e93a4ec0594c38817ea9ad075aa122@AcuMS.aculab.com>
Date: Wed, 26 Jul 2023 15:42:47 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Paolo Abeni' <pabeni@...hat.com>, "'willemdebruijn.kernel@...il.com'"
<willemdebruijn.kernel@...il.com>, "'davem@...emloft.net'"
<davem@...emloft.net>, "'dsahern@...nel.org'" <dsahern@...nel.org>, "'Eric
Dumazet'" <edumazet@...gle.com>, "'kuba@...nel.org'" <kuba@...nel.org>,
"'netdev@...r.kernel.org'" <netdev@...r.kernel.org>
Subject: RE: [PATCH 1/2] Move hash calculation inside udp4_lib_lookup2()
From: Paolo Abeni
> Sent: 26 July 2023 14:44
>
> On Wed, 2023-07-26 at 12:05 +0000, David Laight wrote:
> > Pass the udptable address into udp4_lib_lookup2() instead of the hash slot.
> >
> > While ipv4_portaddr_hash(net, IP_ADDR_ANY, 0) is constant for each net
> > (the port is an xor) the value isn't saved.
> > Since the hash function doesn't get simplified when passed zero the hash
>
> Are you sure? could you please objdump and compare the binary code
> generated before and after the patch? In theory all the callers up to
> __jhash_final() included should be inlined, and the compiler should be
> able to optimze at least rol32(0, <n>).
From an oldish build I have lurking:
0000000000003960 <__udp4_lib_lookup>:
3960: e8 00 00 00 00 callq 3965 <__udp4_lib_lookup+0x5>
3961: R_X86_64_PC32 __fentry__-0x4
3965: 48 83 ec 48 sub $0x48,%rsp
3969: 66 41 c1 c0 08 rol $0x8,%r8w
396e: 48 89 5c 24 18 mov %rbx,0x18(%rsp)
3973: 48 89 6c 24 20 mov %rbp,0x20(%rsp)
3978: 89 f5 mov %esi,%ebp
397a: 4c 89 64 24 28 mov %r12,0x28(%rsp)
397f: 4c 89 6c 24 30 mov %r13,0x30(%rsp)
3984: 44 0f b7 e2 movzwl %dx,%r12d
3988: 4c 89 74 24 38 mov %r14,0x38(%rsp)
398d: 4c 89 7c 24 40 mov %r15,0x40(%rsp)
3992: 49 89 fe mov %rdi,%r14
3995: 8b bf 40 01 00 00 mov 0x140(%rdi),%edi
399b: 45 0f b7 f8 movzwl %r8w,%r15d
399f: 48 8b 5c 24 58 mov 0x58(%rsp),%rbx
39a4: 48 8b 54 24 60 mov 0x60(%rsp),%rdx
39a9: 45 89 cd mov %r9d,%r13d
39ac: 81 ef 0d 41 52 21 sub $0x2152410d,%edi
39b2: 89 fe mov %edi,%esi
39b4: 8d 04 39 lea (%rcx,%rdi,1),%eax
39b7: 48 89 54 24 10 mov %rdx,0x10(%rsp)
39bc: c1 c6 0e rol $0xe,%esi
39bf: 44 89 e2 mov %r12d,%edx
39c2: f7 de neg %esi
39c4: 41 89 f0 mov %esi,%r8d
39c7: 31 f0 xor %esi,%eax
39c9: 41 c1 c0 0b rol $0xb,%r8d
39cd: 44 29 c0 sub %r8d,%eax
39d0: 41 89 c0 mov %eax,%r8d
39d3: 31 c7 xor %eax,%edi
39d5: 41 c1 c0 19 rol $0x19,%r8d
39d9: 44 29 c7 sub %r8d,%edi
39dc: 41 89 f8 mov %edi,%r8d
39df: 31 fe xor %edi,%esi
39e1: 41 c1 c0 10 rol $0x10,%r8d
39e5: 44 29 c6 sub %r8d,%esi
39e8: 41 89 f0 mov %esi,%r8d
39eb: 31 f0 xor %esi,%eax
39ed: 41 c1 c0 04 rol $0x4,%r8d
39f1: 44 29 c0 sub %r8d,%eax
39f4: 45 89 f8 mov %r15d,%r8d
39f7: 31 c7 xor %eax,%edi
39f9: c1 c0 0e rol $0xe,%eax
39fc: 29 c7 sub %eax,%edi
39fe: 89 f8 mov %edi,%eax
3a00: c1 c7 18 rol $0x18,%edi
3a03: 31 f0 xor %esi,%eax
3a05: 89 ee mov %ebp,%esi
3a07: 29 f8 sub %edi,%eax
3a09: 4c 89 f7 mov %r14,%rdi
3a0c: 44 31 f8 xor %r15d,%eax
3a0f: 23 43 10 and 0x10(%rbx),%eax
3a12: 48 c1 e0 04 shl $0x4,%rax
3a16: 48 03 43 08 add 0x8(%rbx),%rax
3a1a: 48 89 44 24 08 mov %rax,0x8(%rsp)
3a1f: 8b 44 24 50 mov 0x50(%rsp),%eax
3a23: 89 04 24 mov %eax,(%rsp)
3a26: e8 05 f7 ff ff callq 3130 <udp4_lib_lookup2>
3a2b: 48 85 c0 test %rax,%rax
3a2e: 74 32 je 3a62 <__udp4_lib_lookup+0x102>
3a30: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
3a36: ba 00 00 00 00 mov $0x0,%edx
3a3b: 48 8b 5c 24 18 mov 0x18(%rsp),%rbx
3a40: 48 0f 43 c2 cmovae %rdx,%rax
3a44: 48 8b 6c 24 20 mov 0x20(%rsp),%rbp
3a49: 4c 8b 64 24 28 mov 0x28(%rsp),%r12
3a4e: 4c 8b 6c 24 30 mov 0x30(%rsp),%r13
3a53: 4c 8b 74 24 38 mov 0x38(%rsp),%r14
3a58: 4c 8b 7c 24 40 mov 0x40(%rsp),%r15
3a5d: 48 83 c4 48 add $0x48,%rsp
3a61: c3 retq
3a62: 41 8b 96 40 01 00 00 mov 0x140(%r14),%edx
3a69: 45 89 e9 mov %r13d,%r9d
3a6c: 45 89 f8 mov %r15d,%r8d
3a6f: 4c 89 f7 mov %r14,%rdi
3a72: 81 ea 0d 41 52 21 sub $0x2152410d,%edx
3a78: 89 d0 mov %edx,%eax
3a7a: c1 c0 0e rol $0xe,%eax
3a7d: f7 d8 neg %eax
3a7f: 89 c1 mov %eax,%ecx
3a81: 89 c6 mov %eax,%esi
3a83: 31 d1 xor %edx,%ecx
3a85: c1 c6 0b rol $0xb,%esi
3a88: 29 f1 sub %esi,%ecx
3a8a: 89 ce mov %ecx,%esi
3a8c: 31 ca xor %ecx,%edx
3a8e: c1 c6 19 rol $0x19,%esi
3a91: 29 f2 sub %esi,%edx
3a93: 89 d6 mov %edx,%esi
3a95: 31 d0 xor %edx,%eax
3a97: c1 c6 10 rol $0x10,%esi
3a9a: 29 f0 sub %esi,%eax
3a9c: 89 c6 mov %eax,%esi
3a9e: 31 c1 xor %eax,%ecx
3aa0: c1 c6 04 rol $0x4,%esi
3aa3: 29 f1 sub %esi,%ecx
3aa5: 89 ee mov %ebp,%esi
3aa7: 31 ca xor %ecx,%edx
3aa9: c1 c1 0e rol $0xe,%ecx
3aac: 29 ca sub %ecx,%edx
3aae: 48 8b 4c 24 60 mov 0x60(%rsp),%rcx
3ab3: 31 d0 xor %edx,%eax
3ab5: c1 c2 18 rol $0x18,%edx
3ab8: 29 d0 sub %edx,%eax
3aba: 44 89 e2 mov %r12d,%edx
3abd: 48 89 4c 24 10 mov %rcx,0x10(%rsp)
3ac2: 44 31 f8 xor %r15d,%eax
3ac5: 23 43 10 and 0x10(%rbx),%eax
3ac8: 31 c9 xor %ecx,%ecx
3aca: 48 c1 e0 04 shl $0x4,%rax
3ace: 48 03 43 08 add 0x8(%rbx),%rax
3ad2: 48 89 44 24 08 mov %rax,0x8(%rsp)
3ad7: 8b 44 24 50 mov 0x50(%rsp),%eax
3adb: 89 04 24 mov %eax,(%rsp)
3ade: e8 4d f6 ff ff callq 3130 <udp4_lib_lookup2>
3ae3: e9 48 ff ff ff jmpq 3a30 <__udp4_lib_lookup+0xd0>
The two copies of the hash are pretty much the same.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Powered by blists - more mailing lists