lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20220520001834.2247810-1-kuba@kernel.org>
Date:   Thu, 19 May 2022 17:18:32 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     davem@...emloft.net
Cc:     netdev@...r.kernel.org, edumazet@...gle.com, pabeni@...hat.com,
        Jakub Kicinski <kuba@...nel.org>
Subject: [PATCH net-next v5 resend 0/2] Add a bhash2 table hashed by port + address

Joanne Koong says:

This patchset proposes adding a bhash2 table that hashes by port and address.
The motivation behind bhash2 is to expedite bind requests in situations where
the port has many sockets in its bhash table entry, which makes checking bind
conflicts costly especially given that we acquire the table entry spinlock
while doing so, which can cause softirq cpu lockups and can prevent new tcp
connections.

We ran into this problem at Meta where the traffic team binds a large number
of IPs to port 443 and the bind() call took a significant amount of time
which led to cpu softirq lockups, which caused packet drops and other failures
on the machine

The patches are as follows:
1/2 - Adds a second bhash table (bhash2) hashed by port and address
2/2 - Adds a test for timing how long an additional bind request takes when
the bhash entry is populated

When experimentally testing this on a local server for ~24k sockets bound to
the port, the results seen were:

ipv4:
before - 0.002317 seconds
with bhash2 - 0.000018 seconds

ipv6:
before - 0.002431 seconds
with bhash2 - 0.000021 seconds

v4 -> v5:
v4:
https://lore.kernel.org/netdev/20220512205041.1208962-1-joannelkoong@gmail.com/

* Remove "inline" in check_bind2_bucket_match function
* Add line breaks to limit lengths to 80 columns
* Rebase to master and fix merge conflict

v3 -> v4:
v3:
https://lore.kernel.org/netdev/20220511000424.2223932-1-joannelkoong@gmail.com/

* Fix the fix for the dccp bhash2 allocation

v2 -> v3:
v2:
https://lore.kernel.org/netdev/20220510005316.3967597-1-joannelkoong@gmail.com/

* Fix bhash2 allocation error handling for dccp
* Rebase onto net-next/master

v1 -> v2:
v1:
https://lore.kernel.org/netdev/20220421221449.1817041-1-joannelkoong@gmail.com/

* Attached test for timing bind request

Joanne Koong (2):
  net: Add a second bind table hashed by port and address
  selftests: Add test for timing a bind request to a port with a
    populated bhash entry

 include/net/inet_connection_sock.h            |   3 +
 include/net/inet_hashtables.h                 |  68 ++++-
 include/net/sock.h                            |  14 +
 net/dccp/proto.c                              |  33 ++-
 net/ipv4/inet_connection_sock.c               | 247 +++++++++++++-----
 net/ipv4/inet_hashtables.c                    | 193 +++++++++++++-
 net/ipv4/tcp.c                                |  14 +-
 tools/testing/selftests/net/.gitignore        |   1 +
 tools/testing/selftests/net/Makefile          |   2 +
 tools/testing/selftests/net/bind_bhash_test.c | 119 +++++++++
 10 files changed, 611 insertions(+), 83 deletions(-)
 create mode 100644 tools/testing/selftests/net/bind_bhash_test.c

-- 
2.34.3

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ