netdev - Re: [PATCH net-next 0/4] bridge: improve cache utilization

Message-ID: <20170204134626.311e3742@xeon-e3>
Date:   Sat, 4 Feb 2017 13:46:26 -0800
From:   Stephen Hemminger <stephen@...workplumber.org>
To:     Nikolay Aleksandrov <nikolay@...ulusnetworks.com>
Cc:     netdev@...r.kernel.org, roopa@...ulusnetworks.com,
        davem@...emloft.net, bridge@...ts.linux-foundation.org
Subject: Re: [PATCH net-next 0/4] bridge: improve cache utilization

On Sat,  4 Feb 2017 18:05:05 +0100
Nikolay Aleksandrov <nikolay@...ulusnetworks.com> wrote:

> Hi all,
> This is the first set which begins to deal with the bad bridge cache
> access patterns. The first patch rearranges the bridge and port structs
> a little so the frequently (and closely) accessed members are in the same
> cache line. The second patch then moves the garbage collection to a
> workqueue trying to improve system responsiveness under load (many fdbs)
> and more importantly removes the need to check if the matched entry is
> expired in __br_fdb_get which was a major source of false-sharing.
> The third patch is a preparation for the final one, which limits writes to
> the fdb "used" and "updated" fields to at most once per jiffy.
> If properly configured, i.e. ports bound to CPUs (thus updating "updated"
> locally) then the bridge's HitM goes from 100% to 0%, but even without
> binding we get a win because previously every lookup that iterated over
> the hash chain caused false-sharing due to the first cache line being
> used for both mac/vid and used/updated fields.
> 
> Some results from tests I've run:
> (note that these were run in good conditions for the baseline, everything
>  ran on a single NUMA node and there were only 3 fdbs)
> 
> 1. baseline
> 100% Load HitM on the fdbs (between everyone who has done lookups and hit
>                             one of the 3 hash chains of the communicating
>                             src/dst fdbs)
> Overall 5.06% Load HitM for the bridge, first place in the list
> 
> 2. patched & ports bound to CPUs
> 0% Local load HitM, bridge is not even in the c2c report list
> Also there's 3% consistent improvement in netperf tests.

What tool are you using to measure this?

> 
> Thanks,
>  Nik
> 
> Nikolay Aleksandrov (4):
>   bridge: modify bridge and port to have often accessed fields in one
>     cache line
>   bridge: move to workqueue gc
>   bridge: move write-heavy fdb members in their own cache line
>   bridge: fdb: write to used and updated at most once per jiffy
> 
>  net/bridge/br_device.c    |  1 +
>  net/bridge/br_fdb.c       | 34 +++++++++++++++++-----------
>  net/bridge/br_if.c        |  2 +-
>  net/bridge/br_input.c     |  3 ++-
>  net/bridge/br_ioctl.c     |  2 +-
>  net/bridge/br_netlink.c   |  2 +-
>  net/bridge/br_private.h   | 57 +++++++++++++++++++++++------------------------
>  net/bridge/br_stp.c       |  2 +-
>  net/bridge/br_stp_if.c    |  4 ++--
>  net/bridge/br_stp_timer.c |  2 --
>  net/bridge/br_sysfs_br.c  |  2 +-
>  11 files changed, 59 insertions(+), 52 deletions(-)

Looks good, thanks. I wonder whether this impacts smaller workloads.

Reviewed-by: Stephen Hemminger <stephen@...workplumber.org>
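
For readers following the thread, here is a minimal userspace sketch of the two
ideas the cover letter describes: keeping the lookup key (mac/vid) and the
write-heavy timestamps on separate cache lines, and skipping the store when the
timestamp has not changed. This is not the code from the patches. The struct
name, the field layout, the helper touch_updated() and the 64-byte line size
are all assumptions made for illustration; in the kernel the same effect would
more likely come from ____cacheline_aligned_in_smp and a comparison against
jiffies.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache line size, not taken from the patches */

/* Hypothetical stand-in for a bridge fdb entry. */
struct fdb_entry {
	/* Read-mostly lookup key: read by every walk over the hash chain. */
	uint8_t  mac[6];
	uint16_t vid;

	/*
	 * Write-heavy timestamps start on their own cache line, so stores
	 * to them do not invalidate the line that holds the key.
	 */
	alignas(CACHE_LINE) unsigned long updated;
	unsigned long used;
};

/* Compile-time check that the timestamps really begin a new cache line. */
static_assert(offsetof(struct fdb_entry, updated) % CACHE_LINE == 0,
	      "write-heavy fields must start on a cache line boundary");

/*
 * Store the timestamp only if it differs from the current value, so a
 * burst of packets within one tick causes a single write.
 * Returns 1 if a store happened, 0 if it was skipped.
 */
static int touch_updated(struct fdb_entry *f, unsigned long now)
{
	if (f->updated == now)
		return 0;
	f->updated = now;
	return 1;
}
```

Calling touch_updated() repeatedly with the same tick value writes once and
then only reads, which is what keeps the cache line in a shared state across
CPUs between ticks.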