Message-ID: <20170204134626.311e3742@xeon-e3>
Date: Sat, 4 Feb 2017 13:46:26 -0800
From: Stephen Hemminger <stephen@...workplumber.org>
To: Nikolay Aleksandrov <nikolay@...ulusnetworks.com>
Cc: netdev@...r.kernel.org, roopa@...ulusnetworks.com,
davem@...emloft.net, bridge@...ts.linux-foundation.org
Subject: Re: [PATCH net-next 0/4] bridge: improve cache utilization
On Sat, 4 Feb 2017 18:05:05 +0100
Nikolay Aleksandrov <nikolay@...ulusnetworks.com> wrote:
> Hi all,
> This is the first set which begins to deal with the bad bridge cache
> access patterns. The first patch rearranges the bridge and port structs
> a little so the frequently (and closely) accessed members are in the same
> cache line. The second patch then moves the garbage collection to a
> workqueue trying to improve system responsiveness under load (many fdbs)
> and more importantly removes the need to check if the matched entry is
> expired in __br_fdb_get which was a major source of false-sharing.
> The third patch is a preparation for the final one, which limits writes to
> the fdb "used" and "updated" fields to at most once per jiffy.
> If properly configured, i.e. ports bound to CPUs (thus updating "updated"
> locally) then the bridge's HitM goes from 100% to 0%, but even without
> binding we get a win because previously every lookup that iterated over
> the hash chain caused false-sharing due to the first cache line being
> used for both mac/vid and used/updated fields.
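
For anyone following along, the layout idea described above can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: `struct fdb_sketch`, `CACHE_LINE`, and `same_line()` are hypothetical names, the 64-byte line size is assumed (the kernel uses `L1_CACHE_BYTES` and annotations such as `____cacheline_aligned_in_smp`), and the field types are simplified stand-ins.

```c
#include <assert.h>    /* static_assert */
#include <stdalign.h>  /* alignas */
#include <stddef.h>    /* offsetof, size_t */

#define CACHE_LINE 64  /* assumed cache line size, for illustration only */

/* Hypothetical stand-in for an fdb entry; not the real kernel struct. */
struct fdb_sketch {
    /* Read-mostly lookup key: read by every CPU that walks the hash chain. */
    unsigned char  mac[6];
    unsigned short vid;
    void          *dst;

    /* Write-heavy timestamps start on a cache line of their own, so
     * updating them does not invalidate the line holding the key. */
    alignas(CACHE_LINE) unsigned long updated;
    unsigned long used;
};

/* Returns nonzero when two byte offsets fall into the same cache line. */
int same_line(size_t a, size_t b)
{
    return a / CACHE_LINE == b / CACHE_LINE;
}

/* Compile-time check that the write-heavy part is line-aligned. */
static_assert(offsetof(struct fdb_sketch, updated) % CACHE_LINE == 0,
              "write-heavy members must start a new cache line");
```

With this layout a lookup that only compares mac/vid touches the first line, while writes to used/updated dirty the second one.
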
>
> Some results from tests I've run:
> (note that these were run in good conditions for the baseline, everything
> ran on a single NUMA node and there were only 3 fdbs)
>
> 1. baseline
> 100% Load HitM on the fdbs (between everyone who has done lookups and hit
> one of the 3 hash chains of the communicating
> src/dst fdbs)
> Overall 5.06% Load HitM for the bridge, first place in the list
>
> 2. patched & ports bound to CPUs
> 0% Local load HitM, bridge is not even in the c2c report list
> Also there's 3% consistent improvement in netperf tests.
What tool are you using to measure this?
>
> Thanks,
> Nik
>
> Nikolay Aleksandrov (4):
>   bridge: modify bridge and port to have often accessed fields in one
>     cache line
>   bridge: move to workqueue gc
>   bridge: move write-heavy fdb members in their own cache line
>   bridge: fdb: write to used and updated at most once per jiffy
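
The title of the last patch suggests a compare-before-store pattern. A minimal sketch of that idea, assuming a plain tick counter passed in as `now` (in the kernel this would be `jiffies`); `touch_once_per_tick()` is a hypothetical helper for illustration and is not taken from the patch:

```c
#include <assert.h>  /* assert */
#include <stddef.h>  /* NULL */

/* Hypothetical helper: store the timestamp only when it has changed.
 * Returns 1 if a write happened, 0 if the store was skipped. */
int touch_once_per_tick(unsigned long *field, unsigned long now)
{
    assert(field != NULL);

    if (*field == now)
        return 0;  /* same tick: read-only access, the line is not dirtied */

    *field = now;  /* first hit in this tick: a single write */
    return 1;
}
```

Repeated hits on the same entry within one tick then only read the field, so the cache line is dirtied at most once per tick rather than on every packet.
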
>
>  net/bridge/br_device.c    |  1 +
>  net/bridge/br_fdb.c       | 34 +++++++++++++++++-----------
>  net/bridge/br_if.c        |  2 +-
>  net/bridge/br_input.c     |  3 ++-
>  net/bridge/br_ioctl.c     |  2 +-
>  net/bridge/br_netlink.c   |  2 +-
>  net/bridge/br_private.h   | 57 +++++++++++++++++++++++------------------------
>  net/bridge/br_stp.c       |  2 +-
>  net/bridge/br_stp_if.c    |  4 ++--
>  net/bridge/br_stp_timer.c |  2 --
>  net/bridge/br_sysfs_br.c  |  2 +-
>  11 files changed, 59 insertions(+), 52 deletions(-)

Looks good, thanks. I wonder how this impacts smaller workloads.

Reviewed-by: Stephen Hemminger <stephen@...workplumber.org>