Message-ID: <ZB2baIDIPhxj5Vdl@debian.me>
Date: Fri, 24 Mar 2023 19:45:28 +0700
From: Bagas Sanjaya <bagasdotme@...il.com>
To: Feng Tang <feng.tang@...el.com>, Jonathan Corbet <corbet@....net>,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
Randy Dunlap <rdunlap@...radead.org>
Cc: Arnaldo Carvalho de Melo <acme@...hat.com>,
Joe Mario <jmario@...hat.com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Eric Dumazet <edumazet@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>, dave.hansen@...el.com,
ying.huang@...el.com, tim.c.chen@...el.com, andi.kleen@...el.com
Subject: Re: [PATCH v1] Documentation: Add document for false sharing
On Fri, Mar 24, 2023 at 03:13:16PM +0800, Feng Tang wrote:
> +There are many real-world cases of performance regressions caused by
> +false sharing, and one is a rw_semaphore 'mmap_lock' inside struct
"... . One of these is rw_semaphore 'mmap_lock' ..."
But I think in English we commonly name things as "foobar struct"
instead of "struct foobar" (that is, the common noun follows the proper
noun that names it).
> +* A global datum accessed (shared) by many CPUs
Global data?
> +Following 'mitigation' section provides real-world examples.
"Real-world examples are given in the 'Possible mitigations' section."
> + #perf c2c record -ag sleep 3
> + #perf c2c report --call-graph none -k vmlinux
Are these commands really run as root?
> +
> +Run it when testing will-it-scale's tlb_flush1 case, and the report
> +has pieces like::
"When running the above while testing ..., perf reports something like::"
> +False sharing hurting performance cases are seen more frequently with
> +core count increasing, and there have been many patches merged to
> +solve it, like in networking and memory management subsystems. Some
> +common mitigations (with examples) are:
"... Because of these detrimental effects, many patches have been
proposed across a variety of subsystems (like networking and memory
management) and merged."
> +
> +* Separate hot global data in its own dedicated cache line, even if it
> + is just a 'short' type. The downside is more consumption of memory,
> + cache line and TLB entries.
> +
> + Commit 91b6d3256356 ("net: cache align tcp_memory_allocated, tcp_sockets_allocated")
> +
> +* Reorganize the data structure, separate the interfering members to
> + different cache lines. One downside is it may introduce new false
> + sharing of other members.
> +
> + Commit 802f1d522d5f ("mm: page_counter: re-layout structure to reduce false sharing")
> +
> +* Replace 'write' with 'read' when possible, especially in loops.
> + Like for some global variable, use compare(read)-then-write instead
> + of unconditional write. For example, use:
"... For example, write::"
> +
> + if (!test_bit(XXX))
> + set_bit(XXX);
> +
> + instead of directly "set_bit(XXX);", similarly for atomic_t data.
> +
> + Commit 7b1002f7cfe5 ("bcache: fixup bcache_dev_sectors_dirty_add() multithreaded CPU false sharing")
> + Commit 292648ac5cf1 ("mm: gup: allow FOLL_PIN to scale in SMP")
> +
> +* Turn hot global data to 'per-cpu data + global data' when possible,
> + or reasonably increase the threshold for syncing per-cpu data to
> + global data, to reduce or postpone the 'write' to that global data.
> +
> + Commit 520f897a3554 ("ext4: use percpu_counters for extent_status cache hits/misses")
> + Commit 56f3547bfa4d ("mm: adjust vm_committed_as_batch according to vm overcommit policy")
IMO it's odd to jump to specifying example commits without some sort of
conjunction (e.g. "for example, see commit <commit>").
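
The check-before-write mitigation could also be illustrated outside of
kernel context. Below is a minimal userspace sketch using C11 atomics as a
stand-in for test_bit()/set_bit(). The names `flags` and
`set_flag_if_clear()` are made up for illustration and are not kernel API;
relaxed ordering is assumed to be sufficient for the flag in question.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag word shared by many threads (stand-in for a kernel bitmap). */
static _Atomic unsigned long flags;

/*
 * Read the word first and only issue the atomic read-modify-write when the
 * bit is still clear. When the bit is already set, the cache line is only
 * read, so it does not have to be taken exclusively by this CPU.
 *
 * Returns true if this call performed the write, false otherwise.
 */
static bool set_flag_if_clear(_Atomic unsigned long *word, unsigned int bit)
{
	unsigned long mask = 1UL << bit;

	if (atomic_load_explicit(word, memory_order_relaxed) & mask)
		return false;	/* already set: skip the write */

	atomic_fetch_or_explicit(word, mask, memory_order_relaxed);
	return true;
}
```

Two threads may both see the bit clear and both write, which is harmless
here since setting an already-set bit is idempotent; the point is only to
avoid the write in the common already-set case.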
> +
> +Surely, all mitigations should be carefully verified to not cause side
> +effects. And to avoid false sharing in advance during coding, it's
> +better to:
> +
> +* Be aware of cache line boundaries
> +* Group mostly read-only fields together
> +* Group things that are written at the same time together
> +* Separate known read-mostly and written-mostly fields
Proactively prevent false sharing with the above tips?
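
A short layout example might help readers apply the tips. Here is a rough
sketch in plain C11, assuming 64-byte cache lines; `struct demo_stats` and
`CACHE_LINE_SIZE` are hypothetical names (in-kernel code would more likely
use the ____cacheline_aligned annotations from <linux/cache.h>).

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64	/* assumption: 64-byte cache lines */

struct demo_stats {
	/* Read-mostly fields, grouped together on the first cache line. */
	unsigned long limit;
	unsigned long mode;

	/* Frequently written field, pushed to the start of its own line. */
	alignas(CACHE_LINE_SIZE) unsigned long counter;
};

/* Compile-time check that the hot field does not share the readers' line. */
static_assert(offsetof(struct demo_stats, counter) % CACHE_LINE_SIZE == 0,
	      "counter must start on a cache line boundary");
static_assert(offsetof(struct demo_stats, mode) < CACHE_LINE_SIZE,
	      "read-mostly fields must stay on the first cache line");
```

The cost is the one the patch already mentions: the structure grows to two
full cache lines even though it holds only three words of data.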
Thanks.
--
An old man doll... just what I always wanted! - Clara