Message-ID: <400f896ae17c66f9f68d9b5f248c13cd5ece6158.camel@redhat.com>
Date: Fri, 13 May 2022 14:22:28 +0200
From: Nicolas Saenz Julienne <nsaenzju@...hat.com>
To: Mel Gorman <mgorman@...hsingularity.net>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: Marcelo Tosatti <mtosatti@...hat.com>,
Vlastimil Babka <vbabka@...e.cz>,
Michal Hocko <mhocko@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH 5/6] mm/page_alloc: Protect PCP lists with a spinlock
On Thu, 2022-05-12 at 09:50 +0100, Mel Gorman wrote:
> Currently the PCP lists are protected by using local_lock_irqsave to
> prevent migration and IRQ reentrancy, but this is inconvenient. Remote
> draining of the lists is impossible, a workqueue is required instead,
> and every task allocation/free must disable and then re-enable
> interrupts, which is expensive.
>
> As preparation for dealing with both of those problems, protect the
> lists with a spinlock. The IRQ-unsafe version of the lock is used
> because IRQs are already disabled by local_lock_irqsave. spin_trylock
> is used in preparation for a time when local_lock could be used
> instead of local_lock_irqsave.
>
> The per_cpu_pages still fits within the same number of cache lines
> after this patch relative to before the series.
>
> struct per_cpu_pages {
>         spinlock_t                 lock;                 /*     0     4 */
>         int                        count;                /*     4     4 */
>         int                        high;                 /*     8     4 */
>         int                        batch;                /*    12     4 */
>         short int                  free_factor;          /*    16     2 */
>         short int                  expire;               /*    18     2 */
>
>         /* XXX 4 bytes hole, try to pack */
>
>         struct list_head           lists[13];            /*    24   208 */
>
>         /* size: 256, cachelines: 4, members: 7 */
>         /* sum members: 228, holes: 1, sum holes: 4 */
>         /* padding: 24 */
> } __attribute__((__aligned__(64)));
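
(The layout above is pahole output; anyone wanting to double-check it
against their own config should be able to reproduce it with something
like "pahole -C per_cpu_pages vmlinux".)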
>
> There is overhead in the fast path due to acquiring the spinlock even
> though the spinlock is per-cpu and uncontended in the common case.
> Page Fault Test (PFT) reported the following results on a 1-socket
> machine.
>
>                                      5.18.0-rc1             5.18.0-rc1
>                                         vanilla       mm-pcpdrain-v2r1
> Hmean     faults/sec-1    886331.5718 (   0.00%)   885462.7479 (  -0.10%)
> Hmean     faults/sec-3   2337706.1583 (   0.00%)  2332130.4909 *  -0.24%*
> Hmean     faults/sec-5   2851594.2897 (   0.00%)  2844123.9307 (  -0.26%)
> Hmean     faults/sec-7   3543251.5507 (   0.00%)  3516889.0442 *  -0.74%*
> Hmean     faults/sec-8   3947098.0024 (   0.00%)  3916162.8476 *  -0.78%*
> Stddev    faults/sec-1      2302.9105 (   0.00%)     2065.0845 (  10.33%)
> Stddev    faults/sec-3      7275.2442 (   0.00%)     6033.2620 (  17.07%)
> Stddev    faults/sec-5     24726.0328 (   0.00%)    12525.1026 (  49.34%)
> Stddev    faults/sec-7      9974.2542 (   0.00%)     9543.9627 (   4.31%)
> Stddev    faults/sec-8      9468.0191 (   0.00%)     7958.2607 (  15.95%)
> CoeffVar  faults/sec-1         0.2598 (   0.00%)        0.2332 (  10.24%)
> CoeffVar  faults/sec-3         0.3112 (   0.00%)        0.2587 (  16.87%)
> CoeffVar  faults/sec-5         0.8670 (   0.00%)        0.4404 (  49.21%)
> CoeffVar  faults/sec-7         0.2815 (   0.00%)        0.2714 (   3.60%)
> CoeffVar  faults/sec-8         0.2399 (   0.00%)        0.2032 (  15.28%)
>
> There is a small hit in the number of faults per second, but given
> that the results are more stable, it's borderline noise.
>
> Signed-off-by: Mel Gorman <mgorman@...hsingularity.net>
> Tested-by: Minchan Kim <minchan@...nel.org>
> Acked-by: Minchan Kim <minchan@...nel.org>
> ---
Reviewed-by: Nicolas Saenz Julienne <nsaenzju@...hat.com>
Thanks,
--
Nicolás Sáenz