linux-kernel - Re: [PATCH] mm: make allocation counters per-order

Message-ID: <20170706144634.GB14840@castle>
Date:   Thu, 6 Jul 2017 15:46:34 +0100
From:   Roman Gushchin <guro@...com>
To:     Mel Gorman <mgorman@...hsingularity.net>
CC:     <linux-mm@...ck.org>, Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...e.com>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Rik van Riel <riel@...hat.com>, <kernel-team@...com>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: make allocation counters per-order

On Thu, Jul 06, 2017 at 02:19:41PM +0100, Mel Gorman wrote:
> On Thu, Jul 06, 2017 at 02:04:31PM +0100, Roman Gushchin wrote:
> > High-order allocations are obviously more costly, and it's very useful
> > to know how many of them happen, if there are any issues
> > (or suspicions) with memory fragmentation.
> > 
> > This commit changes existing per-zone allocation counters to be
> > per-zone per-order. These counters are displayed using a new
> > procfs interface (similar to /proc/buddyinfo):
> > 
> > $ cat /proc/allocinfo
> >      DMA          0          0          0          0          0 \
> >        0          0          0          0          0          0
> >    DMA32          3          0          1          0          0 \
> >        0          0          0          0          0          0
> >   Normal    4997056      23594      10902      23686        931 \
> >       23        122        786         17          1          0
> >  Movable          0          0          0          0          0 \
> >        0          0          0          0          0          0
> >   Device          0          0          0          0          0 \
> >        0          0          0          0          0          0
> > 
> > The existing vmstat interface remains untouched*, and still shows
> > the total number of single page allocations, so high-order allocations
> > are represented as a corresponding number of order-0 allocations.
> > 
> > $ cat /proc/vmstat | grep alloc
> > pgalloc_dma 0
> > pgalloc_dma32 7
> > pgalloc_normal 5461660
> > pgalloc_movable 0
> > pgalloc_device 0
> > 
> > * I've added device zone for consistency with other zones,
> > and to avoid messy exclusion of this zone in the code.
> > 
> 
> The alloc counter updates are themselves a surprisingly heavy cost to
> the allocation path and this makes it worse for a debugging case that is
> relatively rare. I'm extremely reluctant for such a patch to be added
> given that the tracepoints can be used to assemble such a monitor even
> if it means running a userspace daemon to keep track of it. Would such a
> solution be suitable? Failing that if this is a severe issue, would it be
> possible to at least make this a compile-time or static tracepoint option?
> That way, only people that really need it have to take the penalty.

I tried to measure the difference between a kernel with my patch applied
and one without any accounting at all (__count_alloc_event() redefined to
an empty function), and I wasn't able to find any measurable difference.
Could you please provide more details on what your scenario looked like
when the alloc counters were costly?

Since the new counters replace an old one, and both are per-cpu counters,
I believe the difference should be really small.

If there is a case where the difference is meaningful,
I'll, of course, make the whole thing a compile-time option.

Thank you!
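
For reference, the relation between the per-order counters and the vmstat
totals described in the quoted patch text can be sketched in userspace as
follows. This is only an illustration, not part of the patch: the helper
name pages_from_order_counts() is hypothetical, and the numbers are copied
by hand from the /proc/allocinfo sample above. It assumes that an order-N
allocation is accounted as 2^N single pages.

```python
def pages_from_order_counts(counts):
    """Return the total number of pages for per-order allocation counts.

    counts[order] is the number of allocations of that order; an order-N
    allocation is assumed to cover 2**N pages.
    """
    return sum(count << order for order, count in enumerate(counts))


# Per-order counts copied from the /proc/allocinfo sample (orders 0..10).
dma32 = [3, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
normal = [4997056, 23594, 10902, 23686, 931, 23, 122, 786, 17, 1, 0]

print("DMA32 ", pages_from_order_counts(dma32))
print("Normal", pages_from_order_counts(normal))
```

For the DMA32 row this gives 7, which matches pgalloc_dma32 in the vmstat
sample. For the Normal row it gives 5406252 rather than the 5461660 shown
for pgalloc_normal; the message does not say whether the two outputs were
captured at the same moment, so the gap should not be read as an
accounting error.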