Date:   Thu, 28 Jan 2021 15:22:22 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     Roman Gushchin <guro@...com>, Matthew Wilcox <willy@...radead.org>,
        Mike Rapoport <rppt@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Andy Lutomirski <luto@...nel.org>,
        Arnd Bergmann <arnd@...db.de>, Borislav Petkov <bp@...en8.de>,
        Catalin Marinas <catalin.marinas@....com>,
        Christopher Lameter <cl@...ux.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        David Hildenbrand <david@...hat.com>,
        Elena Reshetova <elena.reshetova@...el.com>,
        "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
        James Bottomley <jejb@...ux.ibm.com>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Mark Rutland <mark.rutland@....com>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Michael Kerrisk <mtk.manpages@...il.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Rick Edgecombe <rick.p.edgecombe@...el.com>,
        Shuah Khan <shuah@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Tycho Andersen <tycho@...ho.ws>, Will Deacon <will@...nel.org>,
        linux-api@...r.kernel.org, linux-arch@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-kselftest@...r.kernel.org, linux-nvdimm@...ts.01.org,
        linux-riscv@...ts.infradead.org, x86@...nel.org,
        Hagen Paul Pfeifer <hagen@...u.net>,
        Palmer Dabbelt <palmerdabbelt@...gle.com>
Subject: Re: [PATCH v16 08/11] secretmem: add memcg accounting

On Thu 28-01-21 06:05:11, Shakeel Butt wrote:
> On Wed, Jan 27, 2021 at 11:59 PM Michal Hocko <mhocko@...e.com> wrote:
> >
> > On Wed 27-01-21 10:42:13, Roman Gushchin wrote:
> > > On Tue, Jan 26, 2021 at 04:05:55PM +0100, Michal Hocko wrote:
> > > > On Tue 26-01-21 14:48:38, Matthew Wilcox wrote:
> > > > > On Mon, Jan 25, 2021 at 11:38:17PM +0200, Mike Rapoport wrote:
> > > > > > I cannot use __GFP_ACCOUNT because cma_alloc() does not use gfp.
> > > > > > Besides, kmem accounting with __GFP_ACCOUNT does not seem
> > > > > > to update stats and there was an explicit request for statistics:
> > > > > >
> > > > > > https://lore.kernel.org/lkml/CALo0P13aq3GsONnZrksZNU9RtfhMsZXGWhK1n=xYJWQizCd4Zw@mail.gmail.com/
> > > > > >
> > > > > > As for (ab)using NR_SLAB_UNRECLAIMABLE_B, as it was already discussed here:
> > > > > >
> > > > > > https://lore.kernel.org/lkml/20201129172625.GD557259@kernel.org/
> > > > > >
> > > > > > I think that a dedicated stats counter would be too much at the moment and
> > > > > > NR_SLAB_UNRECLAIMABLE_B is the only explicit stat for unreclaimable memory.
> > > > >
> > > > > That's not true -- Mlocked is also unreclaimable.  And doesn't this
> > > > > feel more like mlocked memory than unreclaimable slab?  It's also
> > > > > Unevictable, so could be counted there instead.
> > > >
> > > > yes, that is indeed true, except the unreclaimable counter is tracking
> > > > the unevictable LRUs. These pages are not on any LRU and that can cause
> > > > some confusion. Maybe they shouldn't be so special and they should live
> > > > on unevictable LRU and get their stats automagically.
> > > >
> > > > I definitely do agree that this would be a better fit than NR_SLAB
> > > > abuse. But considering that this is somehow even more special than mlock
> > > > then a dedicated counter sounds as even better fit.
> > >
> > > I think it depends on how large these areas will be in practice.
> > > If they will be measured in single or double digits MBs, a separate entry
> > > is hardly a good choice: because of the batching the displayed value
> > > will be in the noise range, plus every new vmstat item adds to the
> > > struct mem_cgroup size.
> > >
> > > If it will be measured in GBs, of course, a separate counter is preferred.
> > > So I'd suggest to go with NR_SLAB (which should have been named NR_KMEM)
> > > as now and conditionally switch to a separate counter later.
> >
> > I really do not think the overall usage matters when it comes to abusing
> > other counters. Changing this in the future will always be tricky and there
> > will always be our favorite "Can this break userspace" question. Yes, we dared
> > to change the meaning of some counters but this is not generally possible.
> > Just have a look at how accounting shmem as page cache has turned out
> > to be much more tricky than many would like.
> >
> > Really if a separate counter is a big deal, for which I do not see any
> > big reason, then this should be accounted as unevictable (as suggested
> > by Matthew) and ideally pages of those mappings should be sitting in the
> > unevictable LRU as well unless there is a strong reason against.
> >
> 
> Why not decide based on the movability of these pages? If movable then
> unevictable LRU seems like the right way otherwise NR_SLAB.

I really do not follow. If the page is unevictable, why does movability
matter? I also fail to see why NR_SLAB is even being considered, given
that this is completely outside of slab proper.

Really, what is the point? What are we trying to achieve with the stats?
Do we want to know how much secret memory is used because that is
interesting/important information, or do we just want to do some
accounting?

Just think about it from a practical point of view. I want to know how much
slab memory is used because it can give me an idea whether the kernel is
consuming an unexpected amount of memory. Now I have to subtract _some_
number to get that information. Where do I get that number?
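
To make the subtraction problem concrete, here is a minimal sketch of what a
monitoring script would have to do if secretmem were folded into the
unreclaimable slab counter. The sample values are made up, and
real_unreclaimable_slab_kb() and its secretmem_kb argument are hypothetical
names used only for illustration. The field names follow the usual
/proc/meminfo layout.

```python
# Hypothetical /proc/meminfo excerpt; the values are made up for illustration.
SAMPLE_MEMINFO = """\
MemTotal:       16384000 kB
Slab:             900000 kB
SReclaimable:     500000 kB
SUnreclaim:       400000 kB
Unevictable:       12000 kB
Mlocked:           12000 kB
"""


def parse_meminfo(text):
    """Return a dict mapping each field name to its numeric value (kB here)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def real_unreclaimable_slab_kb(fields, secretmem_kb):
    """Slab proper, assuming secretmem was accounted into SUnreclaim.

    secretmem_kb is exactly the number that nothing exports in that scheme:
    the caller has to obtain or guess it from somewhere else.
    """
    return fields["SUnreclaim"] - secretmem_kb


fields = parse_meminfo(SAMPLE_MEMINFO)
# Assumed (not measured) 64 MB of secretmem hidden inside the slab counter.
print(real_unreclaimable_slab_kb(fields, 64 * 1024))
```

The second argument is the whole problem: without a dedicated counter, or
without the pages showing up under a general class such as Unevictable, the
script has no source for it.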

We have been creative with counters and it tends to backfire much more
often than it helps.

I really do not want this to turn into an endless bike shed, but either
this should be accounted as a general type of memory (unevictable would
be a good fit because this is userspace memory which is not
reclaimable) or it needs its own counter to tell how much of this
specific type of memory is in use.
-- 
Michal Hocko
SUSE Labs
