linux-kernel - Re: [RFC PATCH 00/30] Code tagging framework and applications

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20220901223720.e4gudprscjtwltif@moria.home.lan>
Date:   Thu, 1 Sep 2022 18:37:20 -0400
From:   Kent Overstreet <kent.overstreet@...ux.dev>
To:     Roman Gushchin <roman.gushchin@...ux.dev>
Cc:     Yosry Ahmed <yosryahmed@...gle.com>,
        Michal Hocko <mhocko@...e.com>, Mel Gorman <mgorman@...e.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Johannes Weiner <hannes@...xchg.org>, dave@...olabs.net,
        Matthew Wilcox <willy@...radead.org>, liam.howlett@...cle.com,
        void@...ifault.com, juri.lelli@...hat.com, ldufour@...ux.ibm.com,
        Peter Xu <peterx@...hat.com>,
        David Hildenbrand <david@...hat.com>, axboe@...nel.dk,
        mcgrof@...nel.org, masahiroy@...nel.org, nathan@...nel.org,
        changbin.du@...el.com, ytcoode@...il.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        Steven Rostedt <rostedt@...dmis.org>, bsegall@...gle.com,
        bristot@...hat.com, vschneid@...hat.com,
        Christoph Lameter <cl@...ux.com>,
        Pekka Enberg <penberg@...nel.org>,
        Joonsoo Kim <iamjoonsoo.kim@....com>, 42.hyeyoo@...il.com,
        glider@...gle.com, elver@...gle.com, dvyukov@...gle.com,
        Shakeel Butt <shakeelb@...gle.com>,
        Muchun Song <songmuchun@...edance.com>, arnd@...db.de,
        jbaron@...mai.com, David Rientjes <rientjes@...gle.com>,
        minchan@...gle.com, kaleshsingh@...gle.com,
        kernel-team@...roid.com, Linux-MM <linux-mm@...ck.org>,
        iommu@...ts.linux.dev, kasan-dev@...glegroups.com,
        io-uring@...r.kernel.org, linux-arch@...r.kernel.org,
        xen-devel@...ts.xenproject.org, linux-bcache@...r.kernel.org,
        linux-modules@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 00/30] Code tagging framework and applications

On Thu, Sep 01, 2022 at 03:27:27PM -0700, Roman Gushchin wrote:
> On Wed, Aug 31, 2022 at 01:56:08PM -0700, Yosry Ahmed wrote:
> > This is very interesting work! Do you have any data about the overhead
> > this introduces, especially in a production environment? I am
> > especially interested in memory allocations tracking and detecting
> > leaks.
> 
> +1
> 
> I think the question whether it indeed can be always turned on in the production
> or not is the main one. If not, the advantage over ftrace/bpf/... is not that
> obvious. Otherwise it will be indeed a VERY useful thing.

Low enough overhead to run in production was my primary design goal.

Stats are kept in a struct that's defined at the callsite. So this adds _no_
pointer chasing to the allocation path, unless we've switch to percpu counters
at that callsite (see the lazy percpu counters patch), where we need to deref
one percpu pointer to save an atomic.

Then we need to stash a pointer to the alloc_tag, so that kfree() can find it.
For slab allocations this uses the same storage area as memcg, so for
allocations that are using that we won't be touching any additional cachelines.
(I wanted the pointer to the alloc_tag to be stored inline with the allocation,
but that would've caused alignment difficulties).

Then there's a pointer deref introduced to the kfree() path, to get back to the
original alloc_tag and subtract the allocation from that callsite. That one
won't be free, and with percpu counters we've got another dependent load too -
hmm, it might be worth benchmarking with just atomics, skipping the percpu
counters.

So the overhead won't be zero, I expect it'll show up in some synthetic
benchmarks, but yes I do definitely expect this to be worth enabling in
production in many scenarios.