Message-ID: <f14e71182ebf1520aeede06afb44af49ec6128a0.camel@mediatek.com>
Date:   Fri, 2 Sep 2022 15:36:08 +0800
From:   Kuyo Chang <kuyo.chang@...iatek.com>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
CC:     <major.chen@...sung.com>, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>,
        "Mel Gorman" <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Valentin Schneider <vschneid@...hat.com>,
        Matthias Brugger <matthias.bgg@...il.com>,
        <wsd_upstream@...iatek.com>, <hongfei.tang@...sung.com>,
        <linux-kernel@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        <linux-mediatek@...ts.infradead.org>
Subject: Re: [PATCH 1/1] sched/debug: fix dentry leak in
 update_sched_domain_debugfs

On Fri, 2022-09-02 at 08:58 +0200, Greg Kroah-Hartman wrote:
> On Fri, Sep 02, 2022 at 02:40:59PM +0800, Kuyo Chang wrote:
> > On Fri, 2022-09-02 at 07:26 +0200, Greg Kroah-Hartman wrote:
> > > On Fri, Sep 02, 2022 at 11:15:15AM +0800, Kuyo Chang wrote:
> > > > From: kuyo chang <kuyo.chang@...iatek.com>
> > > > 
> > > > [Symptom]
> > > > The lowmemorykiller is triggered during a CPU hotplug stress test
> > > > using the command below:
> > > > echo [0/1] > /sys/devices/system/cpu/cpu${index}/online
> > > > 
> > > > [Root cause]
> > > > The call trace of the slab owner and its usage after a 4-hour
> > > > hotplug stress test is shown below. There is a dentry leak in
> > > > update_sched_domain_debugfs().
> > > > 
> > > > Total size : 322000KB
> > > > <prep_new_page+44>:
> > > > <get_page_from_freelist+672>:
> > > > <__alloc_pages+304>:
> > > > <allocate_slab+144>:
> > > > <___slab_alloc+404>:
> > > > <__slab_alloc+60>:
> > > > <kmem_cache_alloc+1204>:
> > > > <alloc_inode+100>:
> > > > <new_inode+40>:
> > > > <__debugfs_create_file+172>:
> > > > <update_sched_domain_debugfs+824>:
> > > > <partition_sched_domains_locked+1292>:
> > > > <rebuild_sched_domains_locked+576>:
> > > > <cpuset_hotplug_workfn+1052>:
> > > > <process_one_work+584>:
> > > > <worker_thread+1008>:
> > > > 
> > > > [Solution]
> > > > Provided by Major Chen <major.chen@...sung.com> as below link.
> > > > 
> > > > https://lore.kernel.org/lkml/20220711030341epcms5p173848e98b13c09eb2fcdf2fd7287526a@epcms5p1/
> > > > update_sched_domain_debugfs() uses debugfs_lookup() to find the
> > > > wanted dentry (which was created earlier by debugfs_create_dir()),
> > > > but does not call dput() to return that dentry. This results in a
> > > > dentry leak even though debugfs_remove() is called.
> > > > 
> > > > [Test result]
> > > > Use the commands below to check for inode_cache and dentry leaks:
> > > > cat /proc/slabinfo | grep -w inode_cache
> > > > cat /proc/slabinfo | grep -w dentry
> > > > 
> > > > With the patch, inode_cache and dentry usage stays stable,
> > > > so the lowmemorykiller is no longer triggered.
> > > > 
> > > > Fixes: 8a99b6833c88 ("sched: Move SCHED_DEBUG sysctl to debugfs")
> > > > 
> > > > Signed-off-by: Major Chen <major.chen@...sung.com>
> > > > Signed-off-by: kuyo chang <kuyo.chang@...iatek.com>
> > > > Tested-by: kuyo chang <kuyo.chang@...iatek.com>
> > > > 
> > > > ---
> > > >  kernel/sched/debug.c | 7 +++++--
> > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> > > > index bb3d63bdf4ae..4ffea2dc01da 100644
> > > > --- a/kernel/sched/debug.c
> > > > +++ b/kernel/sched/debug.c
> > > > @@ -412,11 +412,14 @@ void update_sched_domain_debugfs(void)
> > > >  
> > > >  	for_each_cpu(cpu, sd_sysctl_cpus) {
> > > >  		struct sched_domain *sd;
> > > > -		struct dentry *d_cpu;
> > > > +		struct dentry *d_cpu, *d_lookup;
> > > >  		char buf[32];
> > > >  
> > > >  		snprintf(buf, sizeof(buf), "cpu%d", cpu);
> > > > -		debugfs_remove(debugfs_lookup(buf, sd_dentry));
> > > > +		d_lookup = debugfs_lookup(buf, sd_dentry);
> > > > +		debugfs_remove(d_lookup);
> > > > +		if (!IS_ERR_OR_NULL(d_lookup))
> > > > +			dput(d_lookup);
> > > 
> > > That's odd, and means that something else is removing this file
> > > right after we looked it up?  Is there a missing lock here that
> > > should be used instead?
> > > 
> > > thanks,
> > > 
> > > greg k-h
> > 
> > 
> > During CPU hotplug, cpu_active_mask changes, so
> > update_sched_domain_debugfs() has to be called.
> > 
> > The original design recreates the per-CPU dentry under sd_dentry,
> > so it calls debugfs_remove() and then debugfs_create_dir().
> > However, per the documented usage of debugfs_lookup(), the returned
> > dentry must be passed to dput() when it is no longer needed, to
> > avoid a dentry leak.
> 
> Eeeek, nice find!  I've been adding this pattern:
> 	debugfs_remove(debugfs_lookup(...));
> all over the place, and as you point out, that's wrong!
> 
> It's as if I didn't even read the documentation I wrote.
> 
> {sigh}
> 
> Ok, as this is going to be a very common pattern, how about we create:
> 	debugfs_lookup_and_remove()
> function that does the above logic all in one place and then we don't
> have to put that logic everywhere in the kernel.  My goal is for users
> of debugfs to never have to worry about anything about 'struct dentry'
> at all, and I really failed that goal here in a major way.
> 
> I can work on that this afternoon after I get some other things done,
> unless you want to do it now?
> 
> Again, very nice find, thank you for this.
> 

Thanks for your kind support!
Please go ahead and add debugfs_lookup_and_remove(); we can then use
that API to fix this dentry leak.
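
To make the reference counting discussed above concrete, here is a
minimal userspace sketch. It is only a model: struct model_dentry and
all model_* functions are hypothetical stand-ins invented for
illustration, not kernel APIs, and the simplified counting (one
reference held by the directory, one more per lookup) is an assumption
about the behaviour described in this thread. The last function shows
the shape a lookup-and-remove helper could take; the real
debugfs_lookup_and_remove() may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for a reference-counted dentry (not the kernel struct). */
struct model_dentry {
	const char *name;
	int refcount;	/* 1 while linked in the directory, +1 per lookup */
	int linked;	/* non-zero while the entry is visible to lookups */
};

/* Models debugfs_lookup(): on a match, hands the caller an extra reference. */
static struct model_dentry *model_lookup(struct model_dentry *d, const char *name)
{
	if (!d->linked || strcmp(d->name, name) != 0)
		return NULL;
	d->refcount++;
	return d;
}

/* Models debugfs_remove(): unlinks and drops only the directory's reference. */
static void model_remove(struct model_dentry *d)
{
	if (d == NULL || !d->linked)
		return;
	d->linked = 0;
	d->refcount--;
}

/* Models dput(): drops the reference the caller obtained from a lookup. */
static void model_dput(struct model_dentry *d)
{
	if (d != NULL)
		d->refcount--;
}

/* The leaky pattern: the lookup reference is never returned. */
static int leaky_remove(struct model_dentry *d, const char *name)
{
	model_remove(model_lookup(d, name));
	return d->refcount;	/* remaining references */
}

/* Possible shape of the helper: lookup, remove, then drop the lookup ref. */
static int model_lookup_and_remove(struct model_dentry *d, const char *name)
{
	struct model_dentry *found = model_lookup(d, name);

	if (found == NULL)
		return d->refcount;	/* nothing matched, nothing to drop */
	model_remove(found);
	model_dput(found);
	return d->refcount;	/* remaining references */
}
```

In this model an entry that starts with a single reference still holds
one after leaky_remove(), so it could never be freed, while
model_lookup_and_remove() brings the count down to zero.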


> greg k-h
