linux-kernel - Re: [PATCH] fs/mbcache: make count

Date:   Mon, 8 Jan 2018 10:21:15 +0100
From:   Jan Kara <jack@...e.cz>
To:     jiang.biao2@....com.cn
Cc:     jack@...e.cz, viro@...iv.linux.org.uk,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        zhong.weidong@....com.cn
Subject: Re: [PATCH] fs/mbcache: make count_objects more robust.

On Fri 05-01-18 08:54:56, jiang.biao2@....com.cn wrote:
> > On Mon 27-11-17 11:30:19, Jiang Biao wrote:
> > > When running ltp stress test for 7*24 hours, the vmscan occasionally
> > > complains the following warning continuously,
> > >
> > > mb_cache_scan+0x0/0x3f0 negative objects to delete
> > > nr=-9232265467809300450
> > > ...
> > >
> > > The tracing result shows the freeable(mb_cache_count returns)
> > > is -1, which causes the continuous accumulation and overflow of
> > > total_scan.
> > >
> > > This patch make sure the mb_cache_count not return negative value,
> > > which make the mbcache shrinker more robust.
> > >
> > > Signed-off-by: Jiang Biao <jiang.biao2@....com.cn>
> > 
> > Going through some old email...
> > a) c_entry_count is unsigned so your patch is a nop as Coverity properly
> > noticed.
> Indeed, would the following casting be good?
> +    if (unlikely((int)(cache->c_entry_count) < 0))
> +        return 0;

That check would at least have a chance of hitting but still it is just
hiding the real problem.
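
To illustrate why the cast only has "a chance of hitting", here is a
minimal userspace sketch, not kernel code. It assumes the counter is an
unsigned long and that int is 32 bits wide; the demo_* names are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Decrementing an unsigned counter that is already 0 wraps to ULONG_MAX. */
static unsigned long demo_decrement(unsigned long count)
{
	return count - 1;
}

/*
 * The proposed cast-based check. Converting an out-of-range value to int
 * is implementation-defined in C; gcc and clang keep the low bits.
 */
static bool demo_cast_check(unsigned long count)
{
	return (int)count < 0;
}

/* Walk through the cases discussed above. */
void demo_walkthrough(void)
{
	unsigned long wrapped = demo_decrement(0);

	assert(wrapped == ULONG_MAX);      /* a signed long reader sees -1 */
	assert(demo_cast_check(wrapped));  /* the cast catches this case */
	assert(!demo_cast_check(0));       /* a sane count passes */
	/* A corrupted count whose low 32 bits look positive slips through. */
	assert(!demo_cast_check(ULONG_MAX - 0x80000000UL));
}
```

A plain "count < 0" on the unsigned value can never be true, which is why
the original patch was a nop. The cast version reacts to the wrapped value
but not to every corrupted value, and in either case the counter itself
stays wrong.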

> > b) c_entry_count being outside 0..2*cache->c_max_entries is a plain bug. I
> > went through the logic and cannot find out how that could happen though.
> Is there any possibility that decreasing c_entry_count from 0 to -1 
> in mb_cache_entry_delete?

If we think we have -1 entries in a list, we have a larger problem than
just the wrong behavior of the shrinker. This is just a plain counter of
entries protected by a spinlock so there isn't space for accounting errors
or anything like that. If you can reproduce the problem on some reasonably
recent kernel, I'd be interested in debugging this.
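
As a rough sketch of what checking at the source could look like when
debugging, the userspace model below refuses a decrement that would
underflow and tests the 0..2*c_max_entries range mentioned above. This is
an assumption-laden illustration, not the mbcache implementation: struct
demo_cache and both helpers are hypothetical, and the caller is assumed to
hold whatever lock protects the counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the cache; only the two counters are modeled. */
struct demo_cache {
	unsigned long c_entry_count;	/* entries currently on the list */
	unsigned long c_max_entries;	/* count is expected to stay <= 2x this */
};

/* Range check for the invariant 0 <= c_entry_count <= 2 * c_max_entries. */
static bool demo_count_sane(const struct demo_cache *c)
{
	return c->c_entry_count <= 2 * c->c_max_entries;
}

/*
 * Account for one removed entry. Returns false and leaves the counter
 * untouched if the decrement would underflow, so the faulty caller is
 * visible at the point of the bug instead of later in the shrinker.
 */
static bool demo_entry_removed(struct demo_cache *c)
{
	if (c->c_entry_count == 0)
		return false;
	c->c_entry_count--;
	return true;
}
```

In a real kernel build the failing branch would more likely be a warning
with a stack trace than a silent return, so that the unbalanced delete can
be identified.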

								Honza

-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR