linux-ext4 - Re: [PATCH v2] fs/mbcache: make sure mb_cache

Message-Id: <20180108161304.f4be912fb6e20cdf56ae78ef@linux-foundation.org>
Date:   Mon, 8 Jan 2018 16:13:04 -0800
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Jiang Biao <jiang.biao2@....com.cn>
Cc:     linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org,
        tytso@....edu, ebiggers@...gle.com, jack@...e.cz,
        zhong.weidong@....com.cn
Subject: Re: [PATCH v2] fs/mbcache: make sure mb_cache_count() not return
 negative value.

On Tue, 9 Jan 2018 07:38:11 +0800 Jiang Biao <jiang.biao2@....com.cn> wrote:

> When running ltp stress test for 7*24 hours, vmscan occasionally emits the
> following warning continuously:
> 
> mb_cache_scan+0x0/0x3f0 negative objects to delete
> nr=-9232265467809300450
> ....
> 
> Trace info shows that freeable (the value mb_cache_count() returns) is -1,
> which causes total_scan to keep accumulating until it overflows.
> 
> This patch makes sure that mb_cache_count() does not return a negative
> value, which makes the mbcache shrinker more robust.
> 
> ...
>
> --- a/fs/mbcache.c
> +++ b/fs/mbcache.c
> @@ -238,7 +238,11 @@ void mb_cache_entry_delete(struct mb_cache *cache, u32 key, u64 value)
>  			spin_lock(&cache->c_list_lock);
>  			if (!list_empty(&entry->e_list)) {
>  				list_del_init(&entry->e_list);
> -				cache->c_entry_count--;
> +				if (cache->c_entry_count > 0)
> +					cache->c_entry_count--;
> +				else
> +					WARN_ONCE(1, "mbcache: Entry count "
> +                          "going negative!\n");
>  				atomic_dec(&entry->e_refcnt);
>  			}
>  			spin_unlock(&cache->c_list_lock);

I agree with Jan's comment.  We need to figure out how ->c_entry_count
went negative.  mb_cache_count() says this state is "Unlikely, but not
impossible", but from a quick read I can't see how this happens - it
appears that coherency between ->c_list and ->c_entry_count is always
maintained under ->c_list_lock?
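
The invariant in question can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the code in fs/mbcache.c: every name below (model_cache, model_entry, model_entry_add, model_entry_del, model_cache_coherent) is hypothetical, a pthread mutex stands in for ->c_list_lock, a hand-rolled circular list stands in for ->c_list, and the counter is assumed to be an unsigned long. The point it shows is that when the list and the counter are only ever changed together under one lock, a second delete of the same entry is a no-op and the counter cannot drop below zero.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical model types; they only mimic the shape of the problem. */
struct model_entry {
	struct model_entry *prev, *next;
	int on_list;                   /* plays the role of !list_empty(&e_list) */
};

struct model_cache {
	pthread_mutex_t list_lock;     /* stands in for ->c_list_lock */
	struct model_entry head;       /* stands in for ->c_list (circular) */
	unsigned long entry_count;     /* stands in for ->c_entry_count */
};

static void model_cache_init(struct model_cache *c)
{
	pthread_mutex_init(&c->list_lock, NULL);
	c->head.prev = c->head.next = &c->head;
	c->head.on_list = 0;
	c->entry_count = 0;
}

/* Link the entry at the tail and bump the counter in one critical section. */
static void model_entry_add(struct model_cache *c, struct model_entry *e)
{
	pthread_mutex_lock(&c->list_lock);
	e->prev = c->head.prev;
	e->next = &c->head;
	c->head.prev->next = e;
	c->head.prev = e;
	e->on_list = 1;
	c->entry_count++;
	pthread_mutex_unlock(&c->list_lock);
}

/* Unlink and decrement together; returns 1 if removed, 0 if already gone. */
static int model_entry_del(struct model_cache *c, struct model_entry *e)
{
	int removed = 0;

	pthread_mutex_lock(&c->list_lock);
	if (e->on_list) {
		/* Plays the role of the WARN_ONCE() in the proposed patch. */
		assert(c->entry_count > 0);
		e->prev->next = e->next;
		e->next->prev = e->prev;
		e->prev = e->next = e;
		e->on_list = 0;
		c->entry_count--;
		removed = 1;
	}
	pthread_mutex_unlock(&c->list_lock);
	return removed;
}

/* Walk the list under the lock and compare its length with the counter. */
static int model_cache_coherent(struct model_cache *c)
{
	unsigned long walked = 0;
	int ok;

	pthread_mutex_lock(&c->list_lock);
	for (struct model_entry *e = c->head.next; e != &c->head; e = e->next)
		walked++;
	ok = (walked == c->entry_count);
	pthread_mutex_unlock(&c->list_lock);
	return ok;
}
```

In this model, calling model_entry_del() twice on the same entry decrements the counter only once, so model_cache_coherent() keeps returning 1. If the real counter does go wrong, that suggests looking for a path that touches the list or the counter outside the lock, or that updates one without the other, rather than clamping the value at the point of decrement.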