Date:	Tue, 28 Apr 2009 11:38:00 +0900
From:	nishimura@....nes.nec.co.jp
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
cc:	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"balbir@...ux.vnet.ibm.com" <balbir@...ux.vnet.ibm.com>,
	"hugh@...itas.com" <hugh@...itas.com>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	nishimura@....nes.nec.co.jp
Subject: Re: [PATCH] fix leak of swap accounting as stale swap cache under
 memcg

> On Tue, 28 Apr 2009 10:09:30 +0900
> nishimura@....nes.nec.co.jp wrote:
> 
>> > On Mon, 27 Apr 2009 21:08:56 +0900
>> > Daisuke Nishimura <d-nishimura@....biglobe.ne.jp> wrote:
>> > 
>> >> > Index: mmotm-2.6.30-Apr24/mm/vmscan.c
>> >> > ===================================================================
>> >> > --- mmotm-2.6.30-Apr24.orig/mm/vmscan.c
>> >> > +++ mmotm-2.6.30-Apr24/mm/vmscan.c
>> >> > @@ -661,6 +661,9 @@ static unsigned long shrink_page_list(st
>> >> >  		if (PageAnon(page) && !PageSwapCache(page)) {
>> >> >  			if (!(sc->gfp_mask & __GFP_IO))
>> >> >  				goto keep_locked;
>> >> > +			/* avoid making more stale swap caches */
>> >> > +			if (memcg_stale_swap_congestion())
>> >> > +				goto keep_locked;
>> >> >  			if (!add_to_swap(page))
>> >> >  				goto activate_locked;
>> >> >  			may_enter_fs = 1;
>> >> > 
>> >> Well, as I mentioned before (http://marc.info/?l=linux-kernel&m=124066623510867&w=2),
>> >> this cannot avoid type-2 (set !PageCgroupUsed by the owner process via
>> >> page_remove_rmap()->mem_cgroup_uncharge_page() before being added to swap cache).
>> >> If these swap caches go through shrink_page_list() without being freed
>> >> for some reason, these swap caches don't go back to memcg's LRU.
>> >> 
>> >> Type-2 doesn't pressure memsw.usage, but you can see it by plotting
>> >> "grep SwapCached /proc/meminfo".
>> >> 
>> >> And I don't think it's a good idea to add memcg_stale_swap_congestion() here.
>> >> It reduces the chance of reclaiming pages.
>> >> 
>> > Hmm. maybe adding congestion_wait() ?
>> > 
>> I don't think any hook before add_to_swap() is needed.
>> 
>> >> Do you dislike the patch I attached in the above mail ?
>> >> 
>> > I doubt whether the patch covers all type-2 cases.
>> > 
>> hmm, I didn't see any leak anymore when I tested the patch.
>> 
> 
> First, your patch
> ==
>  		if (PageAnon(page) && !PageSwapCache(page)) {
>  			if (!(sc->gfp_mask & __GFP_IO))
>  				goto keep_locked;
> -			/* avoid making more stale swap caches */
> -			if (memcg_stale_swap_congestion())
> -				goto keep_locked;
>  			if (!add_to_swap(page))
>  				goto activate_locked;
> +			/*
> +			 * The owner process might have uncharged the page
> +			 * (by page_remove_rmap()) before it has been added
> +			 * to swap cache.
> +			 * Check it here to avoid making it stale.
> +			 */
> +			if (memcg_free_unused_swapcache(page))
> +				goto keep_locked;
>  			may_enter_fs = 1;
>  		}
> ==
> Should be
> ==
> 
> 	if (PageAnon(page) && !PageSwapCache(page)) {
> 		... // don't modify here
> 	}
> 	if (PageAnon(page) && PageSwapCache(page) && !page_mapped(page)) {
> 		if (try_to_free_page(page)) // or memcg_free_unused_swapcache()
> 			goto free_it;
> 	}
> ==
> I think.
> 
It may work too.

But if the page is already on swap cache when page_remove_rmap()
-> mem_cgroup_uncharge_page() runs, the page is not uncharged.
So it can be freed in the long run by memcg's LRU scanning, via
shrink_page_list()->pageout()->swap_writepage()->try_to_free_swap().

I added the hook there just because I wanted to clarify what the
problematic case is.

And I don't think "goto free_it" is good.
It calls free_hot_cold_page(), but some process (like swapoff) might
already hold the swap cache page and be waiting for the page lock.

> And we need a hook in free_swap_and_cache() for handling the PageWriteback() case.
> 
Ah, you're right.


Thanks,
Daisuke Nishimura.