lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20160602151149.GT1995@dhcp22.suse.cz>
Date:	Thu, 2 Jun 2016 17:11:49 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Mel Gorman <mgorman@...hsingularity.net>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: zone_reclaimable() leads to livelock in __alloc_pages_slowpath()

On Wed 01-06-16 23:38:30, Oleg Nesterov wrote:
> On 06/01, Michal Hocko wrote:
> >
> > On Wed 01-06-16 01:56:26, Oleg Nesterov wrote:
> > > On 05/31, Michal Hocko wrote:
> > > >
> > > > On Sun 29-05-16 23:25:40, Oleg Nesterov wrote:
> > > > >
> > > > > This single change in get_scan_count() under for_each_evictable_lru() loop
> > > > >
> > > > > 	-	size = lruvec_lru_size(lruvec, lru);
> > > > > 	+	size = zone_page_state_snapshot(lruvec_zone(lruvec), NR_LRU_BASE + lru);
> > > > >
> > > > > fixes the problem too.
> > > > >
> > > > > Without this change shrink*() continues to scan the LRU_ACTIVE_FILE list
> > > > > while it is empty. LRU_INACTIVE_FILE is not empty (just a few pages) but
> > > > > we do not even try to scan it, lruvec_lru_size() returns zero.
> > > >
> > > > OK, you seem to be really seeing a different issue than me.
> > >
> > > quite possibly, but
> > >
> > > > My debugging
> > > > patch was showing when nothing was really isolated from the LRU lists
> > > > (both for shrink_{in}active_list.
> > >
> > > in my debugging session too. LRU_ACTIVE_FILE was empty, so there is nothing to
> > > isolate even if shrink_active_list() is (wrongly called) with nr_to_scan != 0.
> > > LRU_INACTIVE_FILE is not empty but it is not scanned because nr_to_scan == 0.
> > >
> > > But I am afraid I misunderstood you, and you meant something else.
> >
> > What I wanted to say is that my debugging hasn't shown a single case
> > when nothing would be isolated. Which seems to be the case for you.
> 
> Ah, got it, thanks. Yes, I see that there is no "nothing scanned" in
> oom-test.qcow_serial.log.gz from http://marc.info/?l=linux-kernel&m=146417822608902
> you sent. I applied this patch and I do see "nothing scanned".
> 
> But, unlike you, I do not see the messages from free-pages... perhaps you
> have more active tasks. To remind, I tested this with the single user-space
> process, /bin/sh running with pid==1, then I did "while true; do ./oom; done".

Well, I was booting into a standard init which will have a couple of
processes. So yes this would make a slight difference.
 
> So of course I do not know if you see another issue or the same, but now I am
> wondering if the change in get_scan_count() above fixes the problem for you.

I have played with it but the interfering freed pages just ruined the
whole zone_reclaimable expectations.
 
> Probably not, but the fact you do not see "nothing scanned" can't prove this,
> it is possible that shrink_*_list() was not called because vm_stat == 0 but
> zone_reclaimable() sees the per-cpu counter. In this case 0db2cb8da89d can
> make a difference, but see below.
> 
> > > > But I am thinking whether we should simply revert 0db2cb8da89d ("mm,
> > > > vmscan: make zone_reclaimable_pages more precise") in 4.6 stable tree.
> > > > Does that help as well?
> > >
> > > I'll test this tomorrow,
> 
> So it doesn't help.

OK, so we at least know this is not a regression.

> > but even if it helps I am not sure... Yes, this
> > > way zone_reclaimable() and get_scan_count() will see the same numbers, but
> > > how this can help to make zone_reclaimable() == F at the end?
> >
> > It won't in some cases.
> 
> And unless I am notally confused  hit exactly this case.
> 
> > And that has been the case for ages so I do not
> > think we need any steps for the stable.
> 
> OK, agreed.
> 
> > What meant to address is a
> > potential regression caused by 0db2cb8da89d which would make this more
> > likely because of the mismatch
> 
> Again, I can be easily wrong, but I do not see how 0db2cb8da89d could make
> the things worse...
> 
> Unless both get_scan_count() and zone_reclaimable() use "snapshot" variant,
> we can't guarantee zone_reclaimable() becomes false. The fact that they see
> different numbers (after 0db2cb8da89d) doesn't really matter.
> 
> Anyway, this was already fixed, so lets forget it ;)

Yes, especially as this doesn't seem to be a regression.

Thanks for your effort anyway.
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ