Message-ID: <20160524224341.GA11961@redhat.com>
Date: Wed, 25 May 2016 00:43:41 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Michal Hocko <mhocko@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Mel Gorman <mgorman@...hsingularity.net>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: zone_reclaimable() leads to livelock in __alloc_pages_slowpath()

On 05/24, Michal Hocko wrote:
>
> On Mon 23-05-16 17:14:19, Oleg Nesterov wrote:
> > On 05/23, Michal Hocko wrote:
> [...]
> > > Could you add some tracing and see what are the numbers
> > > above?
> >
> > with the patch below I can press Ctrl-C when it hangs, this breaks the
> > endless loop and the output looks like
> >
> > vmscan: ZONE=ffffffff8189f180 0 scanned=0 pages=6
> > vmscan: ZONE=ffffffff8189eb00 0 scanned=1 pages=0
> > ...
> > vmscan: ZONE=ffffffff8189eb00 0 scanned=2 pages=1
> > vmscan: ZONE=ffffffff8189f180 0 scanned=4 pages=6
> > ...
> > vmscan: ZONE=ffffffff8189f180 0 scanned=4 pages=6
> > vmscan: ZONE=ffffffff8189f180 0 scanned=4 pages=6
> >
> > the numbers are always small.
>
> Small but scanned is not 0 and constant which means it either gets reset
> repeatedly (something gets freed) or we have stopped scanning. Which
> pattern can you see? I assume that the swap space is full at the time
> (could you add get_nr_swap_pages() to the output).

No, I tested this without swap.

> Also zone->name would
> be better than the pointer.

Yes, I forgot to mention: this is DMA32. As a reminder, the machine has only
512m of RAM, so this is natural.

> I am trying to reproduce but your test case always hits the oom killer:
Did you try to run it in a loop? Usually it takes a while before the system
hangs.
> Swap: 138236 57740 80496

Perhaps this makes a difference? See above, I have no swap.

So. I spent almost the whole day trying to understand what's going on, and
of course I failed.

But. It _seems to me_ that the kernel "leaks" some pages in the LRU_INACTIVE_FILE
list because inactive_file_is_low() returns the wrong value. And do not even
ask me why I think so, it is unlikely I will be able to explain ;) As a reminder,
I had never tried to read vmscan.c before.

But. If I change lruvec_lru_size()

	- return zone_page_state(lruvec_zone(lruvec), NR_LRU_BASE + lru);
	+ return zone_page_state_snapshot(lruvec_zone(lruvec), NR_LRU_BASE + lru);

the problem goes away too.

As a reminder, it also goes away if I change calculate_normal_threshold() to
return zero, and it was not clear why. Now we can probably conclude that this is
because the change obviously affects lruvec_lru_size().
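
To illustrate why both changes could have the same effect, here is a small
userspace model of a per-zone counter with per-CPU deltas. This is only a
sketch, not kernel code: the toy_* names, the toy_zone layout and NR_CPUS = 4
are made up for the example. It assumes what I understand vmstat.c to do:
updates are accumulated per CPU and folded into the global counter only once
they exceed the threshold, zone_page_state() reads the global counter alone,
and zone_page_state_snapshot() also adds the pending per-CPU deltas. It says
nothing about why the stale value leads to the livelock.

```c
#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* Toy model of one per-zone counter, e.g. NR_LRU_BASE + LRU_INACTIVE_FILE. */
struct toy_zone {
	long vm_stat;				/* global counter */
	signed char vm_stat_diff[NR_CPUS];	/* pending per-CPU deltas */
	int stat_threshold;			/* fold when |delta| exceeds this */
};

/* Update from one CPU: accumulate locally, fold only above the threshold. */
void toy_mod_state(struct toy_zone *z, int cpu, int delta)
{
	int x = z->vm_stat_diff[cpu] + delta;

	if (x > z->stat_threshold || x < -z->stat_threshold) {
		z->vm_stat += x;	/* fold into the global counter */
		x = 0;
	}
	z->vm_stat_diff[cpu] = (signed char)x;
}

/* Cheap read: the global counter only, pending deltas are invisible. */
long toy_page_state(const struct toy_zone *z)
{
	long x = z->vm_stat;

	return x < 0 ? 0 : x;
}

/* Precise read: the global counter plus every CPU's pending delta. */
long toy_page_state_snapshot(const struct toy_zone *z)
{
	long x = z->vm_stat;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		x += z->vm_stat_diff[cpu];

	return x < 0 ? 0 : x;
}
```

In this model, with stat_threshold = 10, two CPUs each adding 6 pages leave
toy_page_state() at 0 while toy_page_state_snapshot() returns 12. With
stat_threshold = 0 every update is folded at once and both readers agree,
which would match what I see with calculate_normal_threshold() returning zero.
The numbers in my trace are small enough to stay below such a threshold.
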
Oleg.