linux-kernel - Re: MIPS: BUG() in isolate_lru

Message-ID: <553BE2A9.2090500@gentoo.org>
Date:	Sat, 25 Apr 2015 14:53:29 -0400
From:	Joshua Kinard <kumba@...too.org>
To:	LKML <linux-kernel@...r.kernel.org>,
	Linux MIPS List <linux-mips@...ux-mips.org>
Subject: Re: MIPS: BUG() in isolate_lru_pages in mm/vmscan.c?

On 04/25/2015 11:56, Joshua Kinard wrote:
> I keep tripping up a BUG() in isolate_lru_pages in mm/vmscan.c:1345:
> 
> 	switch (__isolate_lru_page(page, mode)) {
> 	case 0:
> 		nr_pages = hpage_nr_pages(page);
> 		mem_cgroup_update_lru_size(lruvec, lru, -nr_pages);
> 		list_move(&page->lru, dst);
> 		nr_taken += nr_pages;
> 		break;
> 
> 	case -EBUSY:
> 		/* else it is being freed elsewhere */
> 		list_move(&page->lru, src);
> 		continue;
> 
> 	default:
> 		BUG();
> 	}
> 
> This is on an SGI Onyx2 platform (MIPS, IP27), two node boards (4x R14000
> CPUs), and 8G of RAM.  The problem appears tied to heavy disk I/O, typically
> writes.  I can reproduce sometimes with a long bonnie++ run, but I haven't
> gotten a recent panic() message under 4.0 yet.  Most of the time, it silently
> hardlocks.  I only have serial console access at 9600bps, so it may lock too
> fast before the serial driver can dump the panic.
> 
> Is there any information behind the purpose or triggers of this BUG()?  I went
> back in git all the way to the initial 2006 commit that added this function,
> but could not find any comments or explanation of just what it's protecting
> against.  That makes it hard to know where to start debugging.
> 
> I've already tried switching filesystems, first ext4, now XFS.  Enabling
> CONFIG_NUMA seems to make it harder to trigger, but that's not an objective
> observation.  An md RAID resync doesn't appear to trigger it either.


This patch from 2007-03-16 seems to explain things a little:
http://marc.info/?l=linux-mm-commits&m=117401513810763&w=2

> Subject: lumpy: back out removal of active check in isolate_lru_pages
> From: Andy Whitcroft <apw@...dowen.org>
> 
> As pointed out by Christoph Lameter it should not be possible for a page to
> change its active/inactive state without taking the lru_lock.  Reinstate this
> safety net.
> 
> Signed-off-by: Andy Whitcroft <apw@...dowen.org>
> Acked-by: Mel Gorman <mel@....ul.ie>
> Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> ---
> 
>  mm/vmscan.c |    7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff -puN mm/vmscan.c~lumpy-back-out-removal-of-active-check-in-isolate_lru_pages mm/vmscan.c
> --- a/mm/vmscan.c~lumpy-back-out-removal-of-active-check-in-isolate_lru_pages
> +++ a/mm/vmscan.c
> @@ -686,10 +686,13 @@ static unsigned long isolate_lru_pages(u
>  			nr_taken++;
>  			break;
>  
> -		default:
> -			/* page is being freed, or is a missmatch */
> +		case -EBUSY:
> +			/* else it is being freed elsewhere */
>  			list_move(&page->lru, src);
>  			continue;
> +
> +		default:
> +			BUG();
>  		}
>  
>  		if (!order)

So if my reading is correct, the BUG() is a safety net for the case where a
page changes its active/inactive state without taking the lru_lock.  The SGI
IP27 platform is an early NUMA machine, and its nodes can be physically distant
from one another (which adds some latency).  Could this be a sign of an SMP
race condition specific to this platform?
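
To make that reading concrete, here is a minimal userspace sketch of the
three-way dispatch in the switch quoted above.  It is not kernel code: the
enum and both function names are hypothetical, and -EINVAL is used only as an
example of an unhandled return code, not as a confirmed value from this
machine.

```c
#include <errno.h>

/* Hypothetical labels for the three branches of the quoted switch. */
enum lru_outcome {
	LRU_TAKEN,	/* case 0: page is moved to the dst list */
	LRU_PUT_BACK,	/* case -EBUSY: page is moved back onto src */
	LRU_WOULD_BUG	/* default: the kernel calls BUG() here */
};

/* Map a return code to the branch the quoted switch would take. */
enum lru_outcome classify_isolate_result(int ret)
{
	switch (ret) {
	case 0:
		return LRU_TAKEN;
	case -EBUSY:
		return LRU_PUT_BACK;
	default:
		/* Any other code is treated as "should never happen". */
		return LRU_WOULD_BUG;
	}
}

/* Human-readable name for an outcome, e.g. for a debug message. */
const char *lru_outcome_name(enum lru_outcome outcome)
{
	switch (outcome) {
	case LRU_TAKEN:
		return "taken";
	case LRU_PUT_BACK:
		return "put back";
	default:
		return "would BUG()";
	}
}
```

Only 0 and -EBUSY are tolerated, so classify_isolate_result(-EINVAL) lands in
the default branch.  Knowing which code actually reaches that branch on the
IP27 would narrow down where to look.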

--J