Message-ID: <alpine.LSU.2.00.1206031459450.15427@eggly.anvils>
Date:	Sun, 3 Jun 2012 15:17:36 -0700 (PDT)
From:	Hugh Dickins <hughd@...gle.com>
To:	Dave Jones <davej@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
	Kyungmin Park <kyungmin.park@...sung.com>,
	Marek Szyprowski <m.szyprowski@...sung.com>,
	Mel Gorman <mgorman@...e.de>, Minchan Kim <minchan@...nel.org>,
	Rik van Riel <riel@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Cong Wang <amwang@...hat.com>,
	Markus Trippelsdorf <markus@...ppelsdorf.de>
cc:	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: WARNING: at mm/page-writeback.c:1990
 __set_page_dirty_nobuffers+0x13a/0x170()

On Sun, 3 Jun 2012, Dave Jones wrote:
> On Sun, Jun 03, 2012 at 02:31:39PM -0400, Dave Jones wrote:
>  > On Sun, Jun 03, 2012 at 11:23:29AM -0700, Linus Torvalds wrote:
>  >  > On Sun, Jun 3, 2012 at 11:15 AM, Dave Jones <davej@...hat.com> wrote:
>  >  > >
>  >  > > Things aren't happy with that patch at all.
>  >  > 
>  >  > Yeah, at this point I think we need to just revert the compaction changes.
>  >  > 
>  >  > Guys, what's the minimal set of commits to revert? That clearly buggy
>  >  > "rescue_unmovable_pageblock()" function was introduced by commit
>  >  > 5ceb9ce6fe94, but is that actually involved with the particular bug?
>  >  > That commit seems to revert cleanly still, but is that sufficient or
>  >  > does it even matter?
>  > 
>  > I'll rerun the test with that (and Hugh's last patch) backed out, and see
>  > if that makes any difference.
> 
> running just over two hours with that commit reverted with no obvious ill effects so far.

Yes, and I ran happily with precisely that commit reverted on Friday -
though I've never got the list corruption that you saw with it in.  

The locking bug certainly comes in with that commit, it's an isolated
commit that reverts cleanly, and I think you got the list corruption
rather sooner than two hours before (9min, 30min, 41min from the traces
you sent).

Maybe we should let you run a little longer, or wait for others to comment.

But another strike against that commit: I tried fixing it up to use
start_page instead of page at the end, with the worrying but safer
locking I suggested at first, with a count of how many times it went
there, and how many times it succeeded.

While I ran my usual swapping test (perhaps that's a very unfair test
to run on this, I've no idea) for seven hours, it went there 25406
times (once per second, it appears) and it succeeded... 0 times.
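
As a rough check of the "once per second" estimate, the figures above can
be worked through as below. This is only a sketch: the variable names are
illustrative and are not taken from the instrumented kernel, and "seven
hours" is treated as exactly 7 * 3600 seconds.

```python
# Figures quoted in the message; the names here are hypothetical.
attempts = 25406   # times the rescue path was entered
successes = 0      # times it actually succeeded
hours = 7          # approximate length of the swapping test

seconds = hours * 3600
attempts_per_second = attempts / seconds
success_ratio = successes / attempts

print(f"{attempts_per_second:.3f} attempts/s, "
      f"success ratio {success_ratio:.1%}")
```

That comes out at a little over one attempt per second, with none of them
succeeding.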

Let's hope it failed quickly each time, I wasn't capturing that.

Hugh