linux-kernel - Re: [1/6] 2.6.21-rc4: known regressions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Pine.LNX.4.64.0703220813340.6730@woody.linux-foundation.org>
Date:	Thu, 22 Mar 2007 08:21:11 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Nick Piggin <nickpiggin@...oo.com.au>,
	Mingming Cao <cmm@...ibm.com>
cc:	Adrian Bunk <bunk@...sta.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Michal Piotrowski <michal.k.k.piotrowski@...il.com>,
	Mariusz Kozlowski <m.kozlowski@...land.pl>,
	Oliver Pinter <oliver.pntr@...il.com>,
	Sid Boyce <g3vbv@...eyonder.co.uk>,
	Nick Piggin <npiggin@...e.de>,
	Jens Axboe <jens.axboe@...cle.com>
Subject: Re: [1/6] 2.6.21-rc4: known regressions

On Thu, 22 Mar 2007, Nick Piggin wrote:
> 
> Nothing sleeps on PageUptodate, so I don't think that could explain it.

Good point. I forget that we just test "uptodate", but then always sleep 
on "locked".

> The fs: fix __block_write_full_page error case buffer submission patch
> does change the locking, but I'd be really suprised if that was the
> problem, because it changes locking to match the regular non-error path
> submission.

I'd agree, except something clearly has changed ;^)

> > Alternatively, maybe it really is an _io_ problem (and the buffer-head thing
> > is just a red herring, and it could happen to other IO, it's just that
> > metadata IO uses buffer heads), and it's the scheduler changes since
> > 2.6.20..
> 
> I see what you mean. Could it be an ext3 or jbd change I wonder?

jbd hasn't changed since 2.6.20, and the ext3 changes are mostly 
things like const'ness fixes. And others were things like changing 
"journal_current_handle()" into "ext3_journal_current_handle()", which 
looked exciting considering that the hung processes were waiting for the 
journal, but the fact is, that's just an inline function that just calls 
the old function, so..

But interestingly, there *is* a "EA block reference count racing fix" 
that does move a lock_buffer()/unlock_buffer() to cover a bigger area. It 
looks "obviously correct", but maybe there's a deadlock possibility there 
with ext3_forget() or something?

		Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/