linux-ext4 - Re: DIO process stuck apparently due to dioread

Message-ID: <4E48E0D0.3090005@msgid.tls.msk.ru>
Date:	Mon, 15 Aug 2011 13:03:12 +0400
From:	Michael Tokarev <mjt@....msk.ru>
To:	Tao Ma <tm@....ma>
CC:	linux-ext4@...r.kernel.org, sandeen@...hat.com,
	Jan Kara <jack@...e.cz>
Subject: Re: DIO process stuck apparently due to dioread_nolock (3.0)

15.08.2011 12:56, Michael Tokarev wrote:
> 15.08.2011 12:00, Michael Tokarev wrote:
> [....]
> 
> So, it looks like this (starting with cold cache):
> 
> 1. rename the redologs and copy them over - this will
>    make a hot copy of redologs
> 2. start up oracle - it will complain that the redologs aren't
>    redologs, the header is corrupt
> 3. shut down oracle, start it up again - it will succeed.
> 
> If you issue sync(1) between steps 1 and 2, everything will work.
> When shutting down, oracle calls fsync(), so that's like
> sync(1) again.
> 
> If some time passes between steps 1 and 2, everything
> will work too.
> 
> Without dioread_nolock I can't trigger the problem no matter
> how hard I try.
> 
> 
> A smaller test case.  I used redo1.odf file (one of the
> redologs) as a test file, any will work.
> 
>  $ cp -p redo1.odf temp
>  $ dd if=temp of=foo iflag=direct count=20
> 
> Now, the first 512 bytes of "foo" will contain all zeros, while
> the beginning of redo1.odf is _not_ zeros.
> 
> Again, without dioread_nolock it works as expected.
> 
> 
> And the most important note: without the patch there's no
> data corruption like that.  But instead, there is the
> lockup... ;)

Actually I can reproduce this data corruption without the
patch too, just not that easily.  The Oracle test case (copying
the redologs over) does that nicely.  So that's a
separate bug which was there before.
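
For anyone who wants to script the smaller test case quoted above, here is a
minimal sketch.  The cp and dd invocations are the ones from the message;
the helper names check_head and repro are made up for illustration, and
cmp -n assumes GNU (or BSD) cmp.  It only classifies the first 512 bytes of
the direct-I/O read; it does not try to explain why they differ.

```shell
#!/bin/sh
# check_head and repro are hypothetical helper names, not from the thread.

# Compare the first 512 bytes of an original file ($1) and a copy ($2).
# Prints "ok" if they match, "zeros" if the copy is non-empty and its
# first 512 bytes are all NUL, and "mismatch" otherwise.
check_head() {
    if cmp -s -n 512 "$1" "$2"; then
        echo ok
    elif [ -s "$2" ] &&
         [ "$(head -c 512 "$2" | tr -d '\000' | wc -c)" -eq 0 ]; then
        echo zeros
    else
        echo mismatch
    fi
}

# Copy the file, then immediately read the copy back with O_DIRECT
# (dd iflag=direct), as in the quoted test case.  Writes "temp" and "foo"
# into the current directory, which must be on a filesystem that
# supports O_DIRECT (here: ext4 mounted with dioread_nolock).
repro() {
    src=$1
    cp -p "$src" temp &&
    dd if=temp of=foo iflag=direct count=20 2>/dev/null &&
    check_head "$src" foo
}
```

Going by the description above, "repro redo1.odf" would be expected to
print "zeros" when the bug triggers and "ok" otherwise; running sync(1)
between the cp and the dd should make it print "ok".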

> Thank you,
> 
> /mjt

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html