Message-ID: <20080618031406.GA4326@brong.net>
Date:	Wed, 18 Jun 2008 13:14:06 +1000
From:	Bron Gondwana <brong@...tmail.fm>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Bron Gondwana <brong@...tmail.fm>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Nick Piggin <npiggin@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Rob Mueller <robm@...tmail.fm>,
	Andi Kleen <andi@...stfloor.org>, Ingo Molnar <mingo@...e.hu>
Subject: Re: BUG: mmapfile/writev spurious zero bytes (x86_64/not i386,
	bisected, reproducable)

On Tue, Jun 17, 2008 at 02:20:49PM -0700, Linus Torvalds wrote:
> On Tue, 17 Jun 2008, Linus Torvalds wrote:
> > 
> > Hmm. Something like this *may* salvage it.
> > 
> > Untested, so far (I'll reboot and test soon enough), but even if it fixes 
> > things, it's not really very good. 
> 
> Ok, so I just rebooted with this, and it does indeed fix the bug.
> 
> I'd be happier with a more complete fix (ie being byte-accurate and 
> actually doing the partial copy when it hits a fault in the middle), but 
> this seems to be the minimal fix, and at least fixes the totally bogus 
> return values from the x86-64 __copy_user*() functions.
> 
> Not that I checked that I got _all_ cases correct (and maybe there are 
> other versions of __copy_user that I missed entirely), but Bron's 
> test-case at least seems to work properly for me now.
> 
> Bron? If you have a more complete test-suite (ie the real-world case that 
> made you find this), it would be good to verify the whole thing. 

Ok - I pulled the latest linus-2.6 git, and discovered 
the patch was already in there, so I just built and
rebooted (git 952f4a0a9b27e6dbd5d32e330b3f609ebfa0b061).

Confirmed - fixed in both the test code and the cyr_dbtool 
test case I was using previously.  (I would have posted that 
instead, but building Cyrus is a bit of a pain.  You need 
bdb and sasl and all sorts of extraneous crap - and 
cyrusdb_skiplist.c depends on about half of Cyrus' 
infrastructure, so I couldn't just pull it out by itself.)

For my sins, I appear to be becoming the world expert on 
that particular file.  I've debugged skiplist bugs many 
times over, and completely rewritten the locking code.  
It really does some pretty evil things - the memory accesses
look something like this:

[file...................]
[mmap^....^.^........^^..................................]
[file...................++++++++++++]
[mmap^....^.^........^^.^^ ^      ^^.....................]

Where ^ marks the bytes that get accessed.  All reads are via
the mmap; all writes are done with retry_write or 
retry_writev (Cyrus library functions that keep hammering
until all the bytes are written).
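
For illustration, a "keep hammering" write loop of the kind described
above might look like the sketch below.  This is not the Cyrus source:
write_all is a hypothetical name, and the real retry_write may differ
in its error handling.  The point is only that such a loop trusts the
byte count the kernel returns in order to decide where to resume.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, NOT the Cyrus retry_write implementation:
 * keep calling write(2) until all len bytes have been written or a
 * real error occurs.  Returns len on success, -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: just try again */
            return -1;          /* real error, errno set by write */
        }
        done += (size_t)n;      /* short write: resume where it stopped */
    }
    return (ssize_t)len;
}
```

If the kernel reports more bytes copied than it really copied, a loop
like this simply moves on and never rewrites the missing range.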

I was suspecting as early as Friday night (we've been 
debugging this one for a few days now!) that it was 
page-boundary related, because the bug only seemed to be
appearing on seen databases with really long seen lists
(they're in ranged integer format like
1:5,7:9,12,14:22,24:...).  

It didn't help that at first we were only finding out about
cases where the corruption hit exactly on the "navigational
components", hence breaking the skiplist logic.  And then 
the backpointer writes would scribble all over the corrupt 
area as well, so that made it even stranger to debug!


OK - so I'll report this issue to the Cyrus mailing list,
and warn people not to run x86_64 kernels from 2.6.23
through 2.6.25.7.  At least not without the skanky little
patch that I'm planning to post:

/* read every mapped byte, so the pages are touched before the write */
int magic = 0;
for (i = 0; i < maplen; i++) magic ^= mapbase[i];

Since I've tested that as a viable workaround!
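
As a self-contained sketch of that workaround (assuming it works by
faulting the mapped pages in before they are handed to writev, which
is an inference from the discussion above rather than something
verified here), the loop can be wrapped in a function.  The name
touch_mapping is hypothetical; mapbase and maplen are as in the
snippet above.

```c
#include <stddef.h>

/* Hypothetical wrapper around the workaround loop: read every byte of
 * [mapbase, mapbase + maplen) so that each page of the mapping is
 * touched.  The XOR of all bytes is returned so the caller can use
 * the result, which discourages the compiler from dropping the loop. */
int touch_mapping(const char *mapbase, size_t maplen)
{
    int magic = 0;
    size_t i;

    for (i = 0; i < maplen; i++)
        magic ^= mapbase[i];
    return magic;
}
```

A caller would run this over the mapping just before the write and
keep the return value somewhere visible.  It costs a full pass over
the map each time, so it is a stopgap and not a fix.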

Bron.
