Message-ID: <3ae3aa420808021501k2e871dc0y344dd7f9a7b80614@mail.gmail.com>
Date:	Sat, 2 Aug 2008 17:01:46 -0500
From:	"Linas Vepstas" <linasvepstas@...il.com>
To:	"John Stoffel" <john@...ffel.org>
Cc:	"Alistair John Strachan" <alistair@...zero.co.uk>,
	linux-kernel@...r.kernel.org
Subject: Re: amd64 sata_nv (massive) memory corruption

2008/8/2 John Stoffel <john@...ffel.org>:
>>>>>> "Linas" == Linas Vepstas <linasvepstas@...il.com> writes:
>
> Linas> 2008/8/1 Alistair John Strachan <alistair@...zero.co.uk>:
>>> On Friday 01 August 2008 18:30:34 Linas Vepstas wrote:
>>>> Hi,
>>>>
>>>> I'm seeing strong, easily reproducible (and silent) corruption on a
>>>> sata-attached disk drive on an amd64 board.  It might be the disk
>>>> itself, but I doubt it; googling suggests that it's somehow
>>>> iommu-related but I cannot confirm this.
>
> Can you post the output of dmesg after a boot, so we can see which
> driver is being used?  I assume the new Libata stuff, but maybe you
> can also turn on debugging in there as well.  Stuff like SCSI_DEBUG
> (in the SCSI menus) might show us more details here.
>
> Also, have you tried a new SATA cable by any chance?  That's obviously
> the cheaper path than getting a new disk...

I took the problematic hard drive (and its cable) to another computer
with sata ports on it,  and ran my file-copy/compare/fsck tests there,
and saw no problems; so the drive itself and its cable get a clean bill
of health.
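
For anyone who wants to repeat this kind of check, a copy/compare pass
can be sketched roughly as below. This is only an illustrative sketch,
not the exact test used here; the helper names sha256_of and
copy_and_verify are made up for the example. Note that a read done
right after the copy may be served from the page cache rather than the
disk, so dropping caches or using files larger than RAM is advisable.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, ask the OS to flush dst, then compare digests.

    Hypothetical helper: returns True when source and copy hash the same.
    """
    shutil.copyfile(src, dst)
    with open(dst, "r+b") as f:
        os.fsync(f.fileno())  # push the copy out of write-back buffers
    return sha256_of(src) == sha256_of(dst)
```

Running copy_and_verify in a loop over many large files, and logging
every mismatch, gives a rough picture of how often silent corruption
shows up.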

Then, rather stupidly, I flashed the latest BIOS for the motherboard
and now have a dead motherboard (it hangs on its way through BIOS,
well before the bootloader.)  So I'm off to buy a new mobo today.

I'll send the dmesg from the older boots later today, if all goes well.
I'm pretty sure I had the new libata on, and the old off -- but it's
possible that the .config somehow managed to pull in parts of the
old libata code anyway. I say this because, besides the SATA, the
blown motherboard had an IDE connector in use, and I also had
another PCI IDE card plugged in and in use. I'm imagining that
perhaps the PCI IDE .config might have pulled in old code, maybe
via a header file, and thus mangled some lock that the sata side
was using. Just a wild guess.  Most people on this mobo hadn't
seen problems, and unlike most people, I had the PCI IDE card
in it.
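
One way to check whether a .config enables both the legacy IDE
subsystem and libata at once is sketched below. It assumes the
relevant symbols are CONFIG_IDE (old IDE layer) and CONFIG_ATA
(libata); the function names are hypothetical, and this only reports
what is enabled, it says nothing about whether the two actually
conflict.

```python
def read_kconfig(text):
    """Parse kernel .config text into {symbol: value}; unset symbols map to 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # Lines like "# CONFIG_FOO is not set" mark a disabled option.
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def both_stacks_enabled(options):
    """True if CONFIG_IDE and CONFIG_ATA are both built in (y) or modular (m)."""
    def enabled(name):
        return options.get(name, "n") in ("y", "m")
    return enabled("CONFIG_IDE") and enabled("CONFIG_ATA")
```

Feeding it the contents of the .config used for the failing kernel,
e.g. both_stacks_enabled(read_kconfig(open(".config").read())), would
show whether both stacks were built.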

--linas