Message-Id: <1157950410.31071.402.camel@localhost.localdomain>
Date:	Mon, 11 Sep 2006 14:53:30 +1000
From:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
To:	Michael Chan <mchan@...adcom.com>
Cc:	Segher Boessenkool <segher@...nel.crashing.org>,
	netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
	Linux Kernel list <linux-kernel@...r.kernel.org>
Subject: Re: TG3 data corruption (TSO ?)


> Oh, we know about this.  The powerpc writel() used to have memory
> barriers in 2.4 kernels but not any more in 2.6 kernels.  Red Hat's
> version of tg3 has extra wmb()'s to fix this problem.  David doesn't
> think that the upstream version of tg3 should have these wmb()'s, and
> the problem should instead be fixed in powerpc's writel().

I've added a wmb() in tw32_rx_mbox() and tw32_tx_mbox() and can still
reproduce the problem. I've also done a two-day run with TSO disabled
without a single failure (my test program normally fails after a couple
of minutes).

Thus, do you see any other code path in the driver where
synchronisation might be missing? Is there any case where the chip
might use data in memory before it has been told to do so with a
mailbox write? There are no "OWN" bits that I can see in the
descriptors, so I doubt it would use a half-built transmit descriptor
before the store to the mailbox allows it, but who knows...
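
To make the ordering in question concrete, here is a minimal userspace
sketch of "fill the descriptor, barrier, then write the mailbox". It is
not tg3 code: the descriptor layout, the ring size, tx_post() and the
"mailbox" field are all hypothetical, and a C11 release fence stands in
for the kernel's wmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u                /* hypothetical ring size, power of two */

struct tx_desc {                    /* hypothetical descriptor layout */
    uint64_t addr;
    uint32_t len;
    uint32_t flags;
};

struct tx_ring {
    struct tx_desc desc[RING_SIZE];
    _Atomic uint32_t mailbox;       /* stands in for the producer mailbox */
    uint32_t prod;                  /* driver's private producer index */
};

/* Fill one descriptor completely, then publish it via the mailbox. */
static void tx_post(struct tx_ring *r, uint64_t addr, uint32_t len)
{
    struct tx_desc *d = &r->desc[r->prod % RING_SIZE];

    assert(len > 0);                /* an empty buffer is a caller bug */
    d->addr = addr;
    d->len = len;
    d->flags = 1u;                  /* hypothetical "valid" flag */
    r->prod++;

    /* Plays the role of wmb(): every descriptor store above is ordered
     * before the mailbox store below. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&r->mailbox, r->prod, memory_order_relaxed);
}
```

With no OWN bit in the descriptor, the mailbox store is the only
hand-off point, so in this model the barrier immediately before it is
the only place where ordering matters.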

That raises the question of whether there is an unrelated bug in the
driver. Segher thinks we might be overwriting "live" descriptors, though
I haven't seen how yet. It seems to be TSO-specific though... maybe some
missing SMP synchronisation in the driver itself, or a problem when the
TX ring is full?
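
For reference, the ring-full case can be sketched as plain index
accounting. This is a generic illustration, not what tg3 actually does:
tx_state, tx_free_slots(), tx_reserve() and tx_complete() are
hypothetical names. The point is that a multi-descriptor packet (as TSO
packets typically are) has to reserve all of its slots up front, or it
can run over descriptors the chip has not finished with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TX_RING_SIZE 8u     /* hypothetical; must be a power of two */

struct tx_state {
    uint32_t prod;          /* next slot the driver will fill */
    uint32_t cons;          /* next slot the hardware has yet to complete */
};

/* Free slots; unsigned subtraction stays correct across wraparound. */
static uint32_t tx_free_slots(const struct tx_state *s)
{
    return TX_RING_SIZE - (s->prod - s->cons);
}

/* Reserve every descriptor a packet needs, or none at all. */
static bool tx_reserve(struct tx_state *s, uint32_t ndesc)
{
    if (ndesc > tx_free_slots(s))
        return false;       /* would overwrite live descriptors */
    s->prod += ndesc;
    return true;
}

/* Called when the hardware reports n descriptors as completed. */
static void tx_complete(struct tx_state *s, uint32_t n)
{
    assert(n <= s->prod - s->cons);     /* cannot complete more than posted */
    s->cons += n;
}
```

On SMP, prod and cons would also need a lock or barriers between the
transmit and completion paths; that part is deliberately left out of
this single-threaded sketch.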

I don't have the chip docs and I'm not familiar with the driver, so I'll
keep looking, but advice is welcome. I'll also see if I can reproduce
this with some other TSO-capable card, in case the problem is in the
kernel TSO code and not in the driver.

Cheers,
Ben.

