netdev - Re: [Bugme-new] [Bug 38102] New: BUG kmalloc-2048: Poison overwritten

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20110706181258.57b3c112@maggie>
Date:	Wed, 6 Jul 2011 18:12:58 +0200
From:	Michael Büsch <m@...s.ch>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Neil Horman <nhorman@...driver.com>,
	Alexey Zaytsev <alexey.zaytsev@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	netdev@...r.kernel.org, Gary Zambrano <zambrano@...adcom.com>,
	bugme-daemon@...zilla.kernel.org,
	"David S. Miller" <davem@...emloft.net>,
	Pekka Pietikainen <pp@...oulu.fi>,
	Florian Schirmer <jolt@...box.org>,
	Felix Fietkau <nbd@...nwrt.org>, Michael Buesch <mb@...sch.de>
Subject: Re: [Bugme-new] [Bug 38102] New: BUG kmalloc-2048: Poison
 overwritten

On Wed, 06 Jul 2011 18:00:19 +0200
Eric Dumazet <eric.dumazet@...il.com> wrote:
> Not exactly :
> 
> If we read one skb at descriptor 0, we prepare a new buffer on slot 200,
> and advance DMARX_PTR to 201*sizeof(descriptor).

I don't think so. Why do you think this is the case?
We allocate a new descriptor buffer for the consumed buffer at exactly
the same place (which is "cons").
(Alternatively, we leave the buffer in place, and just copy the data to a new buffer).
And DMARX_PTR is updated to the last "cons", which is one beyond
the last buffer that we consumed (and pushed up the net stack).

(Note that "beyond" always means "beyond" in the sense of a DMA _ring_,
which honors the EOT bit. This is done by masking cons with RX_RING_SIZE-1,
which is thus assumed to be a power of two).

> > I don't know why b44 sets the DMARX_PTR to 200 initially (which is 40
> > descriptors, as this is a byte pointer). This seems kind of arbitrary.
> > In b43 we set it to (NR_OF_DESCRIPTORS - 1) * sizeof(descriptor).
> > But I don't think it really matters. It only limits the usable DMA
> > area before the first interrupt (or poll) occurs. After the final
> > B44_DMARX_PTR write in b44_rx(), the full descriptor range (well, minus one)
> > will be usable.
> 
> Yes, this is probably a small bug, we should fix it for correctness.

I wouldn't call this a bug.

> > Summary: I don't see where the DMA engine code is broken (except for
> > the minor missing wmb(), which doesn't trigger this memory corruption, though)
> > 
> > I hope that helps to clear up stuff...
> 
> Well, you describe (nicely, thanks !) your understanding of how work the
> driver and chip.
> 
> Problem is we suspect a wrong statement or wrong hardware ;)
>
> Another problem is Alexey doesnt answer anymore, and I dont have this
> (old) hardware...

I do really think that either
1) His device is broken. Chips break. I already saw several devices
with HND DMA engine that broke down after some time of use.
or
2) The bug is at another place, but not in the lowlevel DMA handling.

Could this possibly be some kind of context issue? threaded IRQs?
The net subsys was rather picky about context, last time I looked
into it. see the .._ni() style functions.

> Other point : Do you know why b44_get_ringparam() doesnt  set
> ering->tx_max_pending and ering->tx_pending
> 
> The comment seems wrong :
> /* XXX ethtool lacks a tx_max_pending, oops... */

Well, I _guess_ that ering->tx_max_pending was added later? (I didn't
even check if it's there _now_, though.)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html