Date:	Sat, 27 Jul 2013 01:24:30 +0100
From:	Ben Hutchings <bhutchings@...arflare.com>
To:	Neil Horman <nhorman@...driver.com>
CC:	<netdev@...r.kernel.org>, Jay Cliburn <jcliburn@...il.com>,
	"David S. Miller" <davem@...emloft.net>, <stable@...r.kernel.org>
Subject: Re: [net PATCH] atl1c: Fix misuse of netdev_alloc_skb in refilling
 rx ring

On Sat, 2013-07-27 at 01:02 +0100, Ben Hutchings wrote:
> On Fri, 2013-07-26 at 12:47 -0400, Neil Horman wrote:
> > atl1c uses netdev_alloc_skb to refill its rx dma ring, but that call makes no
> > guarantees about the suitability of the memory for use in DMA.  As a result
> > we've gotten reports of atl1c drivers occasionally hanging and needing to be
> > reset:
> > https://bugzilla.kernel.org/show_bug.cgi?id=54021
> > 
> > Fix this by modifying the call to use the internal version __netdev_alloc_skb,
> > where you can set the gfp_mask explicitly to include GFP_DMA.
> 
> This is a really bad idea.  GFP_DMA means allocation from the ISA DMA
> region (< 16 MB).  pci_map_single() takes care of allocating a bounce
> buffer if necessary.
> 
> Ben.
> 
> > Tested by two reporters in the above bug, who have the hardware to validate it.
> > Both report immediate cessation of the problem with this patch
[...]

So perhaps the chip somehow fails to support a full 32-bit address
(which is the current DMA mask), though given that there are 64 address
bits in RX descriptors this seems unlikely.  And the most likely result
of that would be memory corruption, not a stall.

Alternatively, and perhaps more likely, there's something wrong with the
driver's error handling.  If atl1c_alloc_rx_buffer() fails then the RX
queue could run dry.  Depending on how the hardware is designed, that
could result in a complete RX stall (no RX buffers available => no RX
completions => no attempt to allocate more RX buffers).

Maybe your change makes it less likely for atl1c_alloc_rx_buffer() to
fail.  On a modern PC the (ISA) DMA zone is basically unused whereas
bounce buffers might be more contended.  Did you try adding some logging
for failure of pci_map_single()?
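
To make the suspected stall concrete, here is a minimal userspace sketch
of the failure mode, not driver code.  It assumes a design in which the
ring is only refilled from the RX completion path.  All names in it
(rx_ring, try_map_buffer, refill, receive_packet, RING_SIZE) are
hypothetical, and try_map_buffer() merely stands in for buffer
allocation plus DMA mapping.  The failure counter and the message on
stderr illustrate the kind of logging asked about above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define RING_SIZE 4  /* hypothetical ring size, chosen for illustration */

struct rx_ring {
    int posted;                 /* buffers currently available to the "hardware" */
    unsigned long map_failures; /* failed refill attempts, for logging */
};

/* Hypothetical stand-in for skb allocation plus DMA mapping. */
static bool try_map_buffer(bool fail)
{
    return !fail;
}

/* Refill until the ring is full or the first failure; returns buffers added. */
static int refill(struct rx_ring *r, bool fail)
{
    int added = 0;

    while (r->posted < RING_SIZE) {
        if (!try_map_buffer(fail)) {
            r->map_failures++;
            fprintf(stderr, "rx refill: mapping failed (%lu so far)\n",
                    r->map_failures);
            break;
        }
        r->posted++;
        added++;
    }
    return added;
}

/*
 * Model one incoming packet.  Returns true if it was delivered.
 * Refill is attempted only after a completion, which is the assumption
 * that makes a stall possible.
 */
static bool receive_packet(struct rx_ring *r, bool fail)
{
    if (r->posted == 0)
        return false; /* no buffer => no completion => no refill attempt */

    r->posted--;
    refill(r, fail);
    assert(r->posted >= 0 && r->posted <= RING_SIZE);
    return true;
}
```

In this model, once every refill attempt has failed for RING_SIZE
consecutive packets, posted reaches zero and receive_packet() never
calls refill() again, even if mapping would now succeed.  A real driver
could avoid that by retrying the refill from a timer or work item, but
whether atl1c needs this depends on how the hardware behaves.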

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
