lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1428865466.25985.366.camel@edumazet-glaptop2.roam.corp.google.com>
Date:	Sun, 12 Apr 2015 12:04:26 -0700
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Felix Fietkau <nbd@...nwrt.org>
Cc:	netdev@...r.kernel.org, zajec5@...il.com, hauke@...ke-m.de
Subject: Re: [PATCH v2 4/4] bgmac: fix DMA rx corruption

On Sun, 2015-04-12 at 20:46 +0200, Felix Fietkau wrote:
> On 2015-04-12 20:09, Eric Dumazet wrote:

> > 
> > Given the fact that previous driver kind of worked, have you tried not
> > doing the modulo at all here ?
> The previous driver was causing reproducible DMA corruption issues under
> load. My patch fixes this, and I've verified this with hours of stress
> testing (previously I could reproduce the issue in minutes).
> 
> > ring->end = desc_idx + 1;
> > 
> > It looks like the hardware accepted setting BGMAC_DMA_RX_INDEX to
> > ring->index_base + ring->num_slots * sizeof(struct bgmac_dma_desc)
> The hardware accepts it, but with that value, it considers all rx slot
> entries filled, there's no CPU ownership bit in the descriptor.
> According to documentation, that register indicates the first *invalid*
> rx descriptor (i.e. one where the CPU hasn't posted a valid buffer yet).
> 
> > So apparently the presence of BGMAC_DESC_CTL0_EOT in last rx desc was
> > enough to avoid any problem with this.
> EOT serves a completely different purpose. It simply indicates the end
> of the ring, telling the hardware to go back to the beginning for the
> next descriptor.

Right, but after your patch, we program the hardware with end == start
== 0

So I do not really understand how the hardware can still deliver first
frame.

If it was really taking care of 'end' value, then it should not even
deliver one frame.

The dma corruption happened because hardware was simply reusing frames,
that driver already took to upper stack, since as you said, there is no
ownership bit.

You might try your patch with BGMAC_RX_RING_SLOTS = 3 and check if it is
still working properly.

I find BGMAC_RX_RING_SLOTS==511 very suspect, and a comment of why this
magic value is not 512 would be nice.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ