Date:	Mon, 3 Mar 2014 07:24:40 +0000
From:	Sathya Perla <Sathya.Perla@...lex.Com>
To:	Ben Hutchings <ben@...adent.org.uk>
CC:	"jiang.biao2@....com.cn" <jiang.biao2@....com.cn>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	Subramanian Seetharaman <subbu.seetharaman@...lex.com>,
	Ajit Khaparde <Ajit.Khaparde@...lex.Com>,
	"wang.liang82@....com.cn" <wang.liang82@....com.cn>,
	"cai.qu@....com.cn" <cai.qu@....com.cn>,
	"li.fengmao@....com.cn" <li.fengmao@....com.cn>,
	"long.chun@....com.cn" <long.chun@....com.cn>,
	David Miller <davem@...emloft.net>
Subject: RE: [PATCH] be2net: Bugfix for packet drop with kernel param
 swiotlb=force

> -----Original Message-----
> From: Ben Hutchings [mailto:ben@...adent.org.uk]
> 
> On Wed, 2014-02-26 at 04:54 +0000, Sathya Perla wrote:
> > > -----Original Message-----
> > > From: Ben Hutchings [mailto:ben@...adent.org.uk]
> > >
> > ...
> > > > > > >
> > > > > > > From: Li Fengmao <li.fengmao@....com.cn>
> > > > > > >
> > > > > > > > Packets are dropped with the kernel parameter "swiotlb=force" on
> > > > > > > > an Emulex 10Gb NIC using the be2net driver. The problem is caused by
> > > > > > > > receiving an skb without calling pci_unmap_page() in get_rx_page_info().
> > > > > > > > rx_page_info->last_page_user is initialized to false in
> > > > > > > > be_post_rx_frags() when the current frag is mapped in the first half
> > > > > > > > of a page shared with another frag. But in that case, with the
> > > > > > > > "swiotlb=force" param, data cannot be copied into the page of
> > > > > > > > rx_page_info without calling pci_unmap_page(), so the data of the frag
> > > > > > > > mapped in the first half of the page will be dropped.
> > > > > > > >
> > > > > > > > It can be solved by creating a one-to-one mapping between frag
> > > > > > > > and page, and deleting rx_page_info->last_page_user to ensure that
> > > > > > > > pci_unmap_page() is called when handling each received frag.
> > > > > >
> > > > > > This patch uses an entire page for each RX frag (whose default size is 2048).
> > > > > > Consequently, on platforms like ppc64 where the default PAGE_SIZE is 64K,
> > > > > > memory usage becomes very inefficient.
> > > > > >
> > > > > > Instead, I've tried a partial-page mapping scheme. This retains the
> > > > > > page sharing logic, but un-maps each frag separately so that
> > > > > > the data is copied from the bounce buffers.
> > > > > [...]
> > > > >
> > > > > You don't need to map/unmap each fragment separately; you can sync a
> > > > > sub-page range with dma_sync_single_for_cpu().
> > > > >
> > > >
> > > > Ben, after syncing each frag with a dma_sync_single_for_cpu() call,
> > > > would I still need to dma_unmap_page() after DMA is done on all the frags
> > > > of the page? I thought I needed to.
> > >
> > > Yes.
> > >
> > > > I'm confused to see that swiotlb_bounce() (that copies data from the bounce buffers)
> > > > is called from both swiotlb_sync_single_for_cpu() and swiotlb_unmap_page().
> > > > Of course, I don't want the data to be copied from the bounce buffers twice!
> > >
> > > Right, unmap includes a sync so you should:
> > > - for the last frag per page, dma_unmap_page() only
> > > - for every other frag, dma_sync_single_for_cpu() only
> >
> > Ben, dma_unmap_page() requires us to pass the same dma_addr and size returned by
> > (and passed to) the dma_map_page() call.
> > So, for the last frag, when I call dma_unmap_page() with the page_addr (and not the
> > frag_addr) and the page size, won't it try to copy the data belonging to all the
> > previous frags of the page again?
> 
> When using bounce buffers (e.g. swiotlb), yes this results in another
> copy, which is inefficient.  But when using a real IOMMU neither
> function will copy.  I would optimise for IOMMUs.

OK, that sounds right. Thanks for the help!
-Sathya
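
For reference, the scheme discussed above (sync every frag except the last,
unmap the whole page on the last frag) can be modelled in user space. This is
only a sketch and not be2net code: struct bounce_map and the model_*()
functions are hypothetical stand-ins for a swiotlb bounce buffer,
dma_sync_single_for_cpu() and dma_unmap_page(), and PAGE_SZ is an assumed
4K page. It only counts how many bytes would be bounced back to the CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096u /* assumed page size for this model */
#define FRAG_SZ 2048u /* default RX frag size mentioned in the thread */

/* Hypothetical model of one bounce-buffered page mapping (not a kernel type). */
struct bounce_map {
    unsigned char dev_buf[PAGE_SZ];  /* bounce buffer the device writes into */
    unsigned char cpu_page[PAGE_SZ]; /* page the driver passes up the stack */
    size_t bytes_copied;             /* total bytes bounced back to the CPU */
};

/* Stand-in for dma_sync_single_for_cpu() on a sub-page range. */
static void model_sync_for_cpu(struct bounce_map *m, size_t off, size_t len)
{
    assert(off + len <= PAGE_SZ);
    memcpy(m->cpu_page + off, m->dev_buf + off, len);
    m->bytes_copied += len;
}

/* Stand-in for dma_unmap_page(): it covers the whole original mapping,
 * so with bounce buffers the entire page is copied again. */
static void model_unmap_page(struct bounce_map *m)
{
    model_sync_for_cpu(m, 0, PAGE_SZ);
}

/* Receive every frag of one page: sync all frags but the last,
 * unmap (which includes a sync) on the last one. */
static size_t model_rx_page(struct bounce_map *m)
{
    size_t nfrags = PAGE_SZ / FRAG_SZ;

    for (size_t i = 0; i < nfrags; i++) {
        if (i + 1 < nfrags)
            model_sync_for_cpu(m, i * FRAG_SZ, FRAG_SZ);
        else
            model_unmap_page(m);
    }
    return m->bytes_copied;
}
```

In this model a 4K page with two 2048-byte frags bounces 2048 + 4096 bytes,
i.e. the first frag is copied twice. That is the extra copy accepted above for
the swiotlb case; with a real IOMMU neither call copies anything.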
