netdev - Re: [Intel-wired-lan] [PATCH v2 1/1] ixgbe: sync the first fragment unconditionally

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAKgT0UfEh8cvTht3yceyXqwReJOQkcpJV8j0vHSJwookTWhn_Q@mail.gmail.com>
Date:   Wed, 7 Aug 2019 09:06:12 -0700
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     Maciej Fijalkowski <maciejromanfijalkowski@...il.com>
Cc:     Firo Yang <firo.yang@...e.com>, Netdev <netdev@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        intel-wired-lan <intel-wired-lan@...ts.osuosl.org>,
        Jacob Wen <jian.w.wen@...cle.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        Alexander Duyck <alexander.h.duyck@...ux.intel.com>
Subject: Re: [Intel-wired-lan] [PATCH v2 1/1] ixgbe: sync the first fragment unconditionally

On Wed, Aug 7, 2019 at 7:09 AM Maciej Fijalkowski
<maciejromanfijalkowski@...il.com> wrote:
>
> On Wed, 7 Aug 2019 08:38:43 +0000
> Firo Yang <firo.yang@...e.com> wrote:
>
> > The 08/07/2019 15:56, Jacob Wen wrote:
> > > I think the description is not correct. Consider using something like below.
> > Thank you for comments.
> >
> > >
> > > In Xen environment, due to memory fragmentation ixgbe may allocate a 'DMA'
> > > buffer with pages that are not physically contiguous.
> > Actually, I didn't look into the reason why ixgbe got a DMA buffer which
> > was mapped to Xen-swiotlb area.
>
> I think that neither of these descriptions are telling us what was truly
> broken. They lack what Alexander wrote on v1 thread of this patch.
>
> ixgbe_dma_sync_frag is called only on case when the current descriptor has EOP
> bit set, skb was already allocated and you'll be adding a current buffer as a
> frag. DMA unmapping for the first frag was intentionally skipped and we will be
> unmapping it here, in ixgbe_dma_sync_frag. As Alex said, we're using the
> DMA_ATTR_SKIP_CPU_SYNC attribute which obliges us to perform a sync manually
> and it was missing.
>
> So IMHO the commit description should include descriptions from both xen and
> ixgbe worlds, the v2 lacks info about ixgbe case.
>
> BTW Alex, what was the initial reason for holding off with unmapping the first
> buffer? Asking because IIRC the i40e and ice behave a bit different here. We
> don't look there for EOP at all when building/constructing skb and not delaying
> the unmap of non-eop buffers.
>
> Thanks,
> Maciej

The reason why we have to hold off on unmapping the first buffer is
because in the case of Receive Side Coalescing (RSC), also known as Large
Receive Offload (LRO), the header of the packet is updated for each
additional frame that is added. As such you can end up with the device
writing data, header, data, header, data, header where each data write
would update a new descriptor, but the header will only ever update the
first descriptor in the chain. As such if we unmapped it earlier it would
result in an IOMMU fault because the device would be rewriting the header
after it had been unmapped.

The devices supported by the ixgbe driver are the only ones that have
RSC/LRO support. As such this behavior is present for ixgbe, but not for
other Intel NIC drivers including igb, igbvf, ixgbevf, i40e, etc.

- Alex