Message-ID: <20200727171642.l6pmwcsxhskws3gv@bsd-mbp.dhcp.thefacebook.com>
Date:   Mon, 27 Jul 2020 10:16:42 -0700
From:   Jonathan Lemon <jonathan.lemon@...il.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     netdev <netdev@...r.kernel.org>, kernel-team <kernel-team@...com>,
        Christoph Hellwig <hch@....de>,
        Robin Murphy <robin.murphy@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        David Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Willem de Bruijn <willemb@...gle.com>,
        Steffen Klassert <steffen.klassert@...unet.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        Maxim Mikityanskiy <maximmi@...lanox.com>,
        bjorn.topel@...el.com, magnus.karlsson@...el.com,
        borisp@...lanox.com, david@...hat.com
Subject: Re: [RFC PATCH v2 08/21] skbuff: add a zc_netgpu bitflag

On Mon, Jul 27, 2020 at 10:08:10AM -0700, Eric Dumazet wrote:
> On Mon, Jul 27, 2020 at 10:01 AM Jonathan Lemon
> <jonathan.lemon@...il.com> wrote:
> >
> > On Mon, Jul 27, 2020 at 08:24:55AM -0700, Eric Dumazet wrote:
> > > On Mon, Jul 27, 2020 at 12:20 AM Jonathan Lemon
> > > <jonathan.lemon@...il.com> wrote:
> > > >
> > > > This could likely be moved elsewhere.  The presence of the flag on
> > > > the skb indicates that one of the fragments may contain zerocopy
> > > > RX data, where the data is not accessible to the cpu.
> > >
> > > Why do we need yet another flag in skb exactly ?
> > >
> > > Please define what means "data not accessible to the cpu" ?
> > >
> > > This kind of change is a red flag for me.
> >
> > The architecture this is targeting is a ML cluster, where a 200Gbps NIC
> > is attached to a PCIe switch which also has a GPU card attached.  There
> > are several of these, and the link(s) to the host cpu (which has another
> > NIC attached) can't handle the incoming traffic.
> >
> > So what we're doing here is transferring the data directly from the NIC
> > to the GPU via DMA.  The host never sees the data, but can control it
> > indirectly via the handles returned to userspace.
> >
> 
> This seems to need a page/memory attribute or something.

'struct page' is even more constrained.  I opted to flag the skb since
there should not be mixed zc/non-zc pages in the same skb.  I'd be happy
to see other alternatives.


> skb should not have this knowledge, unless you are planning to make
> sure that everything accessing skb data is going to test this new flag
> and fail if it is set ?

That might be how things end up going, in the same way that skb_zcopy()
works.
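
As a rough illustration of the "test the flag and fail if it is set"
idea discussed above, here is a minimal userspace sketch. It is not
kernel code and is not taken from the patch series: struct mock_skb,
mock_skb_zc_netgpu() and mock_skb_copy_bits() are hypothetical names
made up for this example, and the choice of -EFAULT as the error is
an assumption. Only the zc_netgpu flag name comes from the patch
subject.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an skb; NOT the kernel's struct sk_buff. */
struct mock_skb {
	const unsigned char *data;  /* payload, valid only if CPU-accessible */
	size_t len;                 /* payload length in bytes */
	unsigned int zc_netgpu:1;   /* fragments may hold device-only RX data */
};

/* Predicate helper: true when the data may not be readable by the CPU. */
static inline bool mock_skb_zc_netgpu(const struct mock_skb *skb)
{
	return skb && skb->zc_netgpu;
}

/*
 * Example data accessor: every path that touches the payload tests the
 * flag first and refuses to proceed when it is set.
 */
static int mock_skb_copy_bits(const struct mock_skb *skb, void *dst, size_t len)
{
	assert(dst != NULL);

	if (!skb || len > skb->len)
		return -EINVAL;
	if (mock_skb_zc_netgpu(skb))
		return -EFAULT;     /* assumed error code for this sketch */

	memcpy(dst, skb->data, len);
	return 0;
}
```

The point of the sketch is only that the check has to live in every
accessor; a caller that bypasses the helper and dereferences the data
pointer directly gets no protection from the flag.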
--
Jonathan
