Message-ID: <20180801170949.5bf6101e@redhat.com>
Date: Wed, 1 Aug 2018 17:09:49 +0200
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: Toshiaki Makita <makita.toshiaki@....ntt.co.jp>,
Alexei Starovoitov <ast@...nel.org>
Cc: Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org,
Jakub Kicinski <jakub.kicinski@...ronome.com>,
John Fastabend <john.fastabend@...il.com>,
"Karlsson, Magnus" <magnus.karlsson@...el.com>,
Björn Töpel <bjorn.topel@...el.com>,
brouer@...hat.com
Subject: Re: [PATCH v6 bpf-next 4/9] veth: Handle xdp_frames in xdp napi ring

On Wed, 1 Aug 2018 14:41:08 +0900
Toshiaki Makita <makita.toshiaki@....ntt.co.jp> wrote:
> On 2018/07/31 21:46, Jesper Dangaard Brouer wrote:
> > On Tue, 31 Jul 2018 19:40:08 +0900
> > Toshiaki Makita <makita.toshiaki@....ntt.co.jp> wrote:
> >
> >> On 2018/07/31 19:26, Jesper Dangaard Brouer wrote:
> >>>
> >>> Context needed from: [PATCH v6 bpf-next 2/9] veth: Add driver XDP
> >>>
> >>> On Mon, 30 Jul 2018 19:43:44 +0900
> >>> Toshiaki Makita <makita.toshiaki@....ntt.co.jp> wrote:
> >>>
[...]
> >>>
> >>> Here you are adding an assumption that struct xdp_frame is always
> >>> located in-the-top of the packet-data area. I tried hard not to add
> >>> such a dependency! You can calculate the beginning of the frame from
> >>> the xdp_frame->data pointer.
> >>>
> >>> Why not add such a dependency? Because for AF_XDP zero-copy, we cannot
> >>> make such an assumption.
> >>>
> >>> Currently, when an RX-queue is in AF-XDP-ZC mode (MEM_TYPE_ZERO_COPY)
> >>> the packet will get dropped when calling convert_to_xdp_frame(), but as
> >>> the TODO comment indicated in convert_to_xdp_frame() this is not the
> >>> end-goal.
> >>>
> >>> The comment in convert_to_xdp_frame() indicates we need a full
> >>> alloc+copy, but that is actually not necessary if we can just use
> >>> another memory area for struct xdp_frame, plus a pointer to the data.
> >>> That would allow devmap-redir to work zero-copy and let cpumap-redir
> >>> do the copy on the remote CPU.
> >>
> >> Thanks for pointing this out.
> >> Seems you are saying xdp_frame area is not reusable. That means we
> >> reduce usable headroom on every REDIRECT. I wanted to avoid this but
> >> actually it is impossible, right?
> >
> > I'm not sure I understand fully... has this something to do, with the
> > below memset?
>
> Sorry for not being so clear...
> It has something to do with the memset as well but mainly I was talking
> about XDP_TX and REDIRECT introduced in patch 8. On REDIRECT,
> dev_map_enqueue() calls convert_to_xdp_frame() so we use the headroom
> for struct xdp_frame on REDIRECT. If we don't reuse xdp_frame region of
> the original xdp packet, we reduce the headroom size each time on
> REDIRECT. When ZC is used, in the future xdp_frame can be non-contiguous
> to the buffer, so we cannot reuse the xdp_frame region in
> convert_to_xdp_frame()? But current convert_to_xdp_frame()
> implementation requires xdp_frame region in headroom so I think I cannot
> avoid this dependency now.
>
> SKB has a similar problem if we cannot reuse it. It can be passed to a
> bridge and redirected to another veth which has driver XDP. In that case
> we need to reallocate the page if we have reduced the headroom because
> sufficient headroom is required for XDP processing for now (can we
> remove this requirement actually?).

Okay, now I understand. Your changes allow multiple levels of
XDP_REDIRECT between/into other veth net_devices. This is very
interesting and exciting stuff, but also a bit scary when thinking
about whether we got the lifetime correct for the different memory objects.

You have convinced me. We should not sacrifice/reduce the headroom
this way. I'll also fix up cpumap.
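
To make the headroom argument concrete, here is a minimal userspace sketch of the accounting. It is not kernel code: struct mock_xdp_frame, mock_convert(), mock_hard_start() and headroom_after_hops() are hypothetical stand-ins, and the buffer sizes are arbitrary. It only mirrors the "headroom - sizeof(*xdp_frame)" arithmetic quoted below, and assumes the frame struct sits at the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the kernel's struct xdp_frame. */
struct mock_xdp_frame {
	void *data;
	unsigned short len;
	unsigned short headroom;
};

/* Mirrors the accounting described for convert_to_xdp_frame(): the frame
 * struct is stored at hard_start, and the usable headroom shrinks by its
 * size. Returns NULL when the headroom cannot hold the struct. */
static struct mock_xdp_frame *mock_convert(void *hard_start, void *data,
					   unsigned short len)
{
	struct mock_xdp_frame *f = hard_start;
	size_t headroom;

	assert((char *)data >= (char *)hard_start);
	headroom = (size_t)((char *)data - (char *)hard_start);
	if (headroom < sizeof(*f))
		return NULL;
	f->data = data;
	f->len = len;
	f->headroom = (unsigned short)(headroom - sizeof(*f));
	return f;
}

/* Recover the buffer start from data and headroom, undoing the
 * sizeof() adjustment made in mock_convert(). */
static void *mock_hard_start(const struct mock_xdp_frame *f)
{
	return (char *)f->data - f->headroom - sizeof(*f);
}

/* Simulate `hops` successive conversions of one buffer. With reuse != 0
 * the previous frame struct region is reclaimed before converting again;
 * otherwise each hop only sees the headroom left after the old struct. */
static int headroom_after_hops(int hops, int reuse)
{
	static _Alignas(16) unsigned char buf[512];
	void *hard_start = buf;
	void *data = buf + 256;
	struct mock_xdp_frame *f = NULL;

	for (int i = 0; i < hops; i++) {
		f = mock_convert(hard_start, data, 64);
		if (!f)
			return -1;
		hard_start = reuse ? mock_hard_start(f) : (void *)(f + 1);
	}
	return f ? f->headroom : -1;
}
```

With reuse the remaining headroom stays at 256 - sizeof(struct mock_xdp_frame) regardless of the number of hops; without reuse it drops by one struct size per hop.
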

To avoid the performance penalty of the memset, I propose that we just
clear the xdp_frame->data pointer. But let's implement it via a common
sanitize/scrub function.
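
A rough userspace sketch of what such a helper could look like, next to the memset it would replace. The struct is a reduced mock and the names mock_scrub_frame() and mock_scrub_frame_full() are hypothetical, chosen only for illustration; nothing here is taken from an actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the kernel's struct xdp_frame. */
struct mock_xdp_frame {
	void *data;
	unsigned short len;
	unsigned short headroom;
};

/* Cheap variant as proposed: clear only the field that would expose
 * a kernel pointer, and leave the rest of the struct untouched. */
static inline void mock_scrub_frame(struct mock_xdp_frame *f)
{
	f->data = NULL;
}

/* Costlier variant: wipe the whole struct area with memset. */
static inline void mock_scrub_frame_full(struct mock_xdp_frame *f)
{
	memset(f, 0, sizeof(*f));
}
```

Keeping the clearing behind one helper means callers such as veth and cpumap would not need to change if the set of fields to clear grows later.
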
> > When cpumap generates an SKB for the netstack, we sacrifice/reduce
> > the available SKB headroom, because convert_to_xdp_frame() reduces
> > the headroom by the xdp_frame size.
> >
> > xdp_frame->headroom = headroom - sizeof(*xdp_frame)
> >
> > To avoid doing such a memset of this area: we are actually only
> > worried about exposing the 'data' pointer, thus we could just clear
> > that. (See commit 6dfb970d3dbd; this is because Alexei is planning to
> > move from CAP_SYS_ADMIN to the less privileged CAP_NET_ADMIN.)
> >
> > See commits:
> > 97e19cce05e5 ("bpf: reserve xdp_frame size in xdp headroom")
> > 6dfb970d3dbd ("xdp: avoid leaking info stored in frame data on page reuse")
>
> We have talked about that...
> https://patchwork.ozlabs.org/patch/903536/
>
> The memset is introduced as per your feedback, but I'm still not sure if
> we need this. In general the headroom is not cleared after allocation in
> drivers, so anyway unprivileged users should not see it no matter if it
> contains xdp_frame or not...

I actually got this request from Alexei. That is why I implemented it.
Personally I don't think this clearing is really needed until someone
actually makes the TC/cls_act BPF hook CAP_NET_ADMIN.

--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer