lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Tue, 31 Jan 2023 11:01:57 -0800
From:   Stanislav Fomichev <sdf@...gle.com>
To:     Jesper Dangaard Brouer <jbrouer@...hat.com>
Cc:     Toke Høiland-Jørgensen <toke@...hat.com>,
        brouer@...hat.com, bpf@...r.kernel.org, netdev@...r.kernel.org,
        martin.lau@...nel.org, ast@...nel.org, daniel@...earbox.net,
        andrii@...nel.org, martin.lau@...ux.dev, song@...nel.org,
        yhs@...com, john.fastabend@...il.com, dsahern@...il.com,
        willemb@...gle.com, void@...ifault.com, kuba@...nel.org,
        xdp-hints@...-project.net
Subject: Re: [xdp-hints] [PATCH bpf-next RFC V1] selftests/bpf:
 xdp_hw_metadata clear metadata when -EOPNOTSUPP

On Tue, Jan 31, 2023 at 5:00 AM Jesper Dangaard Brouer
<jbrouer@...hat.com> wrote:
>
>
>
> On 27/01/2023 18.18, Stanislav Fomichev wrote:
> > On Fri, Jan 27, 2023 at 5:58 AM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
> >>
> >> Jesper Dangaard Brouer <brouer@...hat.com> writes:
> >>
> >>> The AF_XDP userspace part of xdp_hw_metadata see non-zero as a signal of
> >>> the availability of rx_timestamp and rx_hash in data_meta area. The
> >>> kernel-side BPF-prog code doesn't initialize these members when kernel
> >>> returns an error e.g. -EOPNOTSUPP.  This memory area is not guaranteed to
> >>> be zeroed, and can contain garbage/previous values, which will be read
> >>> and interpreted by AF_XDP userspace side.
> >>>
> >>> Tested this on different drivers. The experiences are that for most
> >>> packets they will have zeroed this data_meta area, but occasionally it
> >>> will contain garbage data.
> >>>
> >>> Example of failure tested on ixgbe:
> >>>   poll: 1 (0)
> >>>   xsk_ring_cons__peek: 1
> >>>   0x18ec788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000
> >>>   rx_hash: 3697961069
> >>>   rx_timestamp:  9024981991734834796 (sec:9024981991.7348)
> >>>   0x18ec788: complete idx=8 addr=8000
> >>>
> >>> Converting to date:
> >>>   date -d @9024981991
> >>>   2255-12-28T20:26:31 CET
> >>>
> >>> I choose a simple fix in this patch. When kfunc fails or isn't supported
> >>> assign zero to the corresponding struct meta value.
> >>>
> >>> It's up to the individual BPF-programmer to do something smarter e.g.
> >>> that fits their use-case, like getting a software timestamp and marking
> >>> a flag that gives the type of timestamp.
> >>>
> >>> Another possibility is for the behavior of kfunc's
> >>> bpf_xdp_metadata_rx_timestamp and bpf_xdp_metadata_rx_hash to require
> >>> clearing return value pointer.
> >>
> >> I definitely think we should leave it up to the BPF programmer to react
> >> to failures; that's what the return code is there for, after all :)
> >
> > +1
>
> +1 I agree.
> We should keep this default functions as simple as possible, for future
> "unroll" of BPF-bytecode.
>
> I the -EOPNOTSUPP case (default functions for drivers not implementing
> kfunc), will likely be used runtime by BPF-prog to determine if the
> hardware have this offload hint, but it comes with the overhead of a
> function pointer call.
>
> I hope we can somehow BPF-bytecode "unroll" these (default functions) at
> BPF-load time, to remove this overhead, and perhaps even let BPF
> bytecode do const propagation and code elimination?
>
>
> > Maybe we can unconditionally memset(meta, sizeof(*meta), 0) in
> > tools/testing/selftests/bpf/progs/xdp_hw_metadata.c?
> > Since it's not a performance tool, it should be ok functionality-wise.
>
> I know this isn't a performance test, but IMHO always memsetting
> metadata area is a misleading example.  We know from experience that
> developer simply copy-paste code examples, even quick-n-dirty testing
> example code.
>
> The specific issue in this example can lead to hard-to-find bugs, as my
> testing shows it is only occasionally that data_meta area contains
> garbage. We could do a memset, but it deserves a large code comment, why
> this is needed, so people copy-pasting understand. I choose current
> approach to keep code close to code people will copy-paste.

SG, I don't think it matters, but agreed that having this stated
explicitly could help with a blind copy-paste :-)
Then maybe repost with the TODO's removed from the kfucs? We seem to
agree that it's the user's job to manage the final buffer..

> --Jesper
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ