lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230307194801.mopwvidrkrybm7h5@kashmir.localdomain>
Date:   Tue, 7 Mar 2023 12:48:01 -0700
From:   Daniel Xu <dxu@...uu.xyz>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     bpf <bpf@...r.kernel.org>,
        "open list:KERNEL SELFTEST FRAMEWORK" 
        <linux-kselftest@...r.kernel.org>,
        Network Development <netdev@...r.kernel.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>, pablo@...filter.org,
        kadlec@...filter.org, fw@...len.de, daniel@...earbox.net
Subject: Re: [PATCH bpf-next v2 0/8] Support defragmenting IPv(4|6) packets
 in BPF

Hi Alexei,

(cc netfilter maintainers)

On Mon, Mar 06, 2023 at 08:17:20PM -0800, Alexei Starovoitov wrote:
> On Tue, Feb 28, 2023 at 3:17 PM Daniel Xu <dxu@...uu.xyz> wrote:
> >
> > > Have you considered to skb redirect to another netdev that does ip defrag?
> > > Like macvlan does it under some conditions. This can be generalized.
> >
> > I had not considered that yet. Are you suggesting adding a new
> > passthrough netdev thing that'll defrags? I looked at the macvlan driver
> > and it looks like it defrags to handle some multicast corner case.
> 
> Something like that. A netdev that bpf prog can redirect too.
> It will consume ip frags and eventually will produce reassembled skb.
> 
> The kernel ip_defrag logic has timeouts, counters, rhashtable
> with thresholds, etc. All of them are per netns.
> Just another ip_defrag_user will still share rhashtable
> with its limits. The kernel can even do icmp_send().
> ip_defrag is not a kfunc. It's a big block with plenty of kernel
> wide side effects.
> I really don't think we can alloc_skb, copy_skb, and ip_defrag it.
> It messes with the stack too much.
> It's also not clear to me when skb is reassembled and how bpf sees it.
> "redirect into reassembling netdev" and attaching bpf prog to consume
> that skb is much cleaner imo.
> May be there are other ways to use ip_defrag, but certainly not like
> synchronous api helper.

I was giving the virtual netdev idea some thought this morning and I
thought I'd give the netfilter approach a deeper look.

>From my reading (I'll run some tests later) it looks like netfilter
will defrag all ipv4/ipv6 packets in any netns with conntrack enabled.
It appears to do so in NF_INET_PRE_ROUTING.

Unfortunately that does run after tc hooks. But fortunately with the
new BPF netfilter hooks I think we can make defrag work outside of BPF
kfuncs like you want. And the NF_IP_FORWARD hook works well for my
router use case.

One thing we would need though are (probably kfunc) wrappers around
nf_defrag_ipv4_enable() and nf_defrag_ipv6_enable() to ensure BPF progs
are not transitively depending on defrag support from other netfilter
modules.

The exact mechanism would probably need some thinking, as the above
functions kinda rely on module_init() and module_exit() semantics. We
cannot make the prog bump the refcnt every time it runs -- it would
overflow.  And it would be nice to automatically free the refcnt when
prog is unloaded. 

Once the netfilter prog type series lands I can get that discussion
started. Unless Daniel feels strongly that we should continue with
the approach in this patchset, I am leaning towards dropping in favor
of netfilter approach.

Thanks,
Daniel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ