lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <dd965e88-b9fd-4d58-4272-07588b842252@stressinduktion.org>
Date:   Fri, 28 Apr 2017 21:43:36 +0200
From:   Hannes Frederic Sowa <hannes@...essinduktion.org>
To:     Alexei Starovoitov <ast@...com>,
        John Fastabend <john.fastabend@...il.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Andy Gospodarek <andy@...yhouse.net>
Cc:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Daniel Borkmann <borkmann@...earbox.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "xdp-newbies@...r.kernel.org" <xdp-newbies@...r.kernel.org>
Subject: Re: xdp_redirect ifindex vs port. Was: best API for returning/setting
 egress port?

On 28.04.2017 07:30, Alexei Starovoitov wrote:
> On 4/27/17 10:06 PM, John Fastabend wrote:
>> That is more or less what I was thinking as well. The other question
>> I have though is should we have a bpf_redirect() call for the simple
>> case where I use the ifindex directly. This will be helpful for taking
>> existing programs from tc_cls into xdp. I think it makes sense to have
>> both bpf_tx_allports(), bpf_tx_port(), and bpf_redirect().
> 
> I think so too.
> Once netdevice is stored into netdev_array map the netdevice is pinned
> and we need to figure out what to do if somebody tries to delete it.
> Should we add a new netlink notifier that this netdev's refcnt is
> almost zero and it's only in netdev_array(s) ?

We basically do that automatically in netdev_wait_allrefs:

pr_emerg("unregister_netdevice: waiting for %s to become free. Usage
count = %d\n",
         dev->name, refcnt);

It is a very unpleasant warning and users probably think about a bug in
the kernel at first.

I don't think we should wait for user space to clean that up but have to
do it automatically from the kernel. Maybe we can introduce a special
value that basically NOPs the transmission. The hash table itself would
install a netdevice notifier and would clean all tables. Could
definitely cause some storm in the kernel, if a lot of keys are mapped
to the same interface.

> or should it be deleted from the array(s) automatically and
> then user space will be notified post-deletion?
> Both approaches have their pros and cons.

I am leaning more towards deleting it automatically. But walking all
tables and in there all keys might cause some unwanted load spikes.

> Whereas raw ifindex approach (via bpf_redirect) doesn't have these
> caveats. It's clear to both bpf prog and user space that ifindex
> can be stale and user space needs to monitor netdevs and update
> programs/maps.

A separate type for ifindex as key or value might be nice to expose this
information directly via the kernel (fdinfo etc.) but at the same time,
debugging infrastructure from user space can also easily deal with that.

Another approach would be:

ifindexes are allocated cyclic and also are signed int and not u32
during allocation. Maybe we can negate the ifindex during transmission
in the table and thus mark it as stale (or set it to -1)? This update
would be done by bpf_tx_*ports() that take a reference to a table and a
key and submit the packets on the appropriate ports and can flag the
relevant ifindexes as stale.

Just wanted to draft this idea, I am not particular happy with that
idea. Maybe someone comes up with a better one.

Thanks,
Hannes

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ