[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMArcTXvccYBPZTEuW-z=uTK7W67utd9-xjPzfxEOvUWhPS7bg@mail.gmail.com>
Date: Wed, 21 Aug 2024 18:14:50 +0900
From: Taehee Yoo <ap420073@...il.com>
To: Mina Almasry <almasrymina@...gle.com>
Cc: Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, linux-alpha@...r.kernel.org,
linux-mips@...r.kernel.org, linux-parisc@...r.kernel.org,
sparclinux@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
linux-arch@...r.kernel.org, linux-kselftest@...r.kernel.org,
bpf@...r.kernel.org, linux-media@...r.kernel.org,
dri-devel@...ts.freedesktop.org, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Donald Hunter <donald.hunter@...il.com>, Jonathan Corbet <corbet@....net>,
Richard Henderson <richard.henderson@...aro.org>, Ivan Kokshaysky <ink@...assic.park.msu.ru>,
Matt Turner <mattst88@...il.com>, Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
"James E.J. Bottomley" <James.Bottomley@...senpartnership.com>, Helge Deller <deller@....de>,
Andreas Larsson <andreas@...sler.com>, Jesper Dangaard Brouer <hawk@...nel.org>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>, Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Arnd Bergmann <arnd@...db.de>, Steffen Klassert <steffen.klassert@...unet.com>,
Herbert Xu <herbert@...dor.apana.org.au>, David Ahern <dsahern@...nel.org>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>, Shuah Khan <shuah@...nel.org>,
Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
John Fastabend <john.fastabend@...il.com>, Sumit Semwal <sumit.semwal@...aro.org>,
Christian König <christian.koenig@....com>,
Bagas Sanjaya <bagasdotme@...il.com>, Christoph Hellwig <hch@...radead.org>,
Nikolay Aleksandrov <razor@...ckwall.org>, Pavel Begunkov <asml.silence@...il.com>, David Wei <dw@...idwei.uk>,
Jason Gunthorpe <jgg@...pe.ca>, Yunsheng Lin <linyunsheng@...wei.com>,
Shailend Chand <shailend@...gle.com>, Harshitha Ramamurthy <hramamurthy@...gle.com>,
Shakeel Butt <shakeel.butt@...ux.dev>, Jeroen de Borst <jeroendb@...gle.com>,
Praveen Kaligineedi <pkaligineedi@...gle.com>, Willem de Bruijn <willemb@...gle.com>,
Kaiyuan Zhang <kaiyuanz@...gle.com>, Daniel Vetter <daniel.vetter@...ll.ch>
Subject: Re: [PATCH net-next v19 03/13] netdev: support binding dma-buf to netdevice
On Tue, Aug 20, 2024 at 1:01 PM Mina Almasry <almasrymina@...gle.com> wrote:
>
> On Mon, Aug 19, 2024 at 6:53 PM Jakub Kicinski <kuba@...nel.org> wrote:
> >
> > On Mon, 19 Aug 2024 00:44:27 +0900 Taehee Yoo wrote:
> > > > @@ -9537,6 +9540,10 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack
> > > > NL_SET_ERR_MSG(extack, "Native and generic XDP can't be active at the same time");
> > > > return -EEXIST;
> > > > }
> > > > + if (dev_get_max_mp_channel(dev) != -1) {
> > > > + NL_SET_ERR_MSG(extack, "XDP can't be installed on a netdev using memory providers");
> > > > + return -EINVAL;
> > > > + }
> > >
> > > Should we consider virtual interfaces like bonding, bridge, etc?
> > > Virtual interfaces as an upper interface of physical interfaces can
> > > still install XDP prog.
> > >
> > > # ip link add bond0 type bond
> > > # ip link set eth0 master bond0
> > > # ip link set bond0 xdp pin /sys/fs/bpf/x/y
> > > and
> > > # ip link set bond0 xdpgeneric pin /sys/fs/bpf/x/y
> > >
> > > All virtual interfaces can install generic XDP prog.
> > > The bonding interface can install native XDP prog.
> >
> > Good point. We may need some common helpers to place the checks for XDP.
> > They are spread all over the place now.
>
> Took a bit of a look here. Forgive me, I'm not that familiar with XDP
> and virtual interfaces, so I'm a bit unsure what to do here.
>
> For veth, it seems, the device behind the veth is stored in
> veth_priv->peer, so it seems maybe a dev_get_max_mp_channel() check on
> veth_priv->peer is the way to go to disable this for veth? I think we
> need to do this check on creation of the veth and on the ndo_bpf of
> veth.
>
> For bonding, it seems we need to add mp channel check in bond_xdp_set,
> and bond_enslave?
>
> There are a few other drivers that define ndo_add_slave, seems a check
> in br_add_slave is needed as well.
>
> This seems like a potentially deep rabbit hole with a few checks to
> add all of the place. Is this blocking the series? AFAICT if XDP fails
> with mp-bound queues with a benign error, that seems fine to me; I
> don't have a use case for memory providers + xdp yet. This should only
> be blocking if someone can repro a very serious error (kernel crash)
> or something with this combination.
>
> I can try to add these checks locally and propose as a follow up
> series. Let me know if I'm on the right track with figuring out how to
> implement this, and, if you feel like it's blocking.
>
> --
> Thanks,
> Mina
I agree with the current approach, which uses the
dev_get_min_mp_channel_count() in the dev_xdp_attach().
The only problem that I am concerned about is the
dev_get_min_mp_channel_count() can't check lower interfaces.
So, how about just making the current code to be able to check lower
interfaces?
Here is a rough modification and I tested it. it works well.
Please look into this code.
diff --git a/net/core/dev.c b/net/core/dev.c
index f6f40c682b83..87c7985cb242 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6989,6 +6989,27 @@ static __latent_entropy void
net_rx_action(struct softirq_action *h)
bpf_net_ctx_clear(bpf_net_ctx);
}
+static int __dev_get_min_mp_channel_count(struct net_device *dev,
+ struct netdev_nested_priv *priv)
+{
+ int i, max = 0;
+
+ ASSERT_RTNL();
+
+ for (i = 0; i < dev->real_num_rx_queues; i++)
+ if (dev->_rx[i].mp_params.mp_priv)
+ /* The channel count is the idx plus 1. */
+ max = i + 1;
+
+ return max;
+}
+
+u32 dev_get_min_mp_channel_count(const struct net_device *dev)
+{
+ return (u32)__dev_get_min_mp_channel_count((struct net_device *)dev,
+ NULL);
+}
+
struct netdev_adjacent {
struct net_device *dev;
netdevice_tracker dev_tracker;
@@ -9538,7 +9559,10 @@ static int dev_xdp_attach(struct net_device
*dev, struct netlink_ext_ack *extack
NL_SET_ERR_MSG(extack, "Native and generic XDP
can't be active at the same time");
return -EEXIST;
}
- if (dev_get_min_mp_channel_count(dev)) {
+
+ if (netdev_walk_all_lower_dev(dev,
+ __dev_get_min_mp_channel_count,
+ NULL)) {
NL_SET_ERR_MSG(extack, "XDP can't be installed
on a netdev using memory providers");
return -EINVAL;
}
@@ -9826,20 +9850,6 @@ int dev_change_xdp_fd(struct net_device *dev,
struct netlink_ext_ack *extack,
return err;
}
-u32 dev_get_min_mp_channel_count(const struct net_device *dev)
-{
- u32 i, max = 0;
-
- ASSERT_RTNL();
-
- for (i = 0; i < dev->real_num_rx_queues; i++)
- if (dev->_rx[i].mp_params.mp_priv)
- /* The channel count is the idx plus 1. */
- max = i + 1;
-
- return max;
-}
-
/**
* dev_index_reserve() - allocate an ifindex in a namespace
* @net: the applicable net namespace
How to test:
ip link add bond2 type bond
ip link add bond1 master bond2 type bond
ip link add bond0 master bond1 type bond
ip link set eth0 master bond0
ip link set eth0 up
ip link set bond0 up
ip link set bond1 up
ip link set bond2 up
ip link set bond2 xdp pin /sys/fs/bpf/x/y
./ncdevmem -s 192.168.1.4 -c 192.168.1.2 -f eth0 -l -p 5201 -v 7 -t 0 -q 1
# bond2 <-- xdp should not be installed.
# |
# bond1 <-- xdp should not be installed.
# |
# bond0 <-- xdp should not be installed.
# |
# eth0 <--memory provider is used.
The netdev_walk_all_lower_dev() calls the callback function
(__dev_get_min_mp_channel_count) while it walks its own all lower
interfaces recursively.
If we want to check more conditions, we can just add checks in
__dev_get_min_mp_channel_count() or change the callback function.
Note that currently dev_xdp_attach() checks upper interfaces with
netdev_for_each_upper_dev_rcu() but it doesn't work recursively.
I think It should be fixed to check upper interfaces recursively in a
separate patch.
Thanks a lot!
Taehee Yoo
Powered by blists - more mailing lists