Date: Thu, 6 Jun 2019 23:31:55 +0000
From: Martin Lau <kafai@...com>
To: Stefano Brivio <sbrivio@...hat.com>
CC: David Miller <davem@...emloft.net>, Jianlin Shi <jishi@...hat.com>,
"Wei Wang" <weiwan@...gle.com>, David Ahern <dsahern@...il.com>,
Eric Dumazet <edumazet@...gle.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH net 1/2] ipv6: Dump route exceptions too in
rt6_dump_route()
On Fri, Jun 07, 2019 at 12:58:52AM +0200, Stefano Brivio wrote:
> On Thu, 6 Jun 2019 22:37:11 +0000
> Martin Lau <kafai@...com> wrote:
>
> > On Fri, Jun 07, 2019 at 12:17:47AM +0200, Stefano Brivio wrote:
> > > On Thu, 6 Jun 2019 21:44:58 +0000
> > > Martin Lau <kafai@...com> wrote:
> > >
> > > > > + if (!(filter->flags & RTM_F_CLONED)) {
> > > > > + err = rt6_fill_node(net, arg->skb, rt, NULL, NULL, NULL, 0,
> > > > > + RTM_NEWROUTE,
> > > > > + NETLINK_CB(arg->cb->skb).portid,
> > > > > + arg->cb->nlh->nlmsg_seq, flags);
> > > > > + if (err)
> > > > > + return err;
> > > > > + } else {
> > > > > + flags |= NLM_F_DUMP_FILTERED;
> > > > > + }
> > > > > +
> > > > > + bucket = rcu_dereference(rt->rt6i_exception_bucket);
> > > > > + if (!bucket)
> > > > > + return 0;
> > > > > +
> > > > > + for (i = 0; i < FIB6_EXCEPTION_BUCKET_SIZE; i++) {
> > > > > + hlist_for_each_entry(rt6_ex, &bucket->chain, hlist) {
> > > > > + if (rt6_check_expired(rt6_ex->rt6i))
> > > > > + continue;
> > > > > +
> > > > > + err = rt6_fill_node(net, arg->skb, rt,
> > > > > + &rt6_ex->rt6i->dst,
> > > > > + NULL, NULL, 0, RTM_NEWROUTE,
> > > > > + NETLINK_CB(arg->cb->skb).portid,
> > > > > + arg->cb->nlh->nlmsg_seq, flags);
> > > > Thanks for the patch.
> > > >
> > > > A question on when rt6_fill_node() returns -EMSGSIZE while dumping the
> > > > exception bucket here. Where will the next inet6_dump_fib() start?
> > >
> > > And thanks for reviewing.
> > >
> > > It starts again from the same node, see fib6_dump_node(): w->leaf = rt;
> > > where rt is the fib6_info where we failed dumping, so we won't skip
> > > dumping any node.
> > If the same node will be dumped, does it mean that it will go through this
> > loop and iterate all exceptions again?
>
> Yes (well, all the exceptions for that node).
>
> > > This also means that to avoid sending duplicates in the case where at
> > > least one rt6_fill_node() call goes through and one fails, we would
> > > need to track the last bucket and entry sent, or, alternatively, to
> > > make sure we can fit the whole node before dumping.
> > My another concern is the dump may never finish.
>
> That's not a guarantee in general, even without this, because in theory
> the skb passed might be small enough that we can't even fit a single
> node without exceptions.
It is arguably the caller's responsibility to retry
with a larger buffer if it cannot even get a single route.
If the caller provides a buffer large enough for a single route,
the kernel should guarantee forward progress.
I think the minimum is to remember how many exceptions have already
been dumped for the node, so that they can be skipped on the next pass.
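
To illustrate the idea of remembering how many entries to skip, here is a
minimal userspace sketch. It is not kernel code: struct dump_state,
struct out_buf, fill_entry() and dump_node() are hypothetical names, a node's
main route and its exceptions are modelled as a flat array of ints, and
EMSGSIZE_ERR merely stands in for -EMSGSIZE.

```c
#include <assert.h>
#include <stddef.h>

#define EMSGSIZE_ERR (-90)	/* stand-in for -EMSGSIZE (assumption) */

/* Hypothetical per-dump state kept between passes. */
struct dump_state {
	unsigned int skip;	/* entries of the current node already sent */
};

/* Hypothetical fixed-size output buffer, playing the role of the skb. */
struct out_buf {
	int *slots;
	size_t cap;
	size_t len;
};

/* Append one entry, or fail if the buffer is full. */
static int fill_entry(struct out_buf *buf, int entry)
{
	assert(buf->len <= buf->cap);
	if (buf->len == buf->cap)
		return EMSGSIZE_ERR;
	buf->slots[buf->len++] = entry;
	return 0;
}

/*
 * Dump all entries of one node. Entries already sent in an earlier pass
 * are walked over but not emitted again. Returns 0 once the node is
 * complete, EMSGSIZE_ERR if the buffer filled up first.
 */
static int dump_node(const int *entries, size_t n,
		     struct dump_state *st, struct out_buf *buf)
{
	unsigned int seen = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (seen++ < st->skip)
			continue;	/* sent in an earlier pass */
		if (fill_entry(buf, entries[i]))
			return EMSGSIZE_ERR;	/* st->skip marks the resume point */
		st->skip++;
	}

	st->skip = 0;	/* node done, next node starts from scratch */
	return 0;
}
```

With this shape, every pass that can fit at least one entry advances
st->skip, so a node with many exceptions is dumped over several passes
without duplicates and without restarting from its first entry.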
>
> We could add a guard on w->leaf not being the same before and after the
> walk in inet6_dump_fib() and, if it is, terminate the dump. I just
> wonder if we have to do this at all -- I can't find this being done
> anywhere else (at a quick look at least).
>
> By the way, we can also trigger a never-ending dump by touching the
> tree frequently enough during a dump: it would always start again from
> the root, see fib6_dump_table().
Do you mean the "cb->args[5] != w->root->fn_sernum" case? There seems to be
a w->skip that takes care of it.
Regardless, I don't think we should make it worse.
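
As a rough model of how a skip counter can keep a restarted walk moving
forward, consider the sketch below. It is only an illustration under
assumptions: struct walker and walk_pass() are hypothetical, the tree is
reduced to n_nodes positions, and "budget" stands for how many nodes fit in
one buffer. The real walker is more involved.

```c
#include <stddef.h>

/* Hypothetical walker state kept across dump passes. */
struct walker {
	size_t pos;		/* next position to visit */
	unsigned int count;	/* nodes sent so far in this dump */
	unsigned int skip;	/* nodes to pass over after a restart */
	int seen_sernum;	/* tree serial number seen on the last pass */
};

/*
 * Run one pass. If the tree changed since the last pass, restart from
 * the root but skip as many nodes as were already sent. Returns the
 * number of nodes sent in this pass.
 */
static size_t walk_pass(struct walker *w, int tree_sernum,
			size_t n_nodes, size_t budget)
{
	size_t sent = 0;

	if (w->seen_sernum != tree_sernum) {
		w->seen_sernum = tree_sernum;
		w->pos = 0;		/* back to the root */
		w->skip = w->count;	/* do not resend what went out */
	}

	while (w->pos < n_nodes && sent < budget) {
		if (w->skip) {
			w->skip--;	/* already sent before the restart */
			w->pos++;
			continue;
		}
		w->count++;
		w->pos++;
		sent++;
	}

	return sent;
}
```

Skipping by count is only approximate when the tree itself has changed, but
it does mean that a restart does not throw away the progress made so far.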