Message-ID: <20110509205626.19dede92@nehalam>
Date: Mon, 9 May 2011 20:56:26 -0700
From: Stephen Hemminger <shemminger@...tta.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Greg Rose <gregory.v.rose@...el.com>, netdev@...r.kernel.org,
bhutchings@...arflare.com, davem@...emloft.net
Subject: Re: [RFC V2 PATCH] rtnetlink: Add method to calculate dump info data size

On Tue, 10 May 2011 05:45:27 +0200
Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Monday, 09 May 2011 at 20:17 -0700, Stephen Hemminger wrote:
> > On Tue, 10 May 2011 04:43:33 +0200
> > Eric Dumazet <eric.dumazet@...il.com> wrote:
> >
> > > On Monday, 09 May 2011 at 15:26 -0700, Greg Rose wrote:
> > > > The message size allocated for rtnl info dumps was limited to a single
> > > > page. This is not enough for additional interface info available with
> > > > devices that support SR-IOV. Calculate the amount of data required so
> > > > the dump can allocate enough data to satisfy the request.
> > > >
> > > > V2 of this patch adds a new argument to the rtnl_register service that
> > > > allows for a new method to calculate the amount of data required to
> > > > complete the info dump request. So far the method is only implemented
> > > > for the RTM_GETLINK slot.
> > > >
> > > > Signed-off-by: Greg Rose <gregory.v.rose@...el.com>
> > >
> > > >
> > > > +static u16 rtnl_calcit(struct sk_buff *skb)
> > > > +{
> > > > +	struct net *net = sock_net(skb->sk);
> > > > +	int h;
> > > > +	int idx = 0, s_idx;
> > > > +	struct net_device *dev;
> > > > +	struct hlist_head *head;
> > > > +	struct hlist_node *node;
> > > > +	u16 alloc_size = 0;
> > > > +
> > > > +	for (h = 0; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
> > > > +		idx = 0;
> > > > +		head = &net->dev_index_head[h];
> > > > +		hlist_for_each_entry(dev, node, head, index_hlist) {
> > > > +			if (idx < s_idx) {
> > > > +				idx++;
> > > > +				continue;
> > > > +			}
> > > > +			alloc_size = (u16)if_nlmsg_size(dev);
> > > > +			break;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	return alloc_size;
> > > > +}
> > > > +
> > >
> > >
> > > Sorry, this won't scale. Some machines have thousands of devices.
> > >
> > > Just make an upper approximation; you don't need an exact one ;)
> >
> > The route dump does scale, can't you use a similar logic?
> > The result doesn't come back as one huge allocation.
> > I regularly test 600K routes on small machines.
> >
>
> Not sure I understand you Stephen.
>
> In Greg's patch, rtnl_calcit() would be called for every 4K/8K block "ip"
> gets from the kernel.
>
> If you add a function to the route dump that scans the 600K routes to
> get the max route size, surely you would notice O(N^2) complexity instead
> of O(N).
>
> We only need to maintain a global variable to hold min_dump_alloc.

I was hoping that the new interface dump would not need a pre-calculated
size and could just incrementally add values. I was trying to use an
analogy with route dumping. The current route dump does not precalculate
size.

What happens is that the dump iterates over the table and puts entries into
an skb. When space in the skb is exhausted, the iterator stops and records
the key of where to restart. Then it restarts from there with the next skb.
This scales O(N) with the number of routes and does not have to precompute
size.
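
A minimal sketch of that restartable iteration, in plain userspace C rather
than kernel code. Everything here is a hypothetical stand-in: struct entry
plays the role of a route, struct buf plays the role of one skb with a
deliberately tiny capacity (BUF_CAP), and struct dump_state holds the restart
position. It only illustrates that each entry is visited once across all
buffers, with no size computed up front.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a table entry (e.g. a route). */
struct entry {
	int key;
};

/* Restart position carried from one buffer to the next. */
struct dump_state {
	size_t next;	/* index of the first entry not yet dumped */
};

#define BUF_CAP 4	/* entries per buffer; tiny on purpose */

/* Hypothetical stand-in for one skb of limited size. */
struct buf {
	int keys[BUF_CAP];
	size_t len;
};

/*
 * Fill one buffer starting at st->next. Stops when the buffer is full
 * or the table ends, and leaves st->next at the restart point.
 * Returns the number of entries copied; 0 means the dump is complete.
 */
static size_t dump_fill(const struct entry *tbl, size_t n,
			struct dump_state *st, struct buf *b)
{
	assert(st->next <= n);

	b->len = 0;
	while (st->next < n && b->len < BUF_CAP) {
		b->keys[b->len++] = tbl[st->next].key;
		st->next++;
	}
	return b->len;
}

/*
 * Drive the dump to completion, one buffer at a time.
 * Returns the total number of entries dumped and stores the number
 * of buffers used in *nbufs.
 */
static size_t dump_all(const struct entry *tbl, size_t n, size_t *nbufs)
{
	struct dump_state st = { 0 };
	struct buf b;
	size_t total = 0, got;

	*nbufs = 0;
	while ((got = dump_fill(tbl, n, &st, &b)) > 0) {
		total += got;
		(*nbufs)++;
	}
	return total;
}
```

Because the restart position only ever moves forward, the total work is
proportional to the number of entries, however many buffers it takes.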