[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.00.1009271503420.14117@router.home>
Date: Mon, 27 Sep 2010 15:20:54 -0500 (CDT)
From: Christoph Lameter <cl@...ux.com>
To: David Stevens <dlstevens@...ibm.com>
cc: David Miller <davem@...emloft.net>, linux-rdma@...r.kernel.org,
netdev@...r.kernel.org, netdev-owner@...r.kernel.org,
rda@...con.com
Subject: Re: igmp: Allow mininum interval specification for igmp timers.
On Mon, 27 Sep 2010, David Stevens wrote:
> > With bad luck this thing times out way too fast because the total of
> > all of the randomized intervals can end up being very small, and I
> > think we should fix that independently of the other issues hit by the
> > IB folks.
>
> I think I'm caught up on the discussion now. For IGMPv3, we
> would send all the reports always in < 2 secs, and the average would
> be < 1 sec, so I'm not sure any sort of tweaks we do to enforce a
> minimum randomized interval are compatible with IGMPv3 and still
> solve IB's problem.
Ok thanks for the effort but so far I do not see you having caught up. I'd
rather avoid responding to the misleading statements you made in other
replies and just respond to where you missed the boat here.
> As I said before, I think per protocol, back-to-back is both
> allowed and not a problem, even if both subsequent randomized reports
> come out to 0 time. But if we wanted to enforce a minimum interval
> of, say, X, then I think the better way to do that is to set the
> timer to X + rand(Interval-X) and not a table of fixed intervals
The second patch sets the intervals to X .. X + Rand (interval) and not to
a table of fixed intervals as you state here. I have pointed this out
before.
> as in the original patch. For v2, X=1 or 2 sec and Interval=10
> might work well, but for v3, the entire interval is 1 sec and I
> think I saw that the set-up time for the fabric may be on the
> order of 1 sec.
Again there is no knowledge about V2 or V3 without a query and this is
during the period when no querier is known yet. You stated elsewhere that
I can assume V3 by default? So 1 sec?
> I also don't think that we want those kinds of delays on
> Ethernet. A program may join and send lots of traffic in 1 sec,
> and if the immediate join is lost, one of the quickly-following
> <1 sec duplicate reports will make it recover and work. Delaying
> the minimum would guarantee it wouldn't work until that minimum
> and drop all that traffic if the immediate report is lost, then.
There can be any number of reasons that a short outage could prevent the
packets from going through. A buffer overrun (that you mentioned
elsewhere) usually causes lots of packets to be lost. Buffer overrun
scenarios usually mean that all igmp queries are lost.
> Really, of course, I think the solution belongs in IB, but
> if we did anything in IGMP, I'd prefer it were a per-interface
> tunable that defaults as in the RFC. Since you can change the
> interval and # of reports through a querier now, exporting the
> default values of (10,2) for v2 and (1,2) for v3 to instead be
> per-interface tunables and then bumped as needed for IB would
> allow tweaking without running a querier. But a querier that's
> using default values would also override that and cause the
> problem all over again. Queuing in the driver until the MAC
> address is usable solves it generally.
There is no solution on the IB layer since there is no notification when
the fabric reconfiguration necessary for an multicast group is complete.
The querier is of not use since (for the gazillionth of times) this is an
unsolicited IGMP report. If there is a querier then the unsolicited igmp
reports would not be used but the timeout indicated by the querier would
be used.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists