Message-ID: <OF02A603A2.70D9639B-ONC12577DE.004C19D8-C12577DE.004D44FE@transmode.se>
Date: Wed, 17 Nov 2010 15:03:59 +0100
From: Joakim Tjernlund <joakim.tjernlund@...nsmode.se>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: netdev@...r.kernel.org, Thomas Graf <tgraf@...radead.org>
Subject: Re: ping -I eth1 ....
Eric Dumazet <eric.dumazet@...il.com> wrote on 2010/11/17 11:23:32:
>
> On Wednesday, 17 November 2010 at 11:09 +0100, Joakim Tjernlund wrote:
> > Eric Dumazet <eric.dumazet@...il.com> wrote on 2010/11/17 10:51:07:
> > >
> > > On Wednesday, 17 November 2010 at 10:29 +0100, Joakim Tjernlund wrote:
> > > > Joakim Tjernlund/Transmode wrote on 2010/11/09 20:33:37:
> > > > >
> > > > > Joakim Tjernlund/Transmode wrote on 2010/11/06 10:42:46:
> > > > > > Thomas Graf <tgr@...radead.org> wrote on 2010/11/05 21:31:50:
> > > > > > >
> > > > > > > On Fri, Nov 05, 2010 at 04:54:18PM +0100, Joakim Tjernlund wrote:
> > > > > > > > Eric Dumazet <eric.dumazet@...il.com> wrote on 2010/11/05 16:06:54:
> > > > > > > > >
> > > > > > > > > > Hopefully most of that is legacy or just plain wrong? Unless
> > > > > > > > > > someone can say why only test IFF_UP one should consider changing them.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Most of the places are hot path.
> > > > > > > > >
> > > > > > > > > You don't want to replace one test by four tests.
> > > > > > > > >
> > > > > > > > > _This_ would be wrong :)
> > > > > > > >
> > > > > > > > Wrong is wrong, even if it is in the hot path :)
> > > > > > > > Perhaps it is time to define an internal IFF_OPERATIONAL flag
> > > > > > > > which is the sum of IFF_UP, IFF_RUNNING etc.? That
> > > > > > > > way you still get one test in the hot path and can abstract
> > > > > > > > what defines an operational link.
> > > > > > >
> > > > > > > You definitely don't want to have your send() call fail simply because
> > > > > > > the carrier was off for a few msec or the routing daemon has put a link
> > > > > > > down temporarily. Also, the outgoing interface looked up at routing
> > > > > > > decision is not necessarily the interface used for sending in the end.
> > > > > > > The packet may get mangled and rerouted by netfilter or tc on the way.
> > > > > >
> > > > > > But do you handle the case when the link is non operational for a long time?
> > > > > >
> > > > > > >
> > > > > > > Personally I'm even ok with the current behaviour of sendto() while the
> > > > > > > socket is bound to an interface but if we choose to return an error
> > > > > > > if the interface is down we might as well do so based on the operational
> > > > > > > status.
> > > >
> > > > > > Perhaps there is a better way. This all started when pppd hung because
> > > > > > > of ping -I <ppp interface>, then someone pulled the cable on the link.
> > > > > >
> > > > > > This is a strace where we have two ping -I,
> > > > > > ping -I p1-2-1-2-2 .. and ping -I p1-2-3-2-4 ..
> > > > > > Notice how pppd hangs for a long time in PPPIOCDETACH
> > > > > > > As far as I can tell this is because ping -I has claimed the ppp interfaces
> > > > > > > and doesn't notice that the link is down. Ideally ping should receive
> > > > > > > an ENODEV as soon as pppd calls PPPIOCDETACH.
> > > > > >
> > > > > > 0.000908 write(0, "Connection terminated.\n", 23) = 23
> > > > > > 0.000481 gettimeofday({1288952770, 566048}, NULL) = 0
> > > > > > 0.001553 ioctl(7, PPPIOCDETACH
> > > > > > Message from syslogd@...zil at Fri Nov 5 11:26:20 2010 ...
> > > > > > Brazil kernel: unregister_netdevice: waiting for p1-2-1-2-2 to become free. Usage count = 3
> > > > > > Message from syslogd@...zil at Fri Nov 5 11:26:20 2010 ...
> > > > > > Brazil kernel: unregister_netdevice: waiting for p1-2-3-2-4 to become free. Usage count = 3
> > > > > > Message from syslogd@...zil at Fri Nov 5 11:26:51 2010 ...
> > > > > > Brazil last message repeated 3 times
> > > > > > , 0xbfbc3398) = 0
> > > > > > 66.559216 connect(9, {sa_family=AF_PPPOX, sa_data="\0\0\0\0\0\0\0\252\273\314\335\356hd"}, 30) = 0
> > > > > > 0.000693 close(10) = 0
> > > > > > 0.000449 close(7) = 0
> > > > > > 0.009801 close(9) = 0
> > > > >
> > > > > Any comment on this last strace? Is it expected that ping -I should
> > > > > hold pppd hostage?
> > > > >
> > > >
> > > > Ping?
> > > >
> > >
> > > I thought I posted a patch, is there something else?
> >
> > Yes, I wondered about the above strace and whether it is expected that ping -I
> > should hold pppd hostage? Should ping not receive an ENODEV as soon as
> > pppd detaches an interface?
> >
> > >
> > > Could you please test with latest net-next-2.6 and the following patch?
> >
> > I tested the first patch you sent and that one worked well. I can try
> > again on 2.6.35 (our boards take a while to move forward)?
>
> Well, in this case, apply commit :
>
> 332dd96f7ac15e937088fe11f15cfe0210e8edd1
>
> (net/dst: dst_dev_event() called after other notifiers)
>
> http://git2.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=332dd96f7ac15e937088fe11f15cfe0210e8edd1

Hmm, with (or without) this and/or your other patch I get:

  vconfig add eth0 1
  ifconfig eth0.1 up
  ping -I eth0.1
  <in another session do>
  vconfig rem eth0.1

Now vconfig rem eth0.1 hangs and I get several of these:

  localhost kernel: unregister_netdevice: waiting for eth0.1 to become free. Usage count = 3

After some time vconfig rem eth0.1 succeeds.

Hmm, in the last test I did it is still stuck: vconfig rem eth0.1 still hangs after 5 minutes.
The ping -I dies right after vconfig rem eth0.1 is issued.

Kernel: 2.6.35