Message-ID: <1289987467.2687.15.camel@edumazet-laptop>
Date: Wed, 17 Nov 2010 10:51:07 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Joakim Tjernlund <joakim.tjernlund@...nsmode.se>
Cc: Thomas Graf <tgraf@...radead.org>, netdev@...r.kernel.org
Subject: Re: ping -I eth1 ....
On Wednesday 17 November 2010 at 10:29 +0100, Joakim Tjernlund wrote:
> Joakim Tjernlund/Transmode wrote on 2010/11/09 20:33:37:
> >
> > Joakim Tjernlund/Transmode wrote on 2010/11/06 10:42:46:
> > > Thomas Graf <tgr@...radead.org> wrote on 2010/11/05 21:31:50:
> > > >
> > > > On Fri, Nov 05, 2010 at 04:54:18PM +0100, Joakim Tjernlund wrote:
> > > > > Eric Dumazet <eric.dumazet@...il.com> wrote on 2010/11/05 16:06:54:
> > > > > >
> > > > > > > Hopefully most of that is legacy or just plain wrong? Unless
> > > > > > > someone can say why only IFF_UP is tested, one should consider changing them.
> > > > > > >
> > > > > >
> > > > > > Most of the places are hot path.
> > > > > >
> > > > > > You don't want to replace one test with four tests.
> > > > > >
> > > > > > _This_ would be wrong :)
> > > > >
> > > > > Wrong is wrong, even if it is in the hot path :)
> > > > > Perhaps it is time to define an internal IFF_OPERATIONAL flag
> > > > > which is the sum of IFF_UP, IFF_RUNNING etc.? That
> > > > > way you still get one test in the hot path and can abstract
> > > > > what defines an operational link.
> > > >
> > > > You definitely don't want to have your send() call fail simply because
> > > > the carrier was off for a few msec or the routing daemon has put a link
> > > > down temporarily. Also, the outgoing interface looked up at routing
> > > > decision is not necessarily the interface used for sending in the end.
> > > > The packet may get mangled and rerouted by netfilter or tc on the way.
> > >
> > > But do you handle the case when the link is non operational for a long time?
> > >
> > > >
> > > > Personally I'm even ok with the current behaviour of sendto() while the
> > > > socket is bound to an interface but if we choose to return an error
> > > > if the interface is down we might as well do so based on the operational
> > > > status.
>
> > > Perhaps there is a better way. This all started when pppd hung because
> > > of ping -I <ppp interface>, then someone pulled the cable on the link.
> > >
> > > This is a strace where we have two ping -I,
> > > ping -I p1-2-1-2-2 .. and ping -I p1-2-3-2-4 ..
> > > Notice how pppd hangs for a long time in PPPIOCDETACH
> > > As far as I can tell this is because ping -I has claimed the ppp interfaces
> > > and doesn't notice that the link is down. Ideally ping should receive
> > > an ENODEV as soon as pppd calls PPPIOCDETACH.
> > >
> > > 0.000908 write(0, "Connection terminated.\n", 23) = 23
> > > 0.000481 gettimeofday({1288952770, 566048}, NULL) = 0
> > > 0.001553 ioctl(7, PPPIOCDETACH
> > > Message from syslogd@...zil at Fri Nov 5 11:26:20 2010 ...
> > > Brazil kernel: unregister_netdevice: waiting for p1-2-1-2-2 to become free. Usage count = 3
> > > Message from syslogd@...zil at Fri Nov 5 11:26:20 2010 ...
> > > Brazil kernel: unregister_netdevice: waiting for p1-2-3-2-4 to become free. Usage count = 3
> > > Message from syslogd@...zil at Fri Nov 5 11:26:51 2010 ...
> > > Brazil last message repeated 3 times
> > > , 0xbfbc3398) = 0
> > > 66.559216 connect(9, {sa_family=AF_PPPOX, sa_data="\0\0\0\0\0\0\0\252\273\314\335\356hd"}, 30) = 0
> > > 0.000693 close(10) = 0
> > > 0.000449 close(7) = 0
> > > 0.009801 close(9) = 0
> >
> > Any comment on this last strace? Is it expected that ping -I should
> > hold pppd hostage?
> >
>
> Ping?
>

I thought I posted a patch; is there something else?

Could you please test with the latest net-next-2.6 and the following patch?

Thanks

[PATCH net-next-2.6] ipv4: dont create a route if device is down

ip_route_output_slow() should not create a route if device is down, so
that we report -ENETUNREACH error to users.

Reported-by: Nicolas Dichtel <nicolas.dichtel@...nd.com>
Reported-by: Joakim Tjernlund <joakim.tjernlund@...nsmode.se>
Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
---
 net/ipv4/route.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 66610ea..3cc4191 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2559,8 +2559,11 @@ static int ip_route_output_slow(struct net *net, struct rtable **rp,
 			goto out;
 
 		/* RACE: Check return value of inet_select_addr instead. */
-		if (rcu_dereference(dev_out->ip_ptr) == NULL)
-			goto out;	/* Wrong error code */
+		if (!(dev_out->flags & IFF_UP) ||
+		    rcu_dereference(dev_out->ip_ptr) == NULL) {
+			err = -ENETUNREACH;
+			goto out;
+		}
 
 		if (ipv4_is_local_multicast(oldflp->fl4_dst) ||
 		    ipv4_is_lbcast(oldflp->fl4_dst)) {