Message-ID: <1283919160.2634.662.camel@edumazet-laptop>
Date: Wed, 08 Sep 2010 06:12:40 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Krzysztof Olędzki <ole@....pl>
Cc: David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: 2.6.34: Problem with UDP traffic on lo + poll(?)
On Tuesday, 07 September 2010 at 23:51 +0200, Krzysztof Olędzki wrote:
> On 2010-09-07 23:39, Eric Dumazet wrote:
> > On Tuesday, 07 September 2010 at 23:28 +0200, Krzysztof Olędzki wrote:
> >
> >> With the above patch I'm no longer able to reproduce the problem. Thanks!
> >>
> >> Tested-by: Krzysztof Piotr Oledzki<ole@....pl>
> >>
> >
> > Thanks a lot !
> >
> >> BTW: why does it take so long to trigger this bug, and why is it only
> >> possible over a loopback interface?
> >
> > It's a bit tricky: you need at least 10 sockets linked in a particular
> > hash chain.
> >
> > To check this, you can run:
> >
> > cat /proc/net/udp
> >
> > Maybe you have many sockets on port 123 or 53?
>
> On one affected host I have 3+7 and on the other, also affected, I have 3+6:
>
> root@...a:~# egrep -cw '(53|123):' /proc/net/udp
> 10
> root@...a:~# egrep -w '(53|123):' /proc/net/udp
> 53: 3582A8C0:0035 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 6084654 2 ffff8800cc012700 0
> 53: 0100007F:0035 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 6084652 2 ffff8800cc010900 0
> 123: D683A8C0:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4911 2 ffff88012de96400 0
> 123: 7B85A8C0:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4910 2 ffff88012de96100 0
> 123: 8982A8C0:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4909 2 ffff88012de95e00 0
> 123: 7B82A8C0:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4908 2 ffff88012de95b00 0
> 123: 3582A8C0:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4907 2 ffff88012de95800 0
> 123: 1F7EA8C0:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4906 2 ffff88012de95500 0
> 123: 0100007F:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4905 2 ffff88012de95200 0
> 123: 00000000:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4899 2 ffff88012de94c00 0
>
> But how is 123 related to 53?
>
I mentioned 123 and 53 only as probable suspects :)
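The per-chain socket counts in the /proc/net/udp output quoted above can be
tallied with a short script. This is only a sketch: it assumes the first
column of each row is the hash slot number, and the function name
sockets_per_slot and the SAMPLE list (three rows copied from the output
above) are made up for illustration.

```python
from collections import Counter


def sockets_per_slot(lines):
    """Count sockets per hash slot from /proc/net/udp-style rows.

    Assumption: the first column ("53:", "123:") is the slot number.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank lines and the "sl local_address ..." header row.
        if len(fields) < 2 or not fields[0].rstrip(":").isdigit():
            continue
        counts[int(fields[0].rstrip(":"))] += 1
    return counts


# Three rows copied from the output quoted in this thread.
SAMPLE = [
    "53: 3582A8C0:0035 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 6084654 2 ffff8800cc012700 0",
    "53: 0100007F:0035 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 6084652 2 ffff8800cc010900 0",
    "123: 0100007F:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4905 2 ffff88012de95200 0",
]

if __name__ == "__main__":
    for slot, count in sorted(sockets_per_slot(SAMPLE).items()):
        print(slot, count)
```

On a live system the same function could be fed the file directly, e.g.
sockets_per_slot(open("/proc/net/udp")), to look for slots holding more
than 10 sockets.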
When a socket is created and connect() is called, autobind() chooses a
source port X for this socket.
If ((X % udp_hash_size) == 123), the socket is inserted in hash chain number
123.
The bug then triggers: when a packet is received for this socket, we
find a slot with more than 10 sockets, so the search is done on the secondary
chain Z2, where we don't find the socket, since its rcv_addr changed after
we inserted it (in chain Y2). The packet is dropped (as seen in netstat -s).
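The sequence above can be sketched as a toy model. It is not the kernel
code: HASH_SIZE, the slot2() mixing function, and the Sock, Table and
delivered names are all assumptions made for illustration. Only the
shape follows the description: a primary chain chosen by port, a secondary
chain chosen by (address, port), a switch to the secondary chain when the
primary one holds more than 10 sockets, and no rehash when connect() changes
rcv_addr.

```python
HASH_SIZE = 128   # assumed table size, for illustration only
THRESHOLD = 10    # more than 10 sockets: search the secondary chain


def slot1(port):
    # Primary chain: selected by port only (X % udp_hash_size).
    return port % HASH_SIZE


def slot2(addr, port):
    # Secondary chain: hypothetical mix of address and port.
    return (addr * 31 + port) % HASH_SIZE


class Sock:
    def __init__(self, port):
        self.port = port
        self.rcv_addr = 0  # wildcard address until connect()


class Table:
    def __init__(self):
        self.primary = {}
        self.secondary = {}

    def bind(self, sock):
        # Insert into both chains using the address known at bind time (Y2).
        self.primary.setdefault(slot1(sock.port), []).append(sock)
        key2 = slot2(sock.rcv_addr, sock.port)
        self.secondary.setdefault(key2, []).append(sock)

    def connect(self, sock, local_addr):
        # Models the bug: rcv_addr changes, the socket is not rehashed.
        sock.rcv_addr = local_addr

    def lookup(self, daddr, dport):
        chain = self.primary.get(slot1(dport), [])
        if len(chain) > THRESHOLD:
            # Crowded slot: search secondary chain Z2 instead.
            chain = self.secondary.get(slot2(daddr, dport), [])
        for s in chain:
            if s.port == dport and s.rcv_addr in (0, daddr):
                return s
        return None  # packet would be dropped


def delivered(n_other):
    """True if a packet reaches a connected socket that shares its
    primary chain with n_other other sockets (n_other <= 20)."""
    table = Table()
    for k in range(n_other):
        table.bind(Sock(123 + HASH_SIZE * k))   # all land in chain 123
    client = Sock(123 + HASH_SIZE * 20)          # also chain 123
    table.bind(client)
    table.connect(client, 0x7F000001)            # 127.0.0.1
    return table.lookup(0x7F000001, client.port) is client


if __name__ == "__main__":
    print(delivered(9), delivered(10))
```

In this model the packet is delivered while the chain holds 10 sockets or
fewer, and is lost as soon as the chain grows past 10, which matches why the
problem shows up only on hosts with many sockets in one slot.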
> > And about loopback, I have no idea... I am pretty sure I can trigger the
> > bug with other interfaces.
>
> OK. Probably it is because my other hosts have only a single IP and only
> the problematic ones have both a DNS server and multiple IPs (many sockets).
>
Yes