Message-ID: <1fdf938e-600c-ce58-ece1-06b816ddba1c@stressinduktion.org>
Date:   Thu, 24 Nov 2016 13:28:23 +0100
From:   Hannes Frederic Sowa <hannes@...essinduktion.org>
To:     YueHaibing <yuehaibing@...wei.com>, Julian Anastasov <ja@....bg>,
        Eric Dumazet <eric.dumazet@...il.com>
Cc:     davem@...emloft.net, netdev@...r.kernel.org
Subject: Re: net/arp: ARP cache aging failed.
On 24.11.2016 10:06, YueHaibing wrote:
> On 2016/11/24 15:51, Julian Anastasov wrote:
>>
>> 	Hello,
>>
>> On Wed, 23 Nov 2016, Eric Dumazet wrote:
>>
>>> On Wed, 2016-11-23 at 15:37 +0100, Hannes Frederic Sowa wrote:
>>>
>>>> Regardless of whether bonding should keep the MAC address
>>>> alive, a MAC address can certainly change below a TCP connection.
>>>
>>> Of course ;)
>>>
> 
> I configured bonding with fail_over_mac=1, so the bonding MAC is always the MAC
> address of the currently active slave.
> 
>>>>
>>>> dst_entry is 1:n to neigh_entry and as such we can end up confirming an
>>>> aging neighbor while sending a reply with dst->pending_confirm set while
>>>> the confirming packet actually came from a different neighbor.
>>>>
>>>> I agree with Julian, pending_confirm became useless in this way.
>>>
>>> Let's kill it then ;)
>>
>> 	It works for traffic via a gateway. I now see that
>> we can even avoid the write in dst_confirm:
>>
>> 	if (!dst->pending_confirm)
>> 		dst->pending_confirm = 1;
>>
>> 	because it is called by non-dup TCP ACKs.
>>
>> 	But for traffic to hosts on the LAN we need a different solution,
>> i.e. for cached dsts with rt_gateway = 0 (last entry below).
>>
>> rt_uses_gateway rt_gateway DST_NOCACHE Description
>> ====================================================================
>> 1               nh_gw      ANY         Traffic via gateway
>> 0               LAN_host   1           FLOWI_FLAG_KNOWN_NH (nexthop
>>                                        set by IPVS, hdrincl, xt_TEE)
>> 0               0          0           1 dst for many subnet hosts
>>
>> Regards
>>
>> --
>> Julian Anastasov <ja@....bg>
>>
>> .
>>
> 
> As above, is there a plan to fix the problem? Should we just not call dst_confirm
> in the rt->rt_uses_gateway/DST_NOCACHE case?

I think some people are already thinking about it (me too ;) ).

But it is not easy to come up with a solution. First of all, we need to
look up the L2 address again in the neighbor cache and confirm the
appropriate neighbor. Secondly, we should only do that for packets which
we can actually confirm (that means packets that pass the TCP receive
checks or some other kind of confirmation, so that simply spamming the
box does not count). Also, it needs to be fast.
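
To make the difference concrete, here is a rough user-space sketch, not
kernel code. All toy_* names are hypothetical, and keying the lookup on
the peer's IPv4 address is an assumption made only for illustration. It
models one dst shared by several LAN neighbors: the flag-on-dst scheme
confirms whichever neighbor we transmit to next, while a per-peer lookup
confirms only the neighbor whose packet passed the checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NEIGH_MAX 8

/* Hypothetical stand-in for a neighbor cache entry. */
struct toy_neigh {
	uint32_t ip;             /* peer IPv4 address, used as lookup key */
	unsigned long confirmed; /* time of last reachability confirmation */
	bool in_use;
};

struct toy_neigh_table {
	struct toy_neigh entries[TOY_NEIGH_MAX];
};

/* One cached dst for a whole subnet (rt_gateway == 0): 1:n to neighbors. */
struct toy_dst {
	int pending_confirm;
};

static struct toy_neigh *toy_neigh_lookup(struct toy_neigh_table *tbl,
					  uint32_t ip)
{
	for (size_t i = 0; i < TOY_NEIGH_MAX; i++)
		if (tbl->entries[i].in_use && tbl->entries[i].ip == ip)
			return &tbl->entries[i];
	return NULL;
}

static struct toy_neigh *toy_neigh_add(struct toy_neigh_table *tbl,
				       uint32_t ip, unsigned long now)
{
	assert(tbl != NULL);
	for (size_t i = 0; i < TOY_NEIGH_MAX; i++) {
		struct toy_neigh *n = &tbl->entries[i];

		if (!n->in_use) {
			n->ip = ip;
			n->confirmed = now;
			n->in_use = true;
			return n;
		}
	}
	return NULL; /* table full */
}

/* Flag-on-dst scheme: validated input only marks the shared dst. */
static void toy_dst_confirm(struct toy_dst *dst)
{
	if (!dst->pending_confirm)
		dst->pending_confirm = 1;
}

/* The next transmit through the dst confirms whatever neighbor it uses. */
static void toy_output_old(struct toy_dst *dst, struct toy_neigh *n,
			   unsigned long now)
{
	if (dst->pending_confirm) {
		dst->pending_confirm = 0;
		n->confirmed = now;
	}
}

/* Per-peer scheme: look the sender up again and confirm only that entry,
 * and only if the packet passed the receive checks. */
static bool toy_confirm_peer(struct toy_neigh_table *tbl, uint32_t peer_ip,
			     bool passed_checks, unsigned long now)
{
	struct toy_neigh *n;

	if (!passed_checks)
		return false;
	n = toy_neigh_lookup(tbl, peer_ip);
	if (!n)
		return false;
	n->confirmed = now;
	return true;
}
```

With two neighbors A and B behind the same toy_dst, a valid packet from A
followed by a transmit to B refreshes B in the flag-on-dst scheme, whereas
toy_confirm_peer() refreshes A only. The linear scan is just for brevity;
it says nothing about how to meet the "needs to be fast" requirement.
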

Bye,
Hannes