Message-ID: <d95e4afc-3a95-a3a7-edd9-1bd13cdec90@ssi.bg>
Date: Tue, 31 Jan 2023 18:13:50 +0200 (EET)
From: Julian Anastasov <ja@....bg>
To: Zhang Changzhong <zhangchangzhong@...wei.com>
cc: Network Development <netdev@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
David Ahern <dsahern@...nel.org>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
"Denis V. Lunev" <den@...nvz.org>,
Nikolay Aleksandrov <razor@...ckwall.org>,
Daniel Borkmann <daniel@...earbox.net>,
YueHaibing <yuehaibing@...wei.com>
Subject: Re: [Question] neighbor entry doesn't switch to the STALE state
after the reachable timer expires
Hello,
On Tue, 31 Jan 2023, Zhang Changzhong wrote:
> Thanks for your fix!
>
> I reproduced this problem by directly modifying the confirmed time of
> the neigh entry in STALE state. Your change can fix the problem in this
> scenario.
I'm glad that you have a way to test it :) Thanks!
> Just curious, why did you choose 'jiffies - MAX_JIFFY_OFFSET + 86400 * HZ'
> as the value of 'mint'?
It is rather arbitrary :) Probably just 'jiffies - MAX_JIFFY_OFFSET'
is enough, or something depending on HZ/USER_HZ. I added 1 day so the
timer can advance without leaving the confirmed time behind the
jiffies - MAX_JIFFY_OFFSET zone, but it is not needed.
The limits in play here:
- the HZ/USER_HZ difference: jiffies_to_clock_t reports the 3 times
to user space, so we want to display values as large as possible.
Any HZ > 100 with USER_HZ=100 works for jiffies - MAX_JIFFY_OFFSET.
HZ=100 does not work.
- users can set large values for the sysctl vars, which can keep the
timer running for a long time and let the confirmed time become
outdated before neigh_add_timer() is called to correct it
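
To illustrate the second limit, here is a minimal user-space sketch of how a stamp left too far behind appears to be in the future under 32-bit jiffies arithmetic, and how clamping it to mint avoids that. The sim_* names and the fixed 32-bit width are assumptions for illustration. The comparison mirrors the kernel's time_after(), but this is not the actual patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: simulate a 32-bit 'long', as on a 32-bit kernel. */
#define SIM_LONG_MAX         INT32_MAX
/* Same formula as the kernel's MAX_JIFFY_OFFSET: (LONG_MAX >> 1) - 1 */
#define SIM_MAX_JIFFY_OFFSET ((SIM_LONG_MAX >> 1) - 1)

/* Wrap-safe "a is after b", same shape as the kernel's time_after(). */
static bool sim_time_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Wrap-safe "a is before b". */
static bool sim_time_before(uint32_t a, uint32_t b)
{
    return sim_time_after(b, a);
}

/*
 * Hypothetical helper: pull a stamp that fell behind
 * mint = now - MAX_JIFFY_OFFSET back up to mint, so it cannot drift
 * more than LONG_MAX behind 'now' and flip into the future.
 */
static uint32_t sim_clamp_stamp(uint32_t stamp, uint32_t now)
{
    uint32_t mint = now - SIM_MAX_JIFFY_OFFSET;

    return sim_time_before(stamp, mint) ? mint : stamp;
}
```

A stamp that is slightly more than LONG_MAX jiffies behind now compares as "after" now, which is the failure mode discussed here. A stamp that was clamped in time stays MAX_JIFFY_OFFSET behind and compares correctly.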
If we choose mint = jiffies - MAX_JIFFY_OFFSET,
for 32-bit we will have:
           Past                     Future
++++++++++++++++++++++++++++++++++++++++++++++++++++
|   49 days   |  49 days  |        99 days         |
++++++++++++++^+++++++++++^+++++++++++++++++++++++++
              ^           ^
 DELAY+PROBE  |           |
             mint        now
- used/confirmed times should be at most 49 days behind jiffies, but
the timer can run for 49 days without correcting them,
so they can fall up to 99 days into the past before they appear
to be in the future and trigger the problem
- as we avoid the checks in neigh_timer_handler to save CPU cycles,
one needs crazy sysctl settings to keep the timer in the DELAY+PROBE
states for 49 days. With default settings, it is no more than
half a minute. In this case even
mint = jiffies - LONG_MAX + 86400 * HZ should work.
- the REACHABLE state extends while the confirmed time advances;
otherwise PROBE will need an ARP reply to recheck the
times in neigh_add_timer while entering REACHABLE again
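
The 49 and 99 day figures can be checked with a small sketch. It assumes HZ=250 and a 32-bit long, which is what makes the numbers come out this way; other HZ values scale them. jiffies_span_days() and the SIM_* macros are hypothetical names used only here.

```c
#include <stdint.h>

/* Assumption: 32-bit 'long', so LONG_MAX is 2^31 - 1. */
#define SIM_LONG_MAX         INT32_MAX
/* Same formula as the kernel's MAX_JIFFY_OFFSET: (LONG_MAX >> 1) - 1 */
#define SIM_MAX_JIFFY_OFFSET ((SIM_LONG_MAX >> 1) - 1)

/* Whole days covered by 'span' jiffies at a tick rate of 'hz' per second. */
static int64_t jiffies_span_days(int64_t span, int64_t hz)
{
    return span / hz / 86400;
}
```

With hz = 250, jiffies_span_days(SIM_MAX_JIFFY_OFFSET, 250) gives the 49-day segments and jiffies_span_days(SIM_LONG_MAX, 250) gives the 99-day limit. With hz = 1000 the same spans shrink to roughly a quarter of that.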
Regards
--
Julian Anastasov <ja@....bg>