Message-ID: <b1d8722e-5660-c38e-848f-3220d642889d@huawei.com>
Date: Sun, 29 Jan 2023 11:08:38 +0800
From: Zhang Changzhong <zhangchangzhong@...wei.com>
To: Network Development <netdev@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>
CC: Julian Anastasov <ja@....bg>,
"David S. Miller" <davem@...emloft.net>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
David Ahern <dsahern@...nel.org>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
"Denis V. Lunev" <den@...nvz.org>,
Nikolay Aleksandrov <razor@...ckwall.org>,
Daniel Borkmann <daniel@...earbox.net>,
Zhang Changzhong <zhangchangzhong@...wei.com>,
YueHaibing <yuehaibing@...wei.com>
Subject: [Question] neighbor entry doesn't switch to the STALE state after the
reachable timer expires
Hi,
We got the following weird neighbor cache entry on a machine that's been running for over a year:
172.16.1.18 dev bond0 lladdr 0a:0e:0f:01:12:01 ref 1 used 350521/15994171/350520 probes 4 REACHABLE
350520 seconds have elapsed since this entry was last updated, but it is still in the REACHABLE
state (base_reachable_time_ms is 30000), preventing lladdr from being updated through probe.
After some analysis, we found a scenario that may cause such a neighbor entry:
           Entry used             DELAY_PROBE_TIME expired
NUD_STALE ------------> NUD_DELAY ------------------------> NUD_PROBE
                            |
                            | DELAY_PROBE_TIME not expired
                            v
                      NUD_REACHABLE
neigh_timer_handler() uses time_before_eq() to compare 'now' with 'neigh->confirmed +
NEIGH_VAR(neigh->parms, DELAY_PROBE_TIME)', but time_before_eq() only works if delta < ULONG_MAX/2.
This means that if an entry stays in the NUD_STALE state for more than ULONG_MAX/2 ticks, it enters
the NUD_REACHABLE state directly when it is used again and cannot switch back to the NUD_STALE state
(the timer is set too far in the future).
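
To illustrate the wraparound described above, here is a minimal user-space sketch, not the kernel
code itself. It models time_before_eq() with 32-bit types (int32_t stands in for 'long' on a 32-bit
machine). The helper name looks_recently_confirmed() and the tick values in the usage note are
assumptions made only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * 32-bit model of time_before_eq(a, b): true if a is at or before b.
 * Like the kernel macro, it looks at the sign of the wrapped difference,
 * so the answer is only meaningful while |a - b| < 2^31 ticks.
 * (The unsigned-to-signed cast relies on the usual two's-complement
 * behaviour of common compilers.)
 */
static bool time_before_eq32(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) >= 0;
}

/*
 * Hypothetical helper mirroring the NUD_DELAY check: does 'now' appear to
 * fall within delay_probe_time ticks of the last confirmation?
 */
static bool looks_recently_confirmed(uint32_t now, uint32_t confirmed,
                                     uint32_t delay_probe_time)
{
    return time_before_eq32(now, confirmed + delay_probe_time);
}
```

With confirmed = 1000 and delay_probe_time = 1250 (5 seconds at HZ=250), the helper returns false
for now = 2251 as expected, but returns true again for now = 1000 + 0x80000000 + 2000, i.e. once
more than ULONG_MAX/2 ticks have passed since the last confirmation.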
On 64-bit machines, ULONG_MAX/2 ticks is an extremely long time, but in my case (32-bit machine and
kernel compiled with CONFIG_HZ=250), ULONG_MAX/2 ticks is about 99.42 days, which is possible in
reality.
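
The 99.42-day figure can be checked with a few lines of C. This is only a sketch of the arithmetic,
assuming a 32-bit 'unsigned long'; half_wrap_days() is a name made up for this example.

```c
#include <stdint.h>

/*
 * Days it takes a 32-bit tick counter to advance by ULONG_MAX/2 ticks
 * at the given tick rate (hz ticks per second, hz must be non-zero).
 */
static double half_wrap_days(uint32_t hz)
{
    const double half_range_ticks = (double)(UINT32_MAX / 2); /* 2147483647 */
    const double seconds_per_day = 86400.0;

    return half_range_ticks / (double)hz / seconds_per_day;
}
```

half_wrap_days(250) comes out at roughly 99.42 days; the same formula gives roughly 24.86 days for
HZ=1000, so a higher tick rate shortens the window.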
Does anyone have a good idea to solve this problem? Or are there other scenarios that might cause
such a neighbor entry?
-----
Best Regards,
Changzhong Zhang