Message-ID: <d080507b-a5a8-aa8f-182f-1182a5405ec0@gmail.com>
Date:   Tue, 26 Oct 2021 08:50:24 -0600
From:   David Ahern <dsahern@...il.com>
To:     James Prestwood <prestwoj@...il.com>, netdev@...r.kernel.org
Cc:     davem@...emloft.net, kuba@...nel.org, corbet@....net,
        yoshfuji@...ux-ipv6.org, dsahern@...nel.org, roopa@...dia.com,
        daniel@...earbox.net, vladimir.oltean@....com, idosch@...dia.com,
        nikolay@...dia.com, yajun.deng@...ux.dev, zhutong@...zon.com,
        johannes@...solutions.net, jouni@...eaurora.org
Subject: Re: [RESEND PATCH v7 1/3] net: arp: introduce arp_evict_nocarrier
 sysctl parameter

On 10/25/21 10:45 AM, James Prestwood wrote:
> This change introduces a new sysctl parameter, arp_evict_nocarrier.
> When set (default) the ARP cache will be cleared on a NOCARRIER event.
> This new option has been defaulted to '1' which maintains existing
> behavior.
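
As a rough illustration of how the new knob might be toggled from userspace, the sketch below writes the value through procfs. It assumes the parameter is exposed per interface under /proc/sys/net/ipv4/conf/<ifname>/, alongside the other per-device IPv4 settings; the helper names and the `wlan0` interface are hypothetical and not part of the patch.

```python
from pathlib import Path


def arp_evict_nocarrier_path(ifname: str, root: str = "/proc/sys") -> Path:
    """Build the assumed procfs path of the sysctl for one interface."""
    # Assumption: the knob sits next to the other per-interface IPv4 conf entries.
    return Path(root) / "net" / "ipv4" / "conf" / ifname / "arp_evict_nocarrier"


def set_arp_evict_nocarrier(ifname: str, enabled: bool, root: str = "/proc/sys") -> None:
    """Write 1 (evict on NOCARRIER, the default) or 0 (keep the ARP cache)."""
    arp_evict_nocarrier_path(ifname, root).write_text("1\n" if enabled else "0\n")


def get_arp_evict_nocarrier(ifname: str, root: str = "/proc/sys") -> bool:
    """Return True if the ARP cache would be cleared on NOCARRIER."""
    return arp_evict_nocarrier_path(ifname, root).read_text().strip() == "1"
```

A wireless supplicant could, for example, call `set_arp_evict_nocarrier("wlan0", False)` before roaming within the same network. Writing under /proc/sys normally requires root.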
> 
> Clearing the ARP cache on NOCARRIER is relatively new, introduced by:
> 
> commit 859bd2ef1fc1110a8031b967ee656c53a6260a76
> Author: David Ahern <dsahern@...il.com>
> Date:   Thu Oct 11 20:33:49 2018 -0700
> 
>     net: Evict neighbor entries on carrier down
> 
> The reason for this change is to prevent the ARP cache from being
> cleared when a wireless device roams. Specifically for wireless roams
> the ARP cache should not be cleared because the underlying network has not
> changed. Clearing the ARP cache in this case can introduce significant
> delays sending out packets after a roam.
> 
> A user reported such a situation here:
> 
> https://lore.kernel.org/linux-wireless/CACsRnHWa47zpx3D1oDq9JYnZWniS8yBwW1h0WAVZ6vrbwL_S0w@mail.gmail.com/
> 
> After some investigation it was found that the kernel was holding onto
> packets until ARP finished which resulted in this 1 second delay. It
> was also found that the first ARP who-has was never responded to,
> which is actually what causes the delay. This change is more or less
> working around this behavior, but again, there is no reason to clear
> the cache on a roam anyways.
> 
> As for the unanswered who-has, we know the packet made it OTA since
> it was seen while monitoring. Why it never received a response is
> unknown. In any case, since this is a problem on the AP side of things
> all that can be done is to work around it until it is solved.
> 
> Some background on testing/reproducing the packet delay:
> 
> Hardware:
>  - 2 access points configured for Fast BSS Transition (Though I don't
>    see why regular reassociation wouldn't have the same behavior)
>  - Wireless station running IWD as supplicant
>  - A device on network able to respond to pings (I used one of the APs)
> 
> Procedure:
>  - Connect to first AP
>  - Ping once to establish an ARP entry
>  - Start a tcpdump
>  - Roam to second AP
>  - Wait for operstate UP event, and note the timestamp
>  - Start pinging
> 
> Results:
> 
> Below is the tcpdump after UP. The interface was recorded going UP at
> 10:42:01.432875.
> 
> 10:42:01.461871 ARP, Request who-has 192.168.254.1 tell 192.168.254.71, length 28
> 10:42:02.497976 ARP, Request who-has 192.168.254.1 tell 192.168.254.71, length 28
> 10:42:02.507162 ARP, Reply 192.168.254.1 is-at ac:86:74:55:b0:20, length 46
> 10:42:02.507185 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 1, length 64
> 10:42:02.507205 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 2, length 64
> 10:42:02.507212 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 3, length 64
> 10:42:02.507219 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 4, length 64
> 10:42:02.507225 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 5, length 64
> 10:42:02.507232 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 6, length 64
> 10:42:02.515373 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 1, length 64
> 10:42:02.521399 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 2, length 64
> 10:42:02.521612 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 3, length 64
> 10:42:02.521941 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 4, length 64
> 10:42:02.522419 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 5, length 64
> 10:42:02.523085 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 6, length 64
> 
> You can see the first ARP who-has went out very quickly after UP, but
> was never responded to. Nearly a second later the kernel retries and
> gets a response. Only then do the ping packets go out. If an ARP entry
> is manually added prior to UP (after the cache is cleared) it is seen
> that the first ping is never responded to, so it's not only an issue with
> ARP but with data packets in general.
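
To put numbers on the delay described above, the following sketch computes the intervals directly from the timestamps in the capture. The variable and function names are illustrative only; the timestamps are copied from the tcpdump output and the recorded UP time.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two tcpdump-style timestamps on the same day."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# Timestamps copied from the capture above.
link_up = "10:42:01.432875"     # interface reported UP
first_arp = "10:42:01.461871"   # first who-has, never answered
second_arp = "10:42:02.497976"  # kernel retry, answered
first_ping = "10:42:02.507185"  # first ICMP echo request leaves

up_to_first_arp = seconds_between(link_up, first_arp)
arp_retry_gap = seconds_between(first_arp, second_arp)
up_to_first_ping = seconds_between(link_up, first_ping)

print(f"UP -> first ARP request:   {up_to_first_arp:.3f} s")
print(f"first ARP -> retry:        {arp_retry_gap:.3f} s")
print(f"UP -> first ping sent:     {up_to_first_ping:.3f} s")
```

The first request leaves about 29 ms after UP, the retry follows about 1.036 s later, and the first ping is only sent about 1.074 s after UP, which matches the roughly one second delay reported.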
> 
> As mentioned earlier, the wireless interface was also monitored, which
> confirmed that the ping/ARP packets made it OTA.
> 
> Signed-off-by: James Prestwood <prestwoj@...il.com>
> ---
>  Documentation/networking/ip-sysctl.rst |  9 +++++++++
>  include/linux/inetdevice.h             |  2 ++
>  include/uapi/linux/ip.h                |  1 +
>  include/uapi/linux/sysctl.h            |  1 +
>  net/ipv4/arp.c                         | 11 ++++++++++-
>  net/ipv4/devinet.c                     |  4 ++++
>  6 files changed, 27 insertions(+), 1 deletion(-)
> 

Reviewed-by: David Ahern <dsahern@...nel.org>
