Message-ID: <a0734a8e-5681-4fd1-8cf0-bcb63a43f897@lunn.ch>
Date: Tue, 19 Sep 2023 14:36:58 +0200
From: Andrew Lunn <andrew@...n.ch>
To: Johannes Berg <johannes@...solutions.net>
Cc: netdev@...r.kernel.org, linux-wireless@...r.kernel.org
Subject: Re: netif_carrier_on() race

> All of this makes sense since you need to hold RTNL for all those state
> changes/notifier chains, but it does lead to the first race/consistency
> problem: if you query at just the right time you can see carrier being
> on, however, if the carrier is actually removed again and the linkwatch
> work didn't run yet, there might never be an event for the carrier on,
> iow, you might have:
> 
>  netif_carrier_on()
>  query from userspace and see carrier on
>  netif_carrier_off()
>  linkwatch work runs and sends only carrier off event
> 
> This is at least a bit confusing, but not really my main problem here,
> though it did in fact happen to me as well, in a fashion.

That is interesting. Copper Ethernet PHYs might have the opposite
problem. The status bit about link is latching low. This means that if
the link is lost and then very quickly comes back again, you always
get to see the lost and then restored link. So maybe we have:

  netif_carrier_off()
  query from userspace and see carrier off
  netif_carrier_on()
  linkwatch work runs and sends only carrier on event

???
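
To make the latching-low behaviour concrete, here is a minimal userspace
model of such a status bit. It is only a sketch: struct phy_model and both
helpers are hypothetical names for illustration, not phylib or driver API,
and it assumes the latch is cleared by reading the status.

```c
#include <stdbool.h>

/* Hypothetical model of a latching-low link status bit (not kernel API). */
struct phy_model {
	bool link_now;    /* instantaneous link state */
	bool latched_low; /* set when the link drops, cleared by a status read */
};

/* Record a link change; a drop is remembered until the next read. */
static void phy_model_set_link(struct phy_model *phy, bool up)
{
	if (!up)
		phy->latched_low = true;
	phy->link_now = up;
}

/*
 * Read the link bit. If the link dropped since the previous read, this
 * read reports "down" even if the link is already back, and clears the
 * latch so the following read reports the current state.
 */
static bool phy_model_read_link(struct phy_model *phy)
{
	bool status = phy->link_now && !phy->latched_low;

	phy->latched_low = false;
	return status;
}
```

With this model, a quick down/up flap between two reads is never lost: the
first read after the flap returns down, the second returns up.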

Maybe we want linkwatch to keep the old and the new state. If there is
a change reported while there is still work queued, which flips a bit
back to its old state, it needs to block until the work is actually
done?
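
A rough userspace sketch of that idea, to show the intended event ordering.
Everything here is hypothetical (struct lw_model and its helpers are not the
real linkwatch code), and "blocking until the work is done" is modelled by
simply running the queued work synchronously. It also ignores whether the
real callers would be allowed to sleep at that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LW_MAX_EVENTS 8

/* Hypothetical model of linkwatch keeping both old and new state. */
struct lw_model {
	bool reported;      /* old state: last state announced in an event */
	bool current;       /* new state: latest state set by the driver */
	bool work_pending;  /* deferred work queued but not yet run */
	bool events[LW_MAX_EVENTS]; /* emitted events: true = on, false = off */
	size_t nr_events;
};

/* Deferred work: emit one event if the state differs from what was reported. */
static void lw_model_run_work(struct lw_model *lw)
{
	if (!lw->work_pending)
		return;
	lw->work_pending = false;
	if (lw->current != lw->reported) {
		assert(lw->nr_events < LW_MAX_EVENTS);
		lw->events[lw->nr_events++] = lw->current;
		lw->reported = lw->current;
	}
}

/* Driver-side state change, standing in for carrier on/off calls. */
static void lw_model_set_carrier(struct lw_model *lw, bool up)
{
	if (up == lw->current)
		return;
	/*
	 * The change would flip the state back to the old (reported) value
	 * while work is still queued: wait for that work first, so the
	 * intermediate state gets its own event.
	 */
	if (lw->work_pending && up == lw->reported)
		lw_model_run_work(lw);
	lw->current = up;
	lw->work_pending = true;
}
```

In this model, carrier on followed by carrier off before the work runs
produces an on event and then an off event, instead of only the off event.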

> Possible solution #2:
> ---------------------
> Another - more feasible - option might be to say OK, so the associated
> event (and a few friends) are the problem, so we can queue those events
> in cfg80211, and only release them on NETDEV_CHANGE notifier call.
> That's from netdev_state_change() after dev_activate() in linkwatch
> work. We'd want to pair it with netif_carrier_on() so we actually expect
> the event to come, and maybe give netif_carrier_on() a return value
> indicating that it scheduled - or else check using carrier_up_count
> perhaps?

Probably not an issue with 802.11, but sometimes drivers do odd things
with the carrier because of the BMC. The BMC can have a side channel
into the host's NIC, which allows it to share the host's PHY and RJ45
socket. So that the BMC can send/receive frames while the host NIC is
admin down, the carrier might actually stay up, and requests to take
the carrier down are ignored.

As I said, probably irrelevant to 802.11, but if you try to make a
generic solution, you might need to watch out for this.

	Andrew
