Message-ID: <65vs7a63onl37a7q7vjxo7wgmgkdcixkittcrirdje2e6qmkkj@syujqrygyvcd>
Date: Mon, 1 Dec 2025 03:35:04 -0800
From: Breno Leitao <leitao@...ian.org>
To: Andre Carvalho <asantostc@...il.com>
Cc: Andrew Lunn <andrew+netdev@...n.ch>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Shuah Khan <shuah@...nel.org>, 
	Simon Horman <horms@...nel.org>, netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	linux-kselftest@...r.kernel.org
Subject: Re: [PATCH net-next v8 4/5] netconsole: resume previously
 deactivated target

Hello Andre,

On Fri, Nov 28, 2025 at 10:08:03PM +0000, Andre Carvalho wrote:
> @@ -242,6 +249,75 @@ static void populate_configfs_item(struct netconsole_target *nt,
>  }
>  #endif	/* CONFIG_NETCONSOLE_DYNAMIC */
>  
> +/* Check if the target was bound by mac address. */
> +static bool bound_by_mac(struct netconsole_target *nt)
> +{
> +	return is_valid_ether_addr(nt->np.dev_mac);
> +}
> +
> +/* Attempts to resume logging to a deactivated target. */
> +static void resume_target(struct netconsole_target *nt)
> +{
> +	int ret;
> +
> +	/* check if target is still deactivated as it may have been disabled
> +	 * while resume was being scheduled.
> +	 */
This only happens if this is a dynamic target and someone is toggling
the device (or even removing it, which would cause a crash I _think_).

Since you are completely lockless here, there is also a chance of
hitting a TOCTOU race.

I think you want to have dynamic_netconsole_mutex held during the
operation of process_resume_target().

  * mutex_lock(&dynamic_netconsole_mutex);
  * remove from the list
  * resume
  * re-add to the list
  * mutex_unlock(&dynamic_netconsole_mutex);
  

netconsole design has two locks:
  * target_list_lock, which protects devices getting disabled by netdev
    notifications
  * dynamic_netconsole_mutex, which protects anyone disabling and
    removing the target from configfs
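
A minimal userspace sketch of the ordering suggested above, not kernel
code. Assumptions: both locks are modelled as pthread mutexes (in the
kernel target_list_lock is a spinlock taken with irqsave), `struct target`,
`on_list` and `fake_setup()` are hypothetical stand-ins for
struct netconsole_target, target_list membership and netpoll_setup().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-ins (assumptions): pthread mutexes replace the kernel locks. */
static pthread_mutex_t dynamic_netconsole_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t target_list_lock = PTHREAD_MUTEX_INITIALIZER;

enum target_state { STATE_DISABLED, STATE_ENABLED, STATE_DEACTIVATED };

/* Hypothetical, reduced version of struct netconsole_target. */
struct target {
	enum target_state state;
	bool on_list;	/* models membership in target_list */
};

/* Hypothetical setup hook; returns 0 on success, like netpoll_setup(). */
static int fake_setup(struct target *nt)
{
	(void)nt;
	return 0;
}

/* Caller must hold dynamic_netconsole_mutex. */
static void resume_target(struct target *nt)
{
	/* The state cannot change under us while the mutex is held. */
	if (nt->state != STATE_DEACTIVATED)
		return;

	nt->state = fake_setup(nt) ? STATE_DISABLED : STATE_ENABLED;
}

static void process_resume_target(struct target *nt)
{
	/* Outer lock: serializes against configfs disable/remove. */
	pthread_mutex_lock(&dynamic_netconsole_mutex);

	/* Inner lock: only held while touching the list. */
	pthread_mutex_lock(&target_list_lock);
	nt->on_list = false;
	pthread_mutex_unlock(&target_list_lock);

	/* Resume runs off-list, without the list lock held. */
	assert(!nt->on_list);
	resume_target(nt);

	pthread_mutex_lock(&target_list_lock);
	nt->on_list = true;
	pthread_mutex_unlock(&target_list_lock);

	pthread_mutex_unlock(&dynamic_netconsole_mutex);
}
```

The point of the sketch is only the nesting: the mutex covers the whole
remove/resume/re-add sequence, while the list lock is taken twice and
never held across the setup call.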

> +	if (nt->state != STATE_DEACTIVATED)
> +		return;
> +
> +	if (bound_by_mac(nt))
> +		/* ensure netpoll_setup will retrieve device by mac */
> +		memset(&nt->np.dev_name, 0, IFNAMSIZ);

This is a clean-up step that was missing when the target is going
down, and this is just a workaround that doesn't belong here.

Please move it to netconsole_process_cleanups_core(), in a separate
patch.

Something like:

	list_for_each_entry_safe(nt, tmp, &target_cleanup_list, list) {
		do_netpoll_cleanup(&nt->np);
		if (bound_by_mac(nt))
			memset(&nt->np.dev_name, 0, IFNAMSIZ);
	}

Ideally this should belong to do_netpoll_cleanup(), but let's keep it in
netconsole_process_cleanups_core() for three reasons:


1) Binding by MAC is a netconsole concept
2) do_netpoll_cleanup() is only used by netconsole, and I plan to move
   it back to netconsole. Some PoC in [1]
3) bound_by_mac() should be in netconsole and we do not want to export
   it.
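
To make the intent of that clean-up step concrete, here is a small
userspace sketch, not the kernel implementation. Assumptions:
`struct fake_netpoll`, `valid_ether_addr()` and `cleanup_target()` are
hypothetical stand-ins for struct netpoll, is_valid_ether_addr() and the
per-target body of netconsole_process_cleanups_core(); IFNAMSIZ and
ETH_ALEN are redefined locally with their usual values.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define IFNAMSIZ 16
#define ETH_ALEN 6

/* Hypothetical, reduced version of struct netpoll. */
struct fake_netpoll {
	char dev_name[IFNAMSIZ];
	unsigned char dev_mac[ETH_ALEN];
};

/* Userspace approximation of is_valid_ether_addr():
 * valid means neither multicast (low bit of first octet) nor all-zero.
 */
static bool valid_ether_addr(const unsigned char *addr)
{
	static const unsigned char zero[ETH_ALEN];

	return !(addr[0] & 1) && memcmp(addr, zero, ETH_ALEN) != 0;
}

/* A target is bound by MAC when a valid MAC was configured. */
static bool bound_by_mac(const struct fake_netpoll *np)
{
	return valid_ether_addr(np->dev_mac);
}

/* Per-target clean-up: do_netpoll_cleanup() would run first in the kernel. */
static void cleanup_target(struct fake_netpoll *np)
{
	/* Forget the stale name so a later setup looks the device up by MAC. */
	if (bound_by_mac(np))
		memset(np->dev_name, 0, IFNAMSIZ);
}
```

With the name cleared at clean-up time, the resume path no longer needs
its own memset() before calling netpoll_setup().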

[1]:
https://lore.kernel.org/all/20250902-netpoll_untangle_v3-v1-3-51a03d6411be@debian.org/

> +
> +	ret = netpoll_setup(&nt->np);
> +	if (ret) {
> +		/* netpoll fails setup once, do not try again. */
> +		nt->state = STATE_DISABLED;
> +		return;
> +	}
> +
> +	nt->state = STATE_ENABLED;
> +	pr_info("network logging resumed on interface %s\n", nt->np.dev_name);
> +}
> +
> +/* Checks if a deactivated target matches a device. */
> +static bool deactivated_target_match(struct netconsole_target *nt,
> +				     struct net_device *ndev)
> +{
> +	if (nt->state != STATE_DEACTIVATED)
> +		return false;
> +
> +	if (bound_by_mac(nt))
> +		return !memcmp(nt->np.dev_mac, ndev->dev_addr, ETH_ALEN);
> +	return !strncmp(nt->np.dev_name, ndev->name, IFNAMSIZ);
> +}
> +
> +/* Process work scheduled for target resume. */
> +static void process_resume_target(struct work_struct *work)
> +{
> +	struct netconsole_target *nt =
> +		container_of(work, struct netconsole_target, resume_wq);
> +	unsigned long flags;
> +

mutex_lock(&dynamic_netconsole_mutex);
As discussed above

> +	/* resume_target is IRQ unsafe, remove target from
> +	 * target_list in order to resume it with IRQ enabled.
> +	 */
> +	spin_lock_irqsave(&target_list_lock, flags);
> +	list_del_init(&nt->list);
> +	spin_unlock_irqrestore(&target_list_lock, flags);
> +
> +	resume_target(nt);
> +
> +	/* At this point the target is either enabled or disabled and
> +	 * was cleaned up before getting deactivated. Either way, add it
> +	 * back to target list.
> +	 */
> +	spin_lock_irqsave(&target_list_lock, flags);
> +	list_add(&nt->list, &target_list);
> +	spin_unlock_irqrestore(&target_list_lock, flags);

mutex_unlock(&dynamic_netconsole_mutex);

> +}
> +
>  /* Allocate and initialize with defaults.
>   * Note that these targets get their config_item fields zeroed-out.
>   */
> @@ -264,6 +340,7 @@ static struct netconsole_target *alloc_and_init(void)
>  	nt->np.remote_port = 6666;
>  	eth_broadcast_addr(nt->np.remote_mac);
>  	nt->state = STATE_DISABLED;
> +	INIT_WORK(&nt->resume_wq, process_resume_target);

It needs to be initialized earlier, before the kzalloc(); otherwise we
might hit a problem similar to the one fixed by e5235eb6cfe0 ("net:
netpoll: initialize work queue before error checks")

The code path would be:
  * alloc_param_target()
	  * alloc_and_init()
		  * kzalloc() fails and returns NULL.
		  * resume_wq is still not initialized
  fail:
	* free_param_target()
		* cancel_work_sync(&nt->resume_wq); and resume_wq is not
		  initialized

Thanks for the patch,
--breno

--
pw-bot: cr
