linux-kernel - Re: [PATCH 1/2] devcoredump: Remove devcoredump device if failing device is gone

Message-ID: <37eb0ef396054d3e74b390c6d2a29f08c8c5fd32.camel@intel.com>
Date: Mon, 29 Jan 2024 15:50:46 +0000
From: "Souza, Jose" <jose.souza@...el.com>
To: "Vivi, Rodrigo" <rodrigo.vivi@...el.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>
CC: "maarten.lankhorst@...ux.intel.com" <maarten.lankhorst@...ux.intel.com>,
	"johannes@...solutions.net" <johannes@...solutions.net>, "rafael@...nel.org"
	<rafael@...nel.org>, "gregkh@...uxfoundation.org"
	<gregkh@...uxfoundation.org>
Subject: Re: [PATCH 1/2] devcoredump: Remove devcoredump device if failing
 device is gone

On Fri, 2024-01-26 at 10:11 -0500, Rodrigo Vivi wrote:
> Make dev_coredumpm a real device-managed helper that frees the
> devcoredump device not only after a scheduled delay (DEVCD_TIMEOUT),
> but also when the failing/crashed device is gone.
> 
> Module removal for drivers using devcoredump is currently
> broken if attempted between the crash and the DEVCD_TIMEOUT, since
> the symbolic sysfs link won't be deleted.
> 
> On top of that, for PCI devices, unbinding the device will
> call the pci .remove void function, which cannot fail. At that
> time, our device is pretty much gone, but the read and free
> functions are still alive through the devcoredump device and they
> can hit NULL dereferences or use-after-free.
> 
> So, if the failing device is gone, let's also request removal of
> the devcoredump device using the same mod_delayed_work
> as when writing anything through data. A flush cannot be
> used since it is synchronous and the devcd would surely be
> gone right before the mutex_unlock on the next line.
> 
> 
> 

Reviewed-by: José Roberto de Souza <jose.souza@...el.com>
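
For anyone following the thread, here is a rough userspace sketch of the
pattern the patch relies on: a cleanup action is registered against the
failing device, runs when that device is released, and requests removal of
the dump at most once thanks to the delete_work flag. Everything prefixed
with mock_ or demo_ is a hypothetical stand-in written for illustration, not
a kernel API; the real code uses devm_add_action(), a mutex and
mod_delayed_work(), none of which are modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct devcd_entry (illustration only). */
struct mock_devcd {
	bool delete_work;	/* mirrors devcd->delete_work in the patch */
	int queued;		/* how many times removal was actually requested */
};

typedef void (*mock_action_fn)(void *data);

#define MOCK_MAX_ACTIONS 8

/* Hypothetical stand-in for the failing device and its managed actions. */
struct mock_device {
	mock_action_fn fn[MOCK_MAX_ACTIONS];
	void *data[MOCK_MAX_ACTIONS];
	size_t nr;
};

/* Register a cleanup action; returns 0 on success, nonzero on failure. */
static int mock_add_action(struct mock_device *dev, mock_action_fn fn,
			   void *data)
{
	if (dev->nr == MOCK_MAX_ACTIONS)
		return -1;
	dev->fn[dev->nr] = fn;
	dev->data[dev->nr] = data;
	dev->nr++;
	return 0;
}

/* Simulate unbind: run the registered actions, most recent first. */
static void mock_release_all(struct mock_device *dev)
{
	while (dev->nr > 0) {
		dev->nr--;
		dev->fn[dev->nr](dev->data[dev->nr]);
	}
}

/* Same shape as devcd_remove() in the patch, minus locking and workqueue. */
static void mock_devcd_remove(void *data)
{
	struct mock_devcd *devcd = data;

	if (!devcd->delete_work) {
		devcd->delete_work = true;
		devcd->queued++;	/* stands in for mod_delayed_work(..., 0) */
	}
}

/* Crash, then unbind, then a late duplicate request; returns queued count. */
static int demo_unbind_after_crash(void)
{
	struct mock_device dev = { .nr = 0 };
	struct mock_devcd devcd = { .delete_work = false, .queued = 0 };

	if (mock_add_action(&dev, mock_devcd_remove, &devcd))
		return -1;

	mock_release_all(&dev);		/* failing device goes away */
	assert(dev.nr == 0);		/* all managed actions have run */

	mock_devcd_remove(&devcd);	/* second request is a no-op */
	return devcd.queued;
}
```

Calling demo_unbind_after_crash() in this sketch reports a single removal
request even though the callback runs twice, which is the behaviour the
delete_work check is meant to guarantee.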

> Cc: Jose Souza <jose.souza@...el.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>
> Cc: Johannes Berg <johannes@...solutions.net>
> Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> Cc: Rafael J. Wysocki <rafael@...nel.org>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@...el.com>
> ---
>  drivers/base/devcoredump.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
> index 7e2d1f0d903a..678ecc2fa242 100644
> --- a/drivers/base/devcoredump.c
> +++ b/drivers/base/devcoredump.c
> @@ -304,6 +304,19 @@ static ssize_t devcd_read_from_sgtable(char *buffer, loff_t offset,
>  				  offset);
>  }
>  
> +static void devcd_remove(void *data)
> +{
> +	struct devcd_entry *devcd = data;
> +
> +	mutex_lock(&devcd->mutex);
> +	if (!devcd->delete_work) {
> +		devcd->delete_work = true;
> +		/* XXX: Cannot flush otherwise the mutex below will hit a UAF */
> +		mod_delayed_work(system_wq, &devcd->del_wk, 0);
> +	}
> +	mutex_unlock(&devcd->mutex);
> +}
> +
>  /**
>   * dev_coredumpm - create device coredump with read/free methods
>   * @dev: the struct device for the crashed device
> @@ -381,6 +394,8 @@ void dev_coredumpm(struct device *dev, struct module *owner,
>  	kobject_uevent(&devcd->devcd_dev.kobj, KOBJ_ADD);
>  	INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
>  	schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
> +	if (devm_add_action(dev, devcd_remove, devcd))
> +		dev_warn(dev, "devcoredump managed auto-removal registration failed\n");
>  	mutex_unlock(&devcd->mutex);
>  	return;
>   put_device: