Message-ID: <053d8a62-022b-4bf8-8e47-651e7c3a2d59@intel.com>
Date: Thu, 6 Mar 2025 20:45:31 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: James Morse <james.morse@....com>, <x86@...nel.org>,
	<linux-kernel@...r.kernel.org>
CC: Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
	Borislav Petkov <bp@...en8.de>, H Peter Anvin <hpa@...or.com>, Babu Moger
	<Babu.Moger@....com>, <shameerali.kolothum.thodi@...wei.com>, "D Scott
 Phillips OS" <scott@...amperecomputing.com>, <carl@...amperecomputing.com>,
	<lcherian@...vell.com>, <bobo.shaobowang@...wei.com>,
	<tan.shaopeng@...itsu.com>, <baolin.wang@...ux.alibaba.com>, Jamie Iles
	<quic_jiles@...cinc.com>, Xin Hao <xhao@...ux.alibaba.com>,
	<peternewman@...gle.com>, <dfustini@...libre.com>, <amitsinght@...vell.com>,
	David Hildenbrand <david@...hat.com>, Rex Nie <rex.nie@...uarmicro.com>,
	"Dave Martin" <dave.martin@....com>, Koba Ko <kobak@...dia.com>, Shanker
 Donthineni <sdonthineni@...dia.com>, <fenghuay@...dia.com>, Shaopeng Tan
	<tan.shaopeng@...fujitsu.com>, Tony Luck <tony.luck@...el.com>
Subject: Re: [PATCH v7 33/49] x86/resctrl: resctrl_exit() teardown resctrl but
 leave the mount point

Hi James,

On 2/28/25 11:58 AM, James Morse wrote:
> resctrl_exit() was intended for use when the 'resctrl' module was unloaded.
> resctrl can't be built as a module, and the kernfs helpers are not exported
> so this is unlikely to change. MPAM has an error interrupt which indicates
> the MPAM driver has gone haywire. Should this occur tasks could run with
> the wrong control values, leading to bad performance for important tasks.
> The MPAM driver needs a way to tell resctrl that no further configuration
> should be attempted.
> 
> Using resctrl_exit() for this leaves the system in a funny state as
> resctrl is still mounted, but cannot be un-mounted because the sysfs
> directory that is typically used has been removed. Dave Martin suggests
> this may cause systemd trouble in the future as not all filesystems
> can be unmounted.
> 
> Add calls to remove all the files and directories in resctrl, and
> remove the sysfs_remove_mount_point() call that leaves the system
> in a funny state. When triggered, this causes all the resctrl files
> to disappear. resctrl can be unmounted, but not mounted again.
> 
> Signed-off-by: James Morse <james.morse@....com>
> Tested-by: Carl Worth <carl@...amperecomputing.com> # arm64
> Tested-by: Shaopeng Tan <tan.shaopeng@...fujitsu.com>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@...fujitsu.com>
> Reviewed-by: Tony Luck <tony.luck@...el.com>
> ---
> Changes since v6:
>  * Added kdoc and comment to resctrl_exit().
> 
> Changes since v5:
>  * Serialise rdtgroup_destroy_root() against umount().
>  * Check rdtgroup_default.kn to protect against duplicate calls.
> ---
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 25 ++++++++++++++++++++++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 2f34b7215679..0d74a6d98dba 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -4094,8 +4094,12 @@ static int rdtgroup_setup_root(struct rdt_fs_context *ctx)
>  
>  static void rdtgroup_destroy_root(void)
>  {
> -	kernfs_destroy_root(rdt_root);
> -	rdtgroup_default.kn = NULL;
> +	lockdep_assert_held(&rdtgroup_mutex);
> +
> +	if (rdtgroup_default.kn) {
> +		kernfs_destroy_root(rdt_root);
> +		rdtgroup_default.kn = NULL;
> +	}
>  }
>  
>  static void __init rdtgroup_setup_default(void)
> @@ -4387,11 +4391,26 @@ int __init resctrl_init(void)
>  	return ret;
>  }
>  
> +/**
> + * resctrl_exit() - Remove the resctrl filesystem and free resources.
> + *
> + * Called by the architecture code in response to a fatal error.
> + * Resctrl files and structures are removed from kernfs to prevent further
> + * configuration.
> + */
>  void __exit resctrl_exit(void)
>  {
> +	mutex_lock(&rdtgroup_mutex);
> +	rdtgroup_destroy_root();
> +	mutex_unlock(&rdtgroup_mutex);
> +
>  	debugfs_remove_recursive(debugfs_resctrl);
>  	unregister_filesystem(&rdt_fs_type);
> -	sysfs_remove_mount_point(fs_kobj, "resctrl");
> +
> +	/*
> +	 * The sysfs mount point added by resctrl_init() is not removed so that
> +	 * it can be used to umount resctrl.
> +	 */
>  
>  	resctrl_mon_resource_exit();
>  }
(copying v6 discussion here)

On 3/6/25 11:28 AM, James Morse wrote:
> On 01/03/2025 02:35, Reinette Chatre wrote:
>> On 2/28/25 11:54 AM, James Morse wrote:
>>> On 20/02/2025 04:42, Reinette Chatre wrote:

>>>> It is difficult for me to follow the kernfs reference counting required
>>>> to make this work. Specifically, the root kn is "destroyed" here but it
>>>> is required to stick around until unmount when the rest of the files
>>>> are removed.
>>>
>>> This drops resctrl's reference to all of the files, which would make the files disappear.
>>> unmount is what calls kernfs_kill_sb(), which gets rid of the root of the filesystem.
>>
>> My concern is mostly with the kernfs_remove() calls in the rdt_kill_sb()->rmdir_all_sub()
>> flow. For example:
>> 	kernfs_remove(kn_info);
>> 	kernfs_remove(kn_mongrp);
>> 	kernfs_remove(kn_mondata);
>>
>> As I understand the above require the destroyed root to still be around.
> 
> Right - because rdt_get_tree() has these global pointers into the hierarchy, but doesn't
> take a reference. rmdir_all_sub() relies on always being called before
> rdtgroup_destroy_root().

Is this a known issue then? Since I am not able to use your test, and expecting no response
to my comment, I created a new test of my own. It does indeed trigger this on unmount:

[  293.707228] BUG: KASAN: slab-use-after-free in kernfs_remove+0x87/0xa0
[  293.714718] Read of size 8 at addr ff11000309d88f30 by task umount/3793
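
To make the failure mode concrete, here is a minimal userspace sketch (not kernel code)
of pointers cached into a hierarchy without taking a reference. All names in it (struct
node, model_*) are hypothetical stand-ins for the kernfs nodes and resctrl globals. It
assumes that destroying the root releases everything below it, and it relies on
kernfs_remove() treating a NULL node as a no-op. In this model, clearing the cached
pointers in the destroy path is what keeps the later removal from touching freed memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernfs node; not the kernel's structure. */
struct node {
	int id;
};

/* Cached pointers into the hierarchy, held without taking a reference. */
static struct node *model_root;
static struct node *model_kn_info;
static int model_remove_calls;	/* removals that touched a live node */

/* Modelled on kernfs_remove(): a NULL node is a no-op. */
static void model_remove(struct node *kn)
{
	if (!kn)
		return;
	model_remove_calls++;
	free(kn);
}

static int model_mount(void)
{
	model_root = malloc(sizeof(*model_root));
	model_kn_info = malloc(sizeof(*model_kn_info));
	if (!model_root || !model_kn_info) {
		free(model_root);
		free(model_kn_info);
		model_root = NULL;
		model_kn_info = NULL;
		return -1;
	}
	return 0;
}

/* Destroying the root releases every node below it (assumption of the model). */
static void model_destroy_root(void)
{
	if (!model_root)
		return;		/* duplicate call, nothing left to do */
	free(model_kn_info);
	free(model_root);
	/* Clear the cached pointers so later users see NULL, not freed memory. */
	model_kn_info = NULL;
	model_root = NULL;
}

/* Modelled on the unmount path that removes the cached nodes. */
static void model_rmdir_all_sub(void)
{
	model_remove(model_kn_info);
	model_kn_info = NULL;
}

/* Error path first, unmount second; returns the number of live removals. */
static int model_demo(void)
{
	if (model_mount())
		return -1;
	model_destroy_root();
	model_rmdir_all_sub();		/* sees NULL, so it does nothing */
	model_destroy_root();		/* second call is harmless */
	assert(model_kn_info == NULL);
	return model_remove_calls;
}
```

Without the two NULL assignments in model_destroy_root(), model_rmdir_all_sub() would
pass a freed pointer to model_remove(), which is the pattern the KASAN report points at.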

> 
> The point hack would be for rdtgroup_destroy_root() to NULL out those global pointers, (I
> note they are left dangling) - that would make a subsequent call to rmdir_all_sub() harmless.
> 
> A better fix would be to pull out all the filesystem relevant parts from rdt_kill_sb(),
> make that safe for multiple calls and get resctrl_exit() to call that.
> A call to rdt_kill_sb() after resctrl_exit() would just cleanup the super-block.
> This will leave things in a more predictable state.

Why just the filesystem relevant parts? That said, your statement that a later call to
rdt_kill_sb() "would just cleanup the super-block" sounds like you are thinking about
pulling out all of the reset work. That sounds reasonable to me. It feels more appropriate
to do a proper cleanup than to just wipe the root while leaving everything underneath it.
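
As a rough illustration of that shape, here is a userspace sketch (again not kernel code)
in which one teardown helper is shared by the error path and the unmount path and is safe
to call twice. Every name in it (sketch_*) is hypothetical, the lock is modelled with a
plain flag rather than a real mutex, and which work actually belongs in the shared helper
is left open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state; none of these names exist in the kernel. */
static bool sketch_mutex_held;
static bool sketch_fs_live;		/* files and groups still present */
static bool sketch_sb_live;		/* super-block still present */
static int sketch_teardown_runs;	/* times the real work was done */

static void sketch_lock(void)
{
	assert(!sketch_mutex_held);
	sketch_mutex_held = true;
}

static void sketch_unlock(void)
{
	assert(sketch_mutex_held);
	sketch_mutex_held = false;
}

/* Shared teardown: caller holds the lock, repeat calls do nothing. */
static void sketch_fs_teardown(void)
{
	assert(sketch_mutex_held);	/* stands in for lockdep_assert_held() */
	if (!sketch_fs_live)
		return;
	sketch_teardown_runs++;
	sketch_fs_live = false;
}

static void sketch_mount(void)
{
	sketch_fs_live = true;
	sketch_sb_live = true;
}

/* Error path, modelled on resctrl_exit(): full teardown, mount stays. */
static void sketch_exit(void)
{
	sketch_lock();
	sketch_fs_teardown();
	sketch_unlock();
}

/* Unmount path, modelled on rdt_kill_sb(): teardown if needed, then the sb. */
static void sketch_kill_sb(void)
{
	sketch_lock();
	sketch_fs_teardown();
	sketch_unlock();
	sketch_sb_live = false;
}
```

Calling sketch_mount(), then sketch_exit(), then sketch_kill_sb() does the teardown work
once, and the second path only has the super-block left to clean up.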

Reinette