Message-ID: <8f68ace7-e05b-ad6d-fa74-5ff8e179aec9@intel.com>
Date:   Fri, 14 Jul 2023 14:54:17 -0700
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     <babu.moger@....com>, <corbet@....net>, <tglx@...utronix.de>,
        <mingo@...hat.com>, <bp@...en8.de>
CC:     <fenghua.yu@...el.com>, <dave.hansen@...ux.intel.com>,
        <x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
        <akpm@...ux-foundation.org>, <quic_neeraju@...cinc.com>,
        <rdunlap@...radead.org>, <damien.lemoal@...nsource.wdc.com>,
        <songmuchun@...edance.com>, <peterz@...radead.org>,
        <jpoimboe@...nel.org>, <pbonzini@...hat.com>,
        <chang.seok.bae@...el.com>, <pawan.kumar.gupta@...ux.intel.com>,
        <jmattson@...gle.com>, <daniel.sneddon@...ux.intel.com>,
        <sandipan.das@....com>, <tony.luck@...el.com>,
        <james.morse@....com>, <linux-doc@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <bagasdotme@...il.com>,
        <eranian@...gle.com>, <christophe.leroy@...roup.eu>,
        <jarkko@...nel.org>, <adrian.hunter@...el.com>,
        <quic_jiles@...cinc.com>, <peternewman@...gle.com>
Subject: Re: [PATCH v5 7/8] x86/resctrl: Move default control group creation
 during mount

Hi Babu,

On 7/14/2023 9:26 AM, Moger, Babu wrote:
> Hi Reinette,
> Sorry.. Took a while to respond. I had to recreate the issue to refresh my
> memory.

No problem!

> On 7/7/23 16:46, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 6/1/2023 12:02 PM, Babu Moger wrote:


>>>  	ctx = kzalloc(sizeof(struct rdt_fs_context), GFP_KERNEL);
>>> -	if (!ctx)
>>> +	if (!ctx) {
>>> +		kernfs_destroy_root(rdt_root);
>>>  		return -ENOMEM;
>>> +	}
>>>  
>>>  	ctx->kfc.root = rdt_root;
>>>  	ctx->kfc.magic = RDTGROUP_SUPER_MAGIC;
>>> @@ -2845,6 +2860,9 @@ static void rdt_kill_sb(struct super_block *sb)
>>>  	static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
>>>  	static_branch_disable_cpuslocked(&rdt_mon_enable_key);
>>>  	static_branch_disable_cpuslocked(&rdt_enable_key);
>>> +	/* Remove the default group and cleanup the root */
>>> +	list_del(&rdtgroup_default.rdtgroup_list);
>>> +	kernfs_destroy_root(rdt_root);
>>
>> Why not just add kernfs_remove(rdtgroup_default.kn) to rmdir_all_sub()?
> 
> rdtgroup_default.rdtgroup_list is added to rdt_all_groups during mount and
> has to be removed during umount, and rdt_root is destroyed here.

I do not think default resource group management needs to be tied to
the resctrl files associated with the default resource group.

I think rdtgroup_setup_root() can be split in two: one part for all the
resctrl files, done at mount/unmount, and one part for the default
group init, done at __init.

>>>  	kernfs_kill_sb(sb);
>>>  	mutex_unlock(&rdtgroup_mutex);
>>>  	cpus_read_unlock();
>>> @@ -3598,10 +3616,8 @@ static struct kernfs_syscall_ops rdtgroup_kf_syscall_ops = {
>>>  	.show_options	= rdtgroup_show_options,
>>>  };
>>>  
>>> -static int __init rdtgroup_setup_root(void)
>>> +static int rdtgroup_setup_root(void)
>>>  {
>>> -	int ret;
>>> -
>>>  	rdt_root = kernfs_create_root(&rdtgroup_kf_syscall_ops,
>>>  				      KERNFS_ROOT_CREATE_DEACTIVATED |
>>>  				      KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK,
>>> @@ -3618,19 +3634,11 @@ static int __init rdtgroup_setup_root(void)
>>>  
>>>  	list_add(&rdtgroup_default.rdtgroup_list, &rdt_all_groups);
>>>  
>>> -	ret = rdtgroup_add_files(kernfs_root_to_node(rdt_root), RFTYPE_CTRL_BASE);
>>> -	if (ret) {
>>> -		kernfs_destroy_root(rdt_root);
>>> -		goto out;
>>> -	}
>>> -
>>>  	rdtgroup_default.kn = kernfs_root_to_node(rdt_root);
>>> -	kernfs_activate(rdtgroup_default.kn);
>>>  
>>> -out:
>>>  	mutex_unlock(&rdtgroup_mutex);
>>>  
>>> -	return ret;
>>> +	return 0;
>>>  }
>>>  
>>>  static void domain_destroy_mon_state(struct rdt_domain *d)
>>> @@ -3752,13 +3760,9 @@ int __init rdtgroup_init(void)
>>>  	seq_buf_init(&last_cmd_status, last_cmd_status_buf,
>>>  		     sizeof(last_cmd_status_buf));
>>>  
>>> -	ret = rdtgroup_setup_root();
>>> -	if (ret)
>>> -		return ret;
>>> -
>>>  	ret = sysfs_create_mount_point(fs_kobj, "resctrl");
>>>  	if (ret)
>>> -		goto cleanup_root;
>>> +		return ret;
>>>  
>>
>> It is not clear to me why this change is required, could you
>> please elaborate? It seems that all that is needed is for 
>> rdtgroup_add_files() to move to rdt_get_tree() (which you have done)
>> and then an additional call to kernfs_remove() in rmdir_all_sub().
>> I must be missing something, could you please help me understand?
>>
> 
> Yes. I started with that approach, but it has issues.
> 
> Currently, rdt_root (which is rdtgroup_default.kn) is created during
> rdtgroup_init. The root files are created at the same time, and the default
> group is added to rdt_all_groups. Basically, the root files and the
> rdtgroup_default group are always there even if the filesystem is never
> mounted. The mbm_over and cqm_limbo workqueues are also always running even
> when the filesystem is not mounted.
> 
> I moved rdtgroup_add_files() to rdt_get_tree() and added
> kernfs_remove() in rmdir_all_sub(). This caused problems:
> kernfs_remove(rdtgroup_default.kn) drops all the reference counts and
> releases the root. When we mount again, we hit the problem below.
> 
> [  404.558461] ------------[ cut here ]------------
> [  404.563631] WARNING: CPU: 35 PID: 7728 at fs/kernfs/dir.c:522
> kernfs_new_node+0x63/0x70
> 
> [  404.778793]  ? __warn+0x81/0x140
> [  404.782535]  ? kernfs_new_node+0x63/0x70
> [  404.787036]  ? report_bug+0x102/0x200
> [  404.791247]  ? handle_bug+0x3f/0x70
> [  404.795269]  ? exc_invalid_op+0x13/0x60
> [  404.799671]  ? asm_exc_invalid_op+0x16/0x20
> [  404.804461]  ? kernfs_new_node+0x63/0x70
> [  404.808954]  ? snprintf+0x49/0x70
> [  404.812762]  __kernfs_create_file+0x30/0xc0
> [  404.817534]  rdtgroup_add_files+0x6c/0x100
> 
> Basically, the kernel says that rdt_root is not initialized. That is the
> reason I had to move everything to mount time. rdt_root is created and
> initialized during mount and destroyed during umount.
> I also had to move the rdt_enable_key check to where rdt_root is created.
> 

OK, thank you for the additional details. I see now how this patch evolved.
I understand how rdt_root needs to be created/destroyed
during mount/unmount. If I understand correctly, the changes to
rdt_init_fs_context() were motivated by this line:

	ctx->kfc.root = rdt_root;

... that prompted you to move rdt_root creation there in order to have
it present for this assignment and that prompted the
rdt_enable_key check to follow. Is this correct?

I am concerned about the changes to rdt_init_fs_context() since they further
separate the resctrl file management, break the symmetry of where the key is
checked and set, and finally these new actions seem unrelated to a function
named "init_fs_context". I looked at other examples and from what I can tell
it is not required that ctx->kfc.root be initialized within
rdt_init_fs_context(). It looks like the value is only required by
kernfs_get_tree(), which is called from rdt_get_tree(). For comparison I found
cgroup_do_get_tree(). Note how cgroup_do_get_tree(), within the .get_tree
callback, initializes kernfs_fs_context.root and then calls kernfs_get_tree()?

It thus looks to me as though things can be simplified significantly
if the kernfs_fs_context.root assignment is moved from rdt_init_fs_context()
to rdt_get_tree(). rdt_get_tree() can then create rdt_root (and add all needed
files), assign it to kernfs_fs_context.root and call kernfs_get_tree().
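
To make the ordering concrete, here is a rough userspace sketch. It does not
use the real kernfs API: every type and helper below is a hypothetical mock
(the mock_ prefix marks them) that only borrows names from its kernel
counterpart, and calloc()/free() stand in for root creation and destruction.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the real kernfs/resctrl definitions. */
struct kernfs_root { int nr_files; };
struct kernfs_fs_context { struct kernfs_root *root; };
struct rdt_fs_context { struct kernfs_fs_context kfc; };

struct kernfs_root *rdt_root;

/* Mock of rdt_init_fs_context(): sets up the context, leaves root alone. */
int mock_init_fs_context(struct rdt_fs_context *ctx)
{
	ctx->kfc.root = NULL;
	return 0;
}

/* Mock of kernfs_get_tree(): the only consumer of kfc->root. */
int mock_kernfs_get_tree(struct kernfs_fs_context *kfc)
{
	return kfc->root ? 0 : -EINVAL;
}

/* Mock of rdt_get_tree(): create root, add files, assign, get tree. */
int mock_rdt_get_tree(struct rdt_fs_context *ctx)
{
	int ret;

	rdt_root = calloc(1, sizeof(*rdt_root));
	if (!rdt_root)
		return -ENOMEM;
	rdt_root->nr_files = 1;		/* stands in for rdtgroup_add_files() */

	ctx->kfc.root = rdt_root;	/* assignment moved out of init */
	ret = mock_kernfs_get_tree(&ctx->kfc);
	if (ret) {
		/* Undo root creation if getting the tree fails. */
		free(rdt_root);
		rdt_root = NULL;
		ctx->kfc.root = NULL;
	}
	return ret;
}

/* Mock of the unmount path: destroy the root created at mount. */
void mock_rdt_kill_sb(void)
{
	free(rdt_root);
	rdt_root = NULL;
}
```

In this sketch a mount/unmount/mount cycle gets a fresh root each time, and
the init step never needs rdt_root to exist.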

What do you think?

Reinette

