lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z7dN2KpsQjVUb3KR@e133380.arm.com>
Date: Thu, 20 Feb 2025 15:44:24 +0000
From: Dave Martin <Dave.Martin@....com>
To: "Moger, Babu" <babu.moger@....com>
Cc: corbet@....net, reinette.chatre@...el.com, tglx@...utronix.de,
	mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
	tony.luck@...el.com, peternewman@...gle.com, x86@...nel.org,
	hpa@...or.com, paulmck@...nel.org, akpm@...ux-foundation.org,
	thuth@...hat.com, rostedt@...dmis.org, xiongwei.song@...driver.com,
	pawan.kumar.gupta@...ux.intel.com, daniel.sneddon@...ux.intel.com,
	jpoimboe@...nel.org, perry.yuan@....com, sandipan.das@....com,
	kai.huang@...el.com, xiaoyao.li@...el.com, seanjc@...gle.com,
	xin3.li@...el.com, andrew.cooper3@...rix.com, ebiggers@...gle.com,
	mario.limonciello@....com, james.morse@....com,
	tan.shaopeng@...itsu.com, linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org, maciej.wieczor-retman@...el.com,
	eranian@...gle.com
Subject: Re: [PATCH v11 22/23] x86/resctrl: Introduce interface to list
 assignment states of all the groups

Hi,

On Wed, Feb 19, 2025 at 03:09:51PM -0600, Moger, Babu wrote:
> Hi Dave,
> 
> On 2/19/25 07:53, Dave Martin wrote:
> > On Wed, Jan 22, 2025 at 02:20:30PM -0600, Babu Moger wrote:
> >> Provide the interface to list the assignment states of all the resctrl
> >> groups in mbm_cntr_assign mode.

[...]

> >> +static int resctrl_mbm_assign_control_show(struct kernfs_open_file *of,
> >> +					   struct seq_file *s, void *v)
> >> +{

[...]

> > Unlike the other resctrl files, it looks like the total size of this
> > data will scale up with the number of existing monitoring groups
> > and the lengths of the group names (in addition to the number of
> > monitoring domains).
> > 
> > So, this can easily be more than a page, overflowing internal limits
> > in the seq_file and kernfs code.
> > 
> > Do we need to track some state between read() calls?  This can be done
> > by overriding the kernfs .open() and .release() methods and hanging
> > some state data (or an rdtgroup_file pointer) on of->priv.
> > 
> > Also, if we allow the data to be read out in chunks, then we would
> > either have to snapshot all the data in one go and stash the unread
> > tail in the kernel, or we would need to move over to RCU-based
> > enumeration or similar -- otherwise releasing rdtgroup_mutex in the
> > middle of the enumeration in order to return data to userspace is going
> > to be a problem...
> 
> Good catch.
> 
> I see similar buffer overflow is handled by calling seq_buf_clear()
> (look at process_durations() or in show_user_instructions()).
> 
> How about handling this by calling rdt_last_cmd_clear() before printing
> each group?

Does this work?

Once seq_buf_has_overflowed() returns nonzero, data has been lost, no?
So far as I can see, show_user_instructions() just gives up on printing
the affected line, while process_durations() tries to anticipate
overflow and prints out the accumulated text to dmesg before clearing
the buffer.

In our case, we cannot send more data to userspace than was requested
in the read() call, so we might have nowhere to drain the seq_buf
contents to in order to free up space.

sysfs "expects" userspace to do a big enough read() that this problem
doesn't happen.  In practice this is OK because people usually read
through a buffered I/O layer like stdio, and in realistic
implementations the user-side I/O buffer is large enough to hide this
issue.

But mbm_assign_control data is dynamically generated and potentially
much bigger than a typical sysfs file.

> 
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 484d6009869f..1828f59eb723 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1026,6 +1026,7 @@ static int resctrl_mbm_assign_control_show(struct
> kernfs_open_file *of,
>         }
> 
>         list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list) {
> +               rdt_last_cmd_clear();
>                 seq_printf(s, "%s//", rdtg->kn->name);
> 
>                 sep = false;
> @@ -1041,6 +1042,7 @@ static int resctrl_mbm_assign_control_show(struct
> kernfs_open_file *of,
>                 seq_putc(s, '\n');
> 
>                 list_for_each_entry(crg, &rdtg->mon.crdtgrp_list,
> mon.crdtgrp_list) {
> +                       rdt_last_cmd_clear();

I don't see how this helps.

Surely last_cmd_status has nothing to do with s?

[...]

Cheers
---Dave

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ