Message-ID: <aea60fac-1046-4e15-8392-890812ee5521@intel.com>
Date: Fri, 19 Dec 2025 09:05:10 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: Aaron Tomlin <atomlin@...mlin.com>
CC: <tony.luck@...el.com>, <Dave.Martin@....com>, <james.morse@....com>,
	<babu.moger@....com>, <tglx@...utronix.de>, <mingo@...hat.com>,
	<bp@...en8.de>, <dave.hansen@...ux.intel.com>, <sean@...e.io>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 1/3] fs/resctrl: Add helpers to check io_alloc support
 and enabled state

Hi Aaron,

On 12/18/25 3:22 PM, Aaron Tomlin wrote:
> On Tue, Dec 16, 2025 at 09:01:32PM -0800, Reinette Chatre wrote:
>> How does this patch benefit the goal of this submission, which is to set
>> identical CBM on all domains?
>>
>> If this change benefited this submission I should find these new helpers
>> used in later patches but they are not. This seems to be an unrelated and
>> unnecessary change mixed with this submission. It does not fix a bug nor
>> does it support this submission and thus just adds noise when considering
>> the new feature for inclusion. This patch can be dropped. 
> 
> Hi Reinette,
> 
> Thank you for your feedback regarding the relevance of this patch to the
> overall objective.
> 
> To be clear, this change was included as part of the broader series. Whilst
> I grant that this patch is not a strong requirement, I deem it a useful
> prerequisite clean-up. In my view, streamlining the existing infrastructure
> before layering on new features is a sound practice that prevents the
> accumulation of technical debt.

You are correct that new features should be layered on a clean foundation. One
problem is that these two patches are independent. This change is misrepresented
as a pre-requisite of the new feature. Presenting it in this way impacts how the
feature itself is considered.

> 
> Given that this refactoring provides a cleaner foundation, I would propose
> that it remain in the series rather than being excluded. I believe the
> long-term benefit to the maintainability of the resctrl code outweighs the
> concern of it being "noise" in the context of this specific feature.

I have a different view on how this patch impacts maintainability. There are several
techniques to help developers. For example, while the new functions introduced in this
patch intend to be helpful to developers by including the comment "This function must
be called under the cpu hotplug lock and and rdtgroup mutex" the right way to communicate
this is to use lockdep_assert_cpus_held() and lockdep_assert_held(&rdtgroup_mutex). The
existence of these tools demonstrates that even when developers know what the right thing
to do is, mistakes can still appear.
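
A minimal userspace sketch of the difference between a comment and an assertion. All
names here (mock_lock, mock_assert_held(), io_alloc_helper()) are hypothetical stand-ins
so that the example builds outside the kernel; in resctrl the checks would be the real
lockdep_assert_cpus_held() and lockdep_assert_held(&rdtgroup_mutex).

```c
#include <stdbool.h>

/* Stand-in for a lock whose "held" state can be queried. */
struct mock_lock {
	bool held;
};

static struct mock_lock mock_cpu_hotplug_lock;
static struct mock_lock mock_rdtgroup_mutex;

/* Counts violations instead of warning so the sketch can be exercised. */
static int mock_assert_failures;

/* Plays the role of lockdep_assert_held(): checks at runtime, not in a comment. */
static void mock_assert_held(const struct mock_lock *lock)
{
	if (!lock->held)
		mock_assert_failures++;
}

/*
 * Hypothetical helper. The locking requirement is enforced by the two
 * checks below rather than stated in a "must be called under ..." comment.
 */
static bool io_alloc_helper(bool io_alloc_capable)
{
	mock_assert_held(&mock_cpu_hotplug_lock);
	mock_assert_held(&mock_rdtgroup_mutex);

	return io_alloc_capable;
}
```

A caller that forgets either lock is flagged the first time the path runs, whereas a
comment only helps if it is read.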

Something else required by these new functions is that rdt_last_cmd_clear() needs to
be called beforehand. This cannot be automated like the examples above and
thus relies on the developer to "get right". As someone who has seen many patches flow into
resctrl I can say with confidence that this is one of the things where there are often
mistakes.

With that in mind, note how every hunk includes rdt_last_cmd_clear() followed by the
rdt_last_cmd_printf() being moved. How the buffer is used cannot be more clear, right?
This patch adds a layer of indirection that makes this relationship more difficult to see.
It thus does not simplify how to reason about this code.

Granted, rdt_last_cmd_clear() is not required to be a few lines away from a call that
writes to the buffer, but having these calls in the same function/scope makes it obvious
where the buffer is cleared and where data is written to it.

Also, as the resctrl documentation states about the "last_cmd_status" file: "If the command
failed, it will provide more information that can be conveyed in the error returns from
file operations.". Now, while keeping this in mind, consider, for example, how
resctrl_io_alloc_write() appears to a developer before and after this change. The current
implementation is consistent: every time there is a failure it is accompanied by a
write to the last_cmd_status buffer to make sure the error details are conveyed to user space.
After this change the function is inconsistent: some errors result in a print to
last_cmd_status and some do not. It is not that the print to last_cmd_status is removed
but it is behind another layer of indirection that makes resctrl_io_alloc_write()
more difficult to read. A developer can no longer just look at resctrl_io_alloc_write()
and learn how it interacts with user space.
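
To illustrate the contrast, here is a userspace sketch, not the actual resctrl code.
The buffer, the mock_*() functions, the messages and the error codes are all
assumptions made for illustration; they only mimic the shape of rdt_last_cmd_clear(),
rdt_last_cmd_printf() and a write handler such as resctrl_io_alloc_write().

```c
#include <errno.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the last_cmd_status buffer. */
static char mock_status[128];

static void mock_last_cmd_clear(void)
{
	mock_status[0] = '\0';
}

/* Appends a formatted message to the buffer. */
static void mock_last_cmd_printf(const char *fmt, ...)
{
	size_t used = strlen(mock_status);
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(mock_status + used, sizeof(mock_status) - used, fmt, ap);
	va_end(ap);
}

/* Variant A: every failure, its message and its errno are visible here. */
static int mock_write_direct(bool supported, bool enabled)
{
	mock_last_cmd_clear();

	if (!supported) {
		mock_last_cmd_printf("io_alloc is not supported\n");
		return -ENODEV;
	}
	if (!enabled) {
		mock_last_cmd_printf("io_alloc is not enabled\n");
		return -EINVAL;
	}
	return 0;
}

/* Variant B helper: silently depends on the caller having cleared the buffer. */
static int mock_check_supported(bool supported)
{
	if (!supported) {
		mock_last_cmd_printf("io_alloc is not supported\n");
		return -ENODEV;
	}
	return 0;
}

/* Variant B: same behavior, but one failure path no longer shows its message. */
static int mock_write_indirect(bool supported, bool enabled)
{
	int ret;

	mock_last_cmd_clear();

	ret = mock_check_supported(supported);
	if (ret)
		return ret;	/* message and errno are chosen elsewhere */
	if (!enabled) {
		mock_last_cmd_printf("io_alloc is not enabled\n");
		return -EINVAL;
	}
	return 0;
}
```

Both variants return the same values and leave the same text in the buffer. The
difference is only in what a reader of the handler can see without opening the helper.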

The same applies to the error codes. It is important to know, and be consistent about, which
error codes are returned to user space. Adding these behind another layer of indirection where
that is all that function does seems unnecessary to me.

In summary, no, I do not see how this change benefits maintainability.

Reinette

ps. I will be offline over the holidays and may only be able to continue this discussion
in the next year.

