Message-ID: <0e5078b5-509f-38ef-fa71-821c5f7b5944@linux.intel.com>
Date: Fri, 6 Sep 2024 13:00:56 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Reinette Chatre <reinette.chatre@...el.com>
cc: fenghua.yu@...el.com, shuah@...nel.org, tony.luck@...el.com,
peternewman@...gle.com, babu.moger@....com,
Maciej Wieczór-Retman <maciej.wieczor-retman@...el.com>,
linux-kselftest@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/6] selftests/resctrl: Ensure measurements skip
initialization of default benchmark
On Thu, 5 Sep 2024, Reinette Chatre wrote:
> On 9/5/24 5:10 AM, Ilpo Järvinen wrote:
> > On Wed, 4 Sep 2024, Reinette Chatre wrote:
> > > On 9/4/24 4:57 AM, Ilpo Järvinen wrote:
> > > > On Fri, 30 Aug 2024, Reinette Chatre wrote:
> > > > > On 8/30/24 3:56 AM, Ilpo Järvinen wrote:
> > > > > > On Thu, 29 Aug 2024, Reinette Chatre wrote:
> >
> > > > > > > @@ -699,111 +639,80 @@ int resctrl_val(const struct resctrl_test *test,
> > > > > > >  			return ret;
> > > > > > >  	}
> > > > > > > -	/*
> > > > > > > -	 * If benchmark wasn't successfully started by child, then child should
> > > > > > > -	 * kill parent, so save parent's pid
> > > > > > > -	 */
> > > > > > >  	ppid = getpid();
> > > > > > > -	if (pipe(pipefd)) {
> > > > > > > -		ksft_perror("Unable to create pipe");
> > > > > > > +	/* Taskset test to specified CPU. */
> > > > > > > +	ret = taskset_benchmark(ppid, uparams->cpu, &old_affinity);
> > > > > >
> > > > > > Previously only CPU affinity for bm_pid was set but now it's set
> > > > > > before
> > > > > > fork(). Quickly checking the Internet, it seems that CPU affinity
> > > > > > gets
> > > > > > inherited on fork() so now both processes will have the same
> > > > > > affinity
> > > > > > which might make the other process interfere with the
> > > > > > measurement.
> > > > >
> > > > > Setting the affinity is intended to ensure that the buffer preparation
> > > > > occurs in the same topology as where the runtime portion will run.
> > > > > This preparation is done before the work to be measured starts.
> > > > >
> > > > > This does tie in with the association with the resctrl group and I
> > > > > will elaborate more below ...
> > > >
> > > > Okay, that's useful to retain, but thinking about this further, now we're
> > > > also going to do a non-trivial amount of work in between the setup and the test by
> > >
> > > Could you please elaborate how the amount of work during setup can be an
> > > issue? I have been focused on the measurements that are done afterwards
> > > that do have clear boundaries from what I can tell.
> >
> > Well, you used it as a justification: "Setting the affinity is intended
> > to ensure that the buffer preparation occurs in the same topology as where
> > the runtime portion will run." So I assumed you had some expectations about
> > "preparations" done outside of those "clear boundaries" but now you seem
> > to take entirely opposite stance?
>
> I do not follow you here. With the "clear boundaries" I meant the
> measurements and associated variables that have a clear start/init and
> stop/read while the other task sleeps. Yes, preparations are done outside
> of this since that should not be measured.
Of course the preparations are not measured (at least not after this
patch :-)).
But that's not what I meant. You said the preparations are to be done
using the same topology as the test but if it's outside of the measurement
period, the topology during preparations only matters if you make some
hidden assumption that some state carries from preparations to the actual
test. If no state carry-over is assumed, the topology during preparations
is not important. So which way is it? Is the topology during preparations
important or not?
You used the topology argument to justify why both parent and child are
now in the same resctrl group, unlike before when only the actual test
process was.
> You stated "now we're also going
> do non-trivial amount of work in between the setup and the test" ...
> could you clarify what the problem is with this? Before this work
> the "non-trivial amount of work" (for "fill_buf") was done as part of the
> test with (wrong) guesses about how long it takes. This work aims to improve
> this.
I understand what you're trying to do with this change.
However, I was trying to say that previously the preparations occurred
right before the test, which is no longer the case because there's a
fork() in between.
Does that extra work impact the state carry-over from preparations to the
test?
> > fork(), quite a heavy operation as it has to copy various things including
> > the address space, which you just made contain a huge mem blob. :-)
>
> As highlighted in a comment found in the patch, the kernel uses copy-on-write
> and all tests only read from the buffer after fork().
A write test is possible using -b fill_buf ... as mentioned in the comments
to the other patch.
> > BTW, perhaps we could use some lighter-weight fork variant in the
> > resctrl selftests, the processes don't really need to be that separated
> > to justify using full fork() (and custom benchmarks will do execvp()).
>
> Are you thinking about pthreads? It is not clear to me that this is
> necessary. It is also not clear to me what problem you are describing that
> needs this solution. When I understand that better it will be easier to
> discuss solutions.
I was trying to say that I see an advantage in retaining the address space,
which is not possible with fork(), so perhaps using clone() with CLONE_VM
(and maybe some other flags too) would be useful. I think this would solve
the write test case.
> > > > I was wondering if I should note that the amount of work is small. Maybe
> > > > it's fine to leave that noise there and I'm just overly cautious :-), but
> > > > when I used to do networking research in a past life, I wanted to
> > > > eliminate as many noise sources as possible, so I guess it comes from
> > > > that background.
> > >
> > > The goal of these tests is to verify *resctrl*; these are not intended to
> > > be hardware validation tests. I think it would be better for resctrl if
> > > more time is spent on functional tests of resctrl than these performance
> > > tests.
> >
> > This sounds so easy... (no offence) :-) If only there weren't black boxes
> > and we had good, fine-grained ways to instrument things, it would be so
> > much easier to come up with non-statistical means of functional testing.
> >
>
> To me functional tests of resctrl indeed sound easier. Tests can interact
> with the resctrl interface to ensure it works as expected ... test the
> boundaries of user input to the various files, test task movement between
> groups, test moving of monitor groups, test assigning CPUs to resource
> groups, ensure the "mode" of a resource group can be changed and behaves as
> expected (for example, shared vs exclusive), ensure changes to one file are
> reflected in others, like changing schemata is reflected in "size" and
> "bit_usage", etc. etc. These are all tests of *resctrl* that support
> development and can be used to verify all is well as major changes (that we
> are seeing more and more of) are made to the kernel subsystem. None of this
> is "black box" and it is all deterministic with obvious pass/fail. This can
> be made even more reliable by not just checking resctrl files but also
> seeing if changes via resctrl are reflected in MSRs as expected.
Okay, I get it now: you meant testing the kernel-userspace interfaces,
using MSRs as a further validation step to test the kernel-HW interface
too.
I'll take a look at what I can come up with when I have some spare time.
--
i.