linux-kernel - Re: [PATCH v4 4/4] KVM: selftests: Run dirty_log_perf

Message-ID: <Y0RG2w9cHn01Af41@google.com>
Date:   Mon, 10 Oct 2022 16:22:51 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Vipin Sharma <vipinsh@...gle.com>
Cc:     pbonzini@...hat.com, dmatlack@...gle.com, andrew.jones@...ux.dev,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 4/4] KVM: selftests: Run dirty_log_perf_test on
 specific CPUs

On Fri, Oct 07, 2022, Vipin Sharma wrote:
> On Fri, Oct 7, 2022 at 10:39 AM Vipin Sharma <vipinsh@...gle.com> wrote:
> >
> > On Thu, Oct 6, 2022 at 5:14 PM Sean Christopherson <seanjc@...gle.com> wrote:
> > >
> > > On Thu, Oct 06, 2022, Vipin Sharma wrote:
> > > > On Thu, Oct 6, 2022 at 12:50 PM Sean Christopherson <seanjc@...gle.com> wrote:
> > > > > > +{
> > > > > > +     cpu_set_t cpuset;
> > > > > > +     int err;
> > > > > > +
> > > > > > +     CPU_ZERO(&cpuset);
> > > > > > +     CPU_SET(pcpu, &cpuset);
> > > > >
> > > > > To save user pain:
> > > > >
> > > > >         r = sched_getaffinity(0, sizeof(allowed_mask), &allowed_mask);
> > > > >         TEST_ASSERT(!r, "sched_getaffinity failed, errno = %d (%s)", errno,
> > > > >                     strerror(errno));
> > > > >
> > > > >         TEST_ASSERT(CPU_ISSET(pcpu, &allowed_mask),
> > > > >                     "Task '%d' not allowed to run on pCPU '%d'\n");
> > > > >
> > > > >         CPU_ZERO(&allowed_mask);
> > > > >         CPU_SET(cpu, &allowed_mask);
> > > > >
> > > > > that way the user will get an explicit error message if they try to pin a vCPU/task
> > > > > that has already been affined by something else.  And then, in theory,
> > > > > sched_setaffinity() should never fail.
> > > > >
> > > > > Or you could have two cpu_set_t objects and use CPU_AND(), but that seems
> > > > > unnecessarily complex.
> > > > >
> > > >
> > > > sched_setaffinity() doesn't fail when we assign more than one task to
> > > > the pCPU, it allows multiple tasks to be on the same pCPU. One of the
> > > > reasons it fails is if it is provided a cpu number which is bigger
> > > > than what is actually available on the host.
> > > >
> > > > I am not convinced that pinning vCPUs to the same pCPU should throw an
> > > > error. We should allow if someone wants to try and compare performance
> > > > by over subscribing or any valid combination they want to test.
> > >
> > > Oh, I'm not talking about the user pinning multiple vCPUs to the same pCPU via
> > > the test, I'm talking about the user, or more likely something in the users's
> > > environment, restricting what pCPUs the user's tasks are allowed on.  E.g. if
> > > the test is run in shell that has been restricted to CPU8 via cgroups, then
> > > sched_setaffinity() will fail if the user tries to pin vCPUs to any other CPU.
> >
> > I see, I will add this validation.
> 
> I think we should drop this check. Current logic is that the new
> function perf_test_setup_pinning() parses the vcpu mappings, stores
> them in perf_test_vcpu_args{} struct and moves the main thread to the
> provided pcpu. But this causes TEST_ASSERT(CPU_ISSET...) to fail for
> vcpu threads when they are created because they inherit task affinity
> from the main thread which has the pcpu set during setup.
> 
> However, this affinity is not strict, so, if TEST_ASSERT(CPU_ISSET...)
> is removed then vcpu threads successfully move to their required pcpu
> via sched_setaffinity() even though the main thread has different
> affinity. If cpus are restricted via cgroups then sched_setaffinity()
> fails as expected no matter what.
> 
> Another option will be to split the API, perf_test_setup_pinning()
> will return the main thread pcpu and dirty_log_perf_test can call
> pin_this_task_to_cpu() with the returned pcpu after vcpus have been
> started. I do not like this approach, I also think
> TEST_ASSERT(CPU_ISSET...) is not reducing user pain that much because
> users can still figure out with returned errno what is happening.

The easy way to handle this is to take the sched_getaffinity() snapshot during
perf_test_setup_pinning(), before the main thread is pinned.  You could even do
the sanity checking there, e.g. keep pcpu_num() (maybe rename it to
parse_pcpu()?) and have it check each pCPU against the snapshot:

static uint32_t parse_pcpu(const char *cpu_str, const cpu_set_t *allowed_mask)
{
	uint32_t pcpu = atoi_positive(cpu_str);

	/* allowed_mask is already a pointer, so pass it through as-is. */
	TEST_ASSERT(CPU_ISSET(pcpu, allowed_mask),
		    "Not allowed to run on pCPU '%u', check cgroups?\n", pcpu);
	return pcpu;
}


	r = sched_getaffinity(0, sizeof(allowed_mask), &allowed_mask);
	TEST_ASSERT(!r, "sched_getaffinity() failed");

	for (i = 0; i < nr_vcpus; i++) {
		TEST_ASSERT(cpu, "pCPU not provided for vCPU%d\n", i);

		perf_test_args.vcpu_args[i].pcpu = parse_pcpu(cpu, &allowed_mask);
		cpu = strtok(NULL, delim);
	}


	if (cpu)
		pin_me_to_pcpu(parse_pcpu(cpu, &allowed_mask));

That'll result in a slightly larger window where the sanity check could get a
false negative, but that's ok.  Detecting conflicts with 100% accuracy isn't
possible since there's always a window where the allowed cpuset could change;
the goal is only to catch the "obvious" cases in order to save the user a bit
of debug time.
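
For anyone who wants to play with the snapshot-then-check idea outside the
selftests framework, here is a minimal standalone sketch.  It is not the
selftest code: snapshot_allowed(), check_pcpu_allowed() and pin_this_task()
are hypothetical names made up for illustration, and the TEST_ASSERT()s are
replaced with plain return codes.  It assumes Linux with glibc.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: snapshot the CPUs the calling task may run on. */
static int snapshot_allowed(cpu_set_t *allowed_mask)
{
	assert(allowed_mask);
	return sched_getaffinity(0, sizeof(*allowed_mask), allowed_mask);
}

/*
 * Hypothetical helper: return 0 if pcpu is in the snapshot, -1 otherwise.
 * The check is best-effort only, the allowed set can change afterwards.
 */
static int check_pcpu_allowed(unsigned int pcpu, const cpu_set_t *allowed_mask)
{
	assert(allowed_mask);
	if (pcpu >= CPU_SETSIZE || !CPU_ISSET(pcpu, allowed_mask)) {
		fprintf(stderr,
			"Not allowed to run on pCPU '%u', check cgroups?\n",
			pcpu);
		return -1;
	}
	return 0;
}

/* Hypothetical helper: pin the calling task to exactly one pCPU. */
static int pin_this_task(unsigned int pcpu)
{
	cpu_set_t mask;

	CPU_ZERO(&mask);
	CPU_SET(pcpu, &mask);
	return sched_setaffinity(0, sizeof(mask), &mask);
}
```

The important bit is the ordering: take the snapshot once, before anything is
pinned, validate every requested pCPU against that snapshot, and only then
call pin_this_task().  Threads created after the main task is pinned inherit
the narrowed mask, which is why checking against a fresh sched_getaffinity()
in each vCPU thread trips the assert.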