Message-ID: <be806f9c-861a-8da8-d42e-1d4271c3a326@redhat.com>
Date:   Fri, 15 Jul 2022 12:21:42 +1000
From:   Gavin Shan <gshan@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Cc:     kvmarm@...ts.cs.columbia.edu, kvm@...r.kernel.org,
        linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org,
        mathieu.desnoyers@...icios.com, shuah@...nel.org, maz@...nel.org,
        oliver.upton@...ux.dev, shan.gavin@...il.com
Subject: Re: [PATCH] KVM: selftests: Double check on the current CPU in
 rseq_test

Hi Paolo and Sean,

On 7/15/22 1:35 AM, Sean Christopherson wrote:
> On Thu, Jul 14, 2022, Paolo Bonzini wrote:
>> On 7/14/22 10:06, Gavin Shan wrote:
>>> In rseq_test, two threads are created: 'main' and 'migration_worker'.
>>> The test assumes that if the 'migration_worker' thread observes no
>>> migration, the 'main' thread has not been migrated either. Unfortunately,
>>> that assumption isn't true: the 'main' thread can be migrated from one
>>> CPU to another between the calls to sched_getcpu() and
>>> READ_ONCE(__rseq.cpu_id). The following assert is eventually raised
>>> because of the mismatched CPU numbers.
>>>
>>> The issue can be reproduced occasionally on arm64 systems.
>>
>> Hmm, this does not seem a correct patch - the threads are already
>> synchronizing using seq_cnt, like this:
>>
>> 	migration			main
>> 	----------------------		--------------------------------
>> 	seq_cnt = 1
>> 	smp_wmb()
>> 					snapshot = 0
>> 					smp_rmb()
>> 					cpu = sched_getcpu() reads 23
>> 	sched_setaffinity()
>> 					rseq_cpu = __rseq.cpu_id reads 35
>> 					smp_rmb()
>> 					snapshot != seq_cnt -> retry
>> 	smp_wmb()
>> 	seq_cnt = 2
>>
>> sched_setaffinity() is guaranteed to block until the task is enqueued on an
>> allowed CPU.
> 
> Yes, and retrying could suppress detection of kernel bugs that this test is intended
> to catch.
> 
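
For reference, the protocol in the diagram above boils down to a plain
sequence-counter retry, roughly like this (a minimal sketch using C11 seq_cst
atomics in place of the tools/ smp_wmb()/smp_rmb() macros; it is not the
actual selftest code):

#define _GNU_SOURCE
#include <stdatomic.h>
#include <sched.h>
#include <sys/types.h>

static atomic_uint seq_cnt;

/* Writer ('migration') side: keep seq_cnt odd while the migration is forced. */
static void force_migration(pid_t target, const cpu_set_t *mask)
{
	atomic_fetch_add(&seq_cnt, 1);			/* seq_cnt goes odd  */
	sched_setaffinity(target, sizeof(*mask), mask);	/* move the target   */
	atomic_fetch_add(&seq_cnt, 1);			/* seq_cnt goes even */
}

/* Reader ('main') side: retry until both reads land in one stable window. */
static int read_current_cpu(void)
{
	unsigned int snapshot;
	int cpu;

	do {
		snapshot = atomic_load(&seq_cnt);
		cpu = sched_getcpu();
		/* the real test also reads READ_ONCE(__rseq.cpu_id) here */
	} while ((snapshot & 1) || snapshot != atomic_load(&seq_cnt));

	return cpu;
}

As long as the snapshot is even and unchanged across the reads, no affinity
change overlapped them, so the retry alone keeps the two CPU ids consistent -
provided the writer actually migrates the 'main' thread, which is my point
below.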

Well, I don't think migration_worker() does the correct thing, if I'm
understanding it correctly. The intention seems to be for the 'migration'
thread to force migration of the 'main' thread. If that is the case, I don't
think the following function call has the correct parameters.

     r = sched_setaffinity(0, sizeof(allowed_mask), &allowed_mask);

It should be something like:

     r = sched_setaffinity(getpid(), sizeof(allowed_mask), &allowed_mask);

If we use sched_setaffinity(0, ...) in the 'migration' thread, the CPU
affinity of the 'main' thread won't be affected. That means the 'main' thread
can be migrated from one CPU to another at any time, even at the following
point:

     int main(...)
     {
           :
           /*
            * A migration can happen immediately after sched_getcpu(). If
            * the CPU affinity of the 'main' thread were pinned to one
            * particular CPU, which is what the 'migration' thread is
            * supposed to do, no migration could happen here.
            */
           cpu = sched_getcpu();
           rseq_cpu = READ_ONCE(__rseq.cpu_id);
           :
     }

So I think the correct fix is to use sched_setaffinity(getpid(), ...).
Please refer to the manpage:

    https://man7.org/linux/man-pages/man2/sched_setaffinity.2.html
    'If pid is zero, then the calling thread is used'
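
To make the pid semantics concrete, here is a minimal sketch (pin_to_cpu() is
only an illustration, not selftest code):

#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>
#include <sys/types.h>

/* Pin one task, identified by its thread ID, to a single CPU. */
static void pin_to_cpu(pid_t tid, int cpu)
{
	cpu_set_t mask;

	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	sched_setaffinity(tid, sizeof(mask), &mask);
}

/*
 * pin_to_cpu(0, cpu)                   - pins the calling thread only
 * pin_to_cpu(getpid(), cpu)            - pins the 'main' thread, because the
 *                                        main thread's TID equals the PID
 * pin_to_cpu(syscall(SYS_gettid), cpu) - pins the current thread by its TID
 */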

>> Can you check that smp_rmb() and smp_wmb() generate correct instructions on
>> arm64?
> 
> That seems like the most likely scenario (or a kernel bug), I distinctly remember
> the barriers provided by tools/ being rather bizarre.
> 

I don't see any problems with smp_rmb() and smp_wmb() in my case. They are
translated to the correct instructions, as expected:

#define smp_mb()        asm volatile("dmb ish" ::: "memory")
#define smp_wmb()       asm volatile("dmb ishst" ::: "memory")
#define smp_rmb()       asm volatile("dmb ishld" ::: "memory")
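
For what it's worth, one way to double-check is to drop the macros into a
tiny file and inspect the generated assembly (e.g. with "gcc -O2 -S" on the
arm64 box); a minimal check could look like:

/* barrier_check.c - eyeball the barriers in the generated assembly. */
#define smp_wmb()	asm volatile("dmb ishst" ::: "memory")
#define smp_rmb()	asm volatile("dmb ishld" ::: "memory")

volatile int a, b;

void barrier_check(void)
{
	a = 1;
	smp_wmb();	/* expect "dmb ishst" here */
	b = a;
	smp_rmb();	/* expect "dmb ishld" here */
}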

--------------

One more experiment with sched_setaffinity(): I ran the program below. The
CPU affinity of the 'main' thread isn't changed until the correct parameter
is used, i.e. sched_setaffinity(getpid(), ...).

sched_setaffinity(0, ...)
-------------------------
[root@...tlab-arm01 tmp]# ./a
thread_func: cpu=0
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
thread_func: cpu=1
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
main: mask=0x000000ff
   :

sched_setaffinity(getpid(), ...)
--------------------------------
thread_func: cpu=198
main: mask=0x00000001
main: mask=0x00000001
main: mask=0x00000001
main: mask=0x00000001
main: mask=0x00000001
main: mask=0x00000001
main: mask=0x00000001
main: mask=0x00000001
main: mask=0x00000001
thread_func: cpu=198
main: mask=0x00000002
main: mask=0x00000002
main: mask=0x00000002
main: mask=0x00000002
main: mask=0x00000002
main: mask=0x00000002
main: mask=0x00000002
main: mask=0x00000002
main: mask=0x00000002
main: mask=0x00000002
   :

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <pthread.h>
#include <sched.h>

#define NR_CPUS	8

static volatile int thread_exit = 0;

/*
 * Walk the first NR_CPUS CPUs, trying to pin a task to each of them in
 * turn. With pid == 0 only this (the calling) thread is pinned; with
 * getpid() the 'main' thread is pinned instead.
 */
static void *thread_func(void *data)
{
	cpu_set_t allowed_mask;
	int i;

	for (i = 0; i < NR_CPUS; i++) {
		CPU_ZERO(&allowed_mask);
		CPU_SET(i, &allowed_mask);
#if 1
		sched_setaffinity(0, sizeof(allowed_mask), &allowed_mask);
#else
		sched_setaffinity(getpid(), sizeof(allowed_mask), &allowed_mask);
#endif
		fprintf(stdout, "%s: cpu=%d\n", __func__, sched_getcpu());

		sleep(1);
	}

	thread_exit = 1;
	return NULL;
}

int main(int argc, char **argv)
{
	pthread_t thread;
	cpu_set_t allowed_mask;
	int mask, i;

	pthread_create(&thread, NULL, thread_func, NULL);

	/* Periodically dump the 'main' thread's own CPU affinity mask. */
	while (!thread_exit) {
		usleep(100000);

		mask = 0;
		sched_getaffinity(0, sizeof(allowed_mask), &allowed_mask);
		for (i = 0; i < NR_CPUS; i++) {
			if (CPU_ISSET(i, &allowed_mask))
				mask |= (1 << i);
		}

		fprintf(stdout, "%s: mask=0x%08x\n", __func__, mask);
	}

	pthread_join(thread, NULL);
	return 0;
}
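
For reference, the program can be built with something like
"gcc -O2 -pthread -o a a.c" (-pthread pulls in the pthread bits).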


Thanks,
Gavin

