Message-ID: <ae6c6ad2-8211-6227-fa41-505ecc7df673@redhat.com>
Date:   Sun, 17 Jul 2022 00:46:08 +1000
From:   Gavin Shan <gshan@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>, kvmarm@...ts.cs.columbia.edu,
        kvm@...r.kernel.org, linux-kselftest@...r.kernel.org,
        linux-kernel@...r.kernel.org, mathieu.desnoyers@...icios.com,
        shuah@...nel.org, maz@...nel.org, oliver.upton@...ux.dev,
        shan.gavin@...il.com
Subject: Re: [PATCH] KVM: selftests: Double check on the current CPU in
 rseq_test

Hi Sean,

On 7/16/22 12:32 AM, Sean Christopherson wrote:
> On Fri, Jul 15, 2022, Gavin Shan wrote:
>> On 7/15/22 1:35 AM, Sean Christopherson wrote:
>>> On Thu, Jul 14, 2022, Paolo Bonzini wrote:
>> Well, I don't think migration_worker() does the correct thing, if I understand
>> correctly. The intention seems to be to force migration of the 'main' thread from
>> the 'migration' thread?  If that is the case, I don't think the following function
>> call has the correct parameters.
>>
>>      r = sched_setaffinity(0, sizeof(allowed_mask), &allowed_mask);
>>
>>      it should be something like:
>>
>>      r = sched_setaffinity(getpid(), sizeof(allowed_mask), &allowed_mask);
>>
>> If we're using sched_setaffinity(0, ...) in the 'migration' thread, the CPU
>> affinity of the 'main' thread won't be affected. It means the 'main' thread can
>> be migrated from one CPU to another at any time, even at the following point:
>>
>>      int main(...)
>>      {
>>            :
>>            /*
>>             * Migration can happen immediately after sched_getcpu(). If
>>             * the 'main' thread were pinned to one particular CPU, which
>>             * the 'migration' thread is supposed to ensure, then there
>>             * should be no migration here.
>>             */
>>            cpu = sched_getcpu();
>>            rseq_cpu = READ_ONCE(__rseq.cpu_id);
>>            :
>>      }
>>
>> So I think the correct fix is to have sched_setaffinity(getpid(), ...) ?
>> Please refer to the manpage.
>>
>>     https://man7.org/linux/man-pages/man2/sched_setaffinity.2.html
>>     'If pid is zero, then the calling thread is used'
> 
> Oof, and more explicitly the rest of that sentence clarifies that the result of
> getpid() will target the main thread (I assume "main" means thread group leader).
> 
>     Specifying pid as 0 will set the attribute for the calling thread, and passing
>     the value returned from a call to getpid(2) will set the attribute for the main
>     thread of the thread group.
> 
> I'm guessing my test worked (in that it reproduced the bug) by virtue of the
> scheduler trying to colocate all threads in the process.
> 
> In my defense, the die.net copy of the manpages quite clearly uses "process"[1],
> but that was fixed in the manpages in 2013[2]!?!!?  So I guess the takeaway is
> to use only the official manpages.
> 
> Anyways, for the code, my preference would be to snapshot gettid() in main() before
> spawning the migration worker.  Same result, but I would rather the test explicitly
> target the thread doing rseq instead of relying on (a) getpid() targeting only the
> main thread and (b) the main thread always being the rseq thread.  E.g. if for some
> reason a future patch moves the rseq code to its own worker thread, then getpid()
> would be incorrect.
> 
> Thanks for figuring this out!
> 
> [1] https://linux.die.net/man/2/sched_setaffinity
> [2] 6a7fcf3cc ("sched_setaffinity.2: Clarify that these system calls affect a per-thread attribute")
> 

Thanks for confirming. Caching the tid of the thread group leader in advance,
as you suggested, makes sense to me. I've modified the code accordingly in the
patch below, which I just posted. Please review it when you get a chance.

[PATCH v2] KVM: selftests: Fix target thread to be migrated in rseq_test
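
For reference, a minimal standalone sketch of the idea discussed above, not
the actual rseq_test code: the thread to be pinned snapshots its own tid
before the worker is spawned, and the worker passes that tid (rather than 0)
to sched_setaffinity(). The names target_tid, pin_worker() and
pin_caller_from_worker() are made up for illustration, and the sketch assumes
Linux with glibc.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names for illustration; this is not the rseq_test code. */
static pid_t target_tid;	/* tid of the thread the worker should pin */
static int worker_ret;		/* result of the worker's sched_setaffinity() */

static void *pin_worker(void *arg)
{
	cpu_set_t *mask = arg;

	/*
	 * Passing 0 here would change the affinity of this worker thread.
	 * Passing the snapshotted tid changes the affinity of the caller.
	 */
	worker_ret = sched_setaffinity(target_tid, sizeof(*mask), mask);
	return NULL;
}

/*
 * Returns 0 if the worker managed to pin the calling thread to a single
 * CPU, -1 otherwise. The original affinity is restored before returning.
 */
int pin_caller_from_worker(void)
{
	cpu_set_t allowed, one, now;
	pthread_t worker;
	int cpu, ok;

	if (sched_getaffinity(0, sizeof(allowed), &allowed))
		return -1;

	/* Pick the first CPU the calling thread is allowed to run on. */
	for (cpu = 0; cpu < CPU_SETSIZE && !CPU_ISSET(cpu, &allowed); cpu++)
		;
	if (cpu == CPU_SETSIZE)
		return -1;
	assert(CPU_ISSET(cpu, &allowed));

	CPU_ZERO(&one);
	CPU_SET(cpu, &one);

	/* Snapshot the tid before spawning the worker. */
	target_tid = (pid_t)syscall(SYS_gettid);

	if (pthread_create(&worker, NULL, pin_worker, &one))
		return -1;
	pthread_join(worker, NULL);
	if (worker_ret)
		return -1;

	/* pid == 0 reads the mask of the calling thread. */
	if (sched_getaffinity(0, sizeof(now), &now))
		return -1;
	ok = CPU_EQUAL(&now, &one);

	/* Put the original mask back so the caller is left unchanged. */
	sched_setaffinity(0, sizeof(allowed), &allowed);
	return ok ? 0 : -1;
}
```

Calling pin_caller_from_worker() from main() should return 0 when the worker
is allowed to change the caller's affinity. With sched_setaffinity(0, ...) in
pin_worker() instead, the caller's mask would be left untouched.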

Thanks,
Gavin
