linux-kernel - Re: qemu-x86_64 booting with 8.0.0 stil see int3: when running LTP tracing testing.

Message-ID: <c4b9f02f-3f6a-74b4-4e0d-3da314f90aa8@linaro.org>
Date:   Thu, 6 Jul 2023 07:30:50 +0100
From:   Richard Henderson <richard.henderson@...aro.org>
To:     "Richard W.M. Jones" <rjones@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     Arnd Bergmann <arnd@...db.de>,
        Naresh Kamboju <naresh.kamboju@...aro.org>,
        Anders Roxell <anders.roxell@...aro.org>,
        Daniel Díaz <daniel.diaz@...aro.org>,
        Benjamin Copeland <ben.copeland@...aro.org>,
        linux-kernel@...r.kernel.org, x86@...nel.org,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: qemu-x86_64 booting with 8.0.0 stil see int3: when running LTP
 tracing testing.

On 7/5/23 22:50, Richard W.M. Jones wrote:
> tb_invalidate_phys_range_fast() *is* called, and we end up calling
>    tb_invalidate_phys_page_range__locked ->
>      tb_phys_invalidate__locked ->
>        do_tb_phys_invalidate
> 
> Nevertheless the old TB (containing the call to the int3 helper) is
> still called after the code has been replaced with a NOP.
> 
> Of course there are 4 MTTCG threads so maybe another thread is in the
> middle of executing the same TB when it gets invalidated.

Yes.

> tb_invalidate_phys_page_range__locked goes to some effort to check if
> the current TB is being invalidated and restart the TB, but as far as
> I can see the test can only work for the current core, and won't
> restart the TB on other cores.

Yes.

The assumption with any of these sorts of races is that it is "as if" the other thread had 
already passed the location of the write within that block.  But once this thread has 
finished do_tb_phys_invalidate, no other thread can execute the same block *again*.

There's a race here, and now that I think about it, there's been mail about it in the past:

https://lore.kernel.org/qemu-devel/cebad06c-48f2-6dbd-6d7f-3a3cf5aebbe3@linaro.org/

We take care of the same race for user-only in translator_access, by ensuring that each 
translated page is read-only *before* performing the read for translation.  But for system 
mode we grab the page locks *after* the reads.  Which means there's a race.

The email above describes the race pretty clearly, with a new TB being generated before 
the write is even complete.

Fixing this will be non-trivial: not only do we need to grab the lock earlier, but there 
are also ordering issues for a TB that spans two pages, in that the two locks must always 
be taken in the same order lest we deadlock.
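
To make the ordering concern concrete, here is a minimal sketch of taking two page locks 
in a fixed (ascending index) order.  It is not QEMU's page-locking code: the page_locks 
table, NUM_PAGES, and the function names are assumptions made for illustration, using 
plain pthread mutexes.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_PAGES 16  /* arbitrary size for the illustration */

/* Hypothetical per-page lock table; stands in for real page descriptors. */
static pthread_mutex_t page_locks[NUM_PAGES];

static void page_locks_init(void)
{
    for (size_t i = 0; i < NUM_PAGES; i++) {
        pthread_mutex_init(&page_locks[i], NULL);
    }
}

/*
 * Lock the page(s) covered by a block, always lowest index first.
 * Because every thread uses the same order, two threads locking the
 * same pair can never each hold one lock while waiting for the other.
 * Returns the number of locks taken (1 if both indices are equal).
 */
static int lock_page_pair(size_t a, size_t b)
{
    size_t lo = a < b ? a : b;
    size_t hi = a < b ? b : a;

    pthread_mutex_lock(&page_locks[lo]);
    if (hi == lo) {
        return 1;
    }
    pthread_mutex_lock(&page_locks[hi]);
    return 2;
}

/* Release in reverse order: highest index first. */
static void unlock_page_pair(size_t a, size_t b)
{
    size_t lo = a < b ? a : b;
    size_t hi = a < b ? b : a;

    if (hi != lo) {
        pthread_mutex_unlock(&page_locks[hi]);
    }
    pthread_mutex_unlock(&page_locks[lo]);
}
```

A caller would call page_locks_init() once, then bracket the read-and-translate step with 
lock_page_pair() and unlock_page_pair(); passing the pages as (3, 2) or (2, 3) takes the 
locks in the same order either way.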

r~