Message-ID: <yt9d7d4zhddq.fsf@linux.ibm.com>
Date:   Wed, 29 Jun 2022 14:18:09 +0200
From:   Sven Schnelle <svens@...ux.ibm.com>
To:     Alex Bennée <alex.bennee@...aro.org>
Cc:     David Hildenbrand <david@...hat.com>,
        Janosch Frank <frankja@...ux.ibm.com>,
        Liam Howlett <liam.howlett@...cle.com>,
        Heiko Carstens <hca@...ux.ibm.com>,
        Claudio Imbrenda <imbrenda@...ux.ibm.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Guenter Roeck <linux@...ck-us.net>,
        "maple-tree@...ts.infradead.org" <maple-tree@...ts.infradead.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Yu Zhao <yuzhao@...gle.com>, Juergen Gross <jgross@...e.com>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        Alexander Gordeev <agordeev@...ux.ibm.com>,
        Christian Borntraeger <borntraeger@...ux.ibm.com>,
        Andreas Krebbel <krebbel@...ux.ibm.com>,
        Ilya Leoshkevich <iii@...ux.ibm.com>,
        Thomas Huth <thuth@...hat.com>, richard.henderson@...aro.org,
        qemu-devel@...gnu.org, qemu-s390x@...gnu.org
Subject: Re: qemu-system-s390x hang in tcg

Sven Schnelle <svens@...ux.ibm.com> writes:

> Alex Bennée <alex.bennee@...aro.org> writes:
>
>> Sven Schnelle <svens@...ux.ibm.com> writes:
>>
>>> Hi,
>>>
>>> David Hildenbrand <david@...hat.com> writes:
>>>
>>>> On 04.05.22 09:37, Janosch Frank wrote:
>>>>> I had a short look yesterday and the boot usually hangs in the raid6 
>>>>> code. Disabling vector instructions didn't make a difference but a few 
>>>>> interruptions via GDB solve the problem for some reason.
>>>>> 
>>>>> CCing David and Thomas for TCG
>>>>> 
>>>>
>>>> I somehow recall that KASAN was always disabled under TCG, I might be
>>>> wrong (I thought we'd get a message early during boot that the HW
>>>> doesn't support KASAN).
>>>>
>>>> I recall that raid code is a heavy user of vector instructions.
>>>>
>>>> How can I reproduce? Compile upstream (or -next?) with kasan support and
>>>> run it under TCG?
>>>
>>> I spent some time looking into this. It's usually hanging in
>>> s390vx8_gen_syndrome(). My first thought was that it is a problem with
>>> the VX instructions, but it turned out that it hangs even if I remove all
>>> the code from s390vx8_gen_syndrome().
>>>
>>> Tracing the execution of TBs, I see that the generated code is always
>>> jumping between a few TBs, but never exiting the TBs to check for
>>> interrupts (i.e. return to cpu_tb_exec()). I only see calls to
>>> helper_lookup_tb_ptr to look up the TB pointer for the next TB.
>>>
>>> The raid6 code is waiting for some time to expire by reading jiffies,
>>> but interrupts are never processed and therefore jiffies doesn't change.
>>> So the raid6 code hangs forever.
>>>
>>> As a test, I made a quick change:
>>>
>>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>>> index c997c2e8e0..35819fd5a7 100644
>>> --- a/accel/tcg/cpu-exec.c
>>> +++ b/accel/tcg/cpu-exec.c
>>> @@ -319,7 +319,8 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
>>>      cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>>>
>>>      cflags = curr_cflags(cpu);
>>> -    if (check_for_breakpoints(cpu, pc, &cflags)) {
>>> +    if (check_for_breakpoints(cpu, pc, &cflags) ||
>>> +        unlikely(qatomic_read(&cpu->interrupt_request))) {
>>>          cpu_loop_exit(cpu);
>>>      }
>>>
>>> And that makes the problem go away. But I'm not familiar with the TCG
>>> internals, so I can't say whether the generated code is incorrect or
>>> something else is wrong. I have TCG log files of a failing and a working
>>> run if someone wants to take a look. They are rather large, so I would
>>> have to upload them somewhere.
>>
>> Whatever is setting cpu->interrupt_request should be calling
>> cpu_exit(cpu) which sets the exit flag which is checked at the start of
>> every TB execution (see gen_tb_start).
>
> Thanks, that was very helpful. I added debugging and it turned out
> that the TB is left because of a pending irq. The code then calls
> s390_cpu_exec_interrupt:
>
> bool s390_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
> {
>     if (interrupt_request & CPU_INTERRUPT_HARD) {
>         S390CPU *cpu = S390_CPU(cs);
>         CPUS390XState *env = &cpu->env;
>
>         if (env->ex_value) {
>             /* Execution of the target insn is indivisible from
>                the parent EXECUTE insn.  */
>             return false;
>         }
>         if (s390_cpu_has_int(cpu)) {
>             s390_cpu_do_interrupt(cs);
>             return true;
>         }
>         if (env->psw.mask & PSW_MASK_WAIT) {
>             /* Woken up because of a floating interrupt but it has already
>              * been delivered. Go back to sleep. */
>             cpu_interrupt(CPU(cpu), CPU_INTERRUPT_HALT);
>         }
>     }
>     return false;
> }
>
> Note the 'if (env->ex_value) { }' check. It looks like this function
> just returns false when TCG is executing an EX instruction. After
> that, the information that the TB should be exited because of an
> interrupt is gone. So the TBs are never exited again, although the
> interrupt wasn't handled. At least that's my assumption now; if I'm
> wrong, please tell me.

Looking at the code, I see CF_NOIRQ, which prevents TBs from getting
interrupted, but I only see it used in the core TCG code. Would using
it here be a possibility, or is there something else/better?
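
To make the suspected sequence from the quoted mail concrete, here is a
toy model in C. It is only a sketch of the assumption described above,
not QEMU code: struct toy_cpu, toy_raise_irq(), toy_exec_interrupt(),
toy_leave_chain() and toy_run() are all made-up names, and the real
control flow in accel/tcg may differ in details. The model assumes that
leaving the TB chain consumes the exit request even when the interrupt
handler declines to deliver anything.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins only; none of these are QEMU types or functions. */
struct toy_cpu {
    bool exit_request;      /* flag tested at the start of every TB */
    bool interrupt_request; /* an interrupt is pending */
    bool ex_value;          /* the next TB is the target of an EXECUTE */
    int delivered;          /* number of interrupts actually handled */
};

/* Raising an interrupt marks it pending and asks the TB chain to exit. */
static void toy_raise_irq(struct toy_cpu *cpu)
{
    cpu->interrupt_request = true;
    cpu->exit_request = true;
}

/* Same shape as the quoted handler: refuse while ex_value is set. */
static bool toy_exec_interrupt(struct toy_cpu *cpu)
{
    if (cpu->ex_value) {
        return false;
    }
    cpu->interrupt_request = false;
    cpu->delivered++;
    return true;
}

/* Assumption: the exit request is consumed whether or not the
 * handler delivered the interrupt. */
static void toy_leave_chain(struct toy_cpu *cpu)
{
    cpu->exit_request = false;
    if (cpu->interrupt_request) {
        toy_exec_interrupt(cpu);
    }
}

/*
 * Run n_tbs chained TBs with an interrupt raised before the first one.
 * recheck_in_lookup models the test change from the quoted diff, which
 * also looks at the pending interrupt when the next TB is looked up.
 * Returns the number of interrupts delivered.
 */
int toy_run(bool irq_during_ex, bool recheck_in_lookup, int n_tbs)
{
    struct toy_cpu cpu = { .ex_value = irq_during_ex };

    assert(n_tbs >= 0);
    toy_raise_irq(&cpu);
    for (int i = 0; i < n_tbs; i++) {
        if (cpu.exit_request ||
            (recheck_in_lookup && cpu.interrupt_request)) {
            toy_leave_chain(&cpu);
        }
        cpu.ex_value = false; /* the EXECUTE target has now run */
    }
    return cpu.delivered;
}
```

In this model, toy_run(true, false, 8) returns 0: the interrupt stays
pending because nothing requests another exit. toy_run(true, true, 8)
returns 1, which matches the observation that the extra check in the
lookup helper makes the hang go away.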

Sorry for the dumb questions, I don't often work on QEMU ;-)