Message-ID: <8215aeb3-57dd-223a-29d3-45ca22b0543c@c-s.fr>
Date: Sat, 26 Oct 2019 13:20:06 +0200
From: Christophe Leroy <christophe.leroy@....fr>
To: "Wangshaobo (bobo)" <bobo.shaobowang@...wei.com>
Cc: "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
"alistair@...ple.id.au" <alistair@...ple.id.au>,
"chengjian (D)" <cj.chengjian@...wei.com>,
Xiexiuqi <xiexiuqi@...wei.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"oss@...error.net" <oss@...error.net>,
"paulus@...ba.org" <paulus@...ba.org>,
"Libin (Huawei)" <huawei.libin@...wei.com>,
"agust@...x.de" <agust@...x.de>,
"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>
Subject: Re: loop nesting in alignment exception and machine check
Hi,
On 26/10/2019 at 09:23, Wangshaobo (bobo) wrote:
> Hi,
>
> I encountered a problem where a nested loop of exceptions occurs when an
> alignment exception is raised in machine check context. The trigger is:
>
> problem:
>
> machine check or critical interrupt ->…-> kbox_write [for recording
> last words] -> memcpy(ioremap_addr, src, size): _GLOBAL(memcpy)…
>
> When we enter memcpy, the instruction ‘dcbz r11,r6’ causes an alignment
> exception. In this situation r11 holds the ioremap address, which is what
> leads to the alignment exception.
You can't use memcpy() on anything other than ordinary memory.
For an ioremapped area, you have to use memcpy_toio().
Christophe
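
To make the advice above concrete, here is a minimal userspace sketch of
the idea, not the kernel implementation. In the kernel the call would be
memcpy_toio(dst, src, size) from <linux/io.h>, with dst being the pointer
returned by ioremap(). The names sketch_memcpy_toio() and
sketch_kbox_write() are hypothetical stand-ins invented for illustration;
the real kbox_write() is not shown in this thread. The point is that the
copy uses plain stores through a volatile pointer, so no cache-block
instruction such as dcbz is issued on the destination.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel's memcpy_toio().
 * It copies with plain byte stores through a volatile pointer, so the
 * compiler cannot merge or replace the accesses, and the copy never
 * relies on cache-block instructions such as dcbz.
 */
void sketch_memcpy_toio(volatile void *dst, const void *src, size_t n)
{
    volatile uint8_t *d = dst;
    const uint8_t *s = src;

    assert(dst != NULL && src != NULL);
    while (n--)
        *d++ = *s++;
}

/*
 * Hypothetical shape of a kbox_write()-like helper after the fix.
 * kbox_io stands for the ioremapped window (an assumption); in kernel
 * code the copy below would be memcpy_toio(kbox_io, src, size).
 */
size_t sketch_kbox_write(volatile void *kbox_io, size_t kbox_size,
                         const void *src, size_t size)
{
    if (size > kbox_size)
        size = kbox_size;   /* never write past the mapped window */
    sketch_memcpy_toio(kbox_io, src, size);
    return size;            /* number of bytes actually written */
}
```

The byte-wise loop is deliberately simple. A real implementation may use
wider accesses where alignment allows, but it still avoids treating the
destination as cacheable memory.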
>
> Then the instruction cannot complete successfully, as we are still in
> machine check context. In the end it triggers a new machine check inside
> the interrupt handler function, and a nested loop begins.
>
> analysis:
>
> We have analysed this at length but still cannot come up with a
> reasonable explanation. Normally, an alignment exception triggered in
> machine check context can still be recorded in the kbox after the
> alignment exception has been handled by its handler function. But how
> can a machine check be triggered inside that handler function? We
> printed the relevant registers on first entry to the machine check
> handler and to the alignment exception handler, as follows:
>
> Register   machine check   alignment exception
> MSR        0x2             0x0
> SRR1       0x2             0x21002
>
> But the manual says SRR1 should be set to the MSR value (0x2). Why did
> that happen?
>
> Then a branch in the handler function copies SRR1 to MSR, which
> enables MSR[ME] and MSR[CE], and the system crashes.
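
The register values quoted above can be checked against the two enable
bits mentioned. The following is a minimal sketch, assuming a Book-E
style core where MSR[CE] is mask 0x00020000 and MSR[ME] is mask
0x00001000, as in the Linux powerpc headers. The platform is not stated
in this thread, so these masks are an assumption, and the SKETCH_ and
sketch_ names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed Book-E masks; the SKETCH_ names are hypothetical. */
#define SKETCH_MSR_CE 0x00020000u /* critical interrupt enable */
#define SKETCH_MSR_ME 0x00001000u /* machine check enable */

static_assert((SKETCH_MSR_CE & SKETCH_MSR_ME) == 0,
              "enable masks must not overlap");

/* True if the critical-interrupt enable bit is set in msr. */
bool sketch_msr_ce(uint32_t msr)
{
    return (msr & SKETCH_MSR_CE) != 0;
}

/* True if the machine-check enable bit is set in msr. */
bool sketch_msr_me(uint32_t msr)
{
    return (msr & SKETCH_MSR_ME) != 0;
}

/* Bits that remain once CE and ME are masked out. */
uint32_t sketch_msr_other(uint32_t msr)
{
    return msr & ~(SKETCH_MSR_CE | SKETCH_MSR_ME);
}
```

Under that assumption, 0x21002 has both CE and ME set while 0x2 has
neither, which is consistent with the report that copying SRR1 into MSR
re-enables both kinds of interrupt.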
>
> Conclusion:
>
> 1) Why can't the alignment exception be handled in machine check
> context?
>
> 2) Besides memcpy, can any other function cause the alignment
> exception?
>
> We can still reproduce it; the sequence is as follows:
>
> CPU deadlock -> watch log -> trigger FIQ -> kbox_write -> memcpy ->
> alignment exception -> print last words.
>
> But for the problem cases below, what the kbox printed is empty.
>
> ------------------kbox restart:[ 10.147594]----------------
>
> kbox verify fs magic fail
>
> kbox mem mabye destroyed, format it
>
> kbox: load OK
>
> lock-task: major[249] minor[0]
>
> -----start show_destroyed_kbox_mem_head----
>
> 00000000: 00000000 00000000 00000000 00000000 ................
>
> 00000010: 00000000 00000000 00000000 00000000 ................
>
> 00000020: 00000000 00000000 00000000 00000000 ................
>
> 00000030: 00000000 00000000 00000000 00000000 ................
>
> 00000040: 00000000 00000000 00000000 00000000 ................
>
> 00000050: 00000000 00000000 00000000 00000000 ................
>
> 00000060: 00000000 00000000 00000000 00000000 ................
>
> 00000070: 00000000 00000000 00000000 00000000 ................
>
> 00000080: 00000000 00000000 00000000 00000000 ................
>
> 00000090: 00000000 00000000 00000000 00000000 ................
>