Message-ID: <a77f2266-b654-087c-7af8-78c745a52b37@c-s.fr>
Date: Tue, 26 Nov 2019 09:13:28 +0100
From: Christophe Leroy <christophe.leroy@....fr>
To: "Wangshaobo (bobo)" <bobo.shaobowang@...wei.com>
Cc: "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
"alistair@...ple.id.au" <alistair@...ple.id.au>,
"chengjian (D)" <cj.chengjian@...wei.com>,
Xiexiuqi <xiexiuqi@...wei.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"oss@...error.net" <oss@...error.net>,
"paulus@...ba.org" <paulus@...ba.org>,
"Libin (Huawei)" <huawei.libin@...wei.com>,
"agust@...x.de" <agust@...x.de>,
"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>
Subject: Re: RE: RE: loop nesting in alignment exception and machine check
Hi,
On 01/11/2019 at 02:57, Wangshaobo (bobo) wrote:
> Hi, Christophe
>
> I am sorry for the slow reply; we ran into some unpredictable problems while reproducing the issue.
>
> I also want to ask: does this phenomenon (having to use memcpy_toio() when copying to an ioremap address) occur only on powerpc? Do any other
> arches have the same problem? We are trying to find out why this phenomenon happens. Our Linux kernel version is 4.4.
It's not a problem ... it's a feature.
I have no idea whether the same kind of issue can happen on other
arches, sorry.
Christophe
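
As background on why the copy routine matters here: the optimized powerpc memcpy() may use dcbz to clear destination cache lines, and dcbz on a cache-inhibited mapping such as an ioremapped area raises an alignment exception. memcpy_toio() avoids that. The block below is only a userspace sketch of the idea, not the kernel implementation. sketch_memcpy_toio() is a hypothetical name introduced for illustration; it copies byte by byte through a volatile pointer, so the compiler cannot turn the loop into a call to an optimized memcpy().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel's memcpy_toio().
 * Assumption: a plain byte-wise store loop is enough to show the idea.
 * The real kernel routine also deals with ordering and wider accesses.
 */
void sketch_memcpy_toio(volatile void *dst, const void *src, size_t n)
{
	volatile uint8_t *d = (volatile uint8_t *)dst;
	const uint8_t *s = (const uint8_t *)src;

	/* One store per byte; no cache-line instructions are involved. */
	while (n--)
		*d++ = *s++;
}
```

In kernel code, the equivalent change in the reported call path would be to call the real memcpy_toio(ioremap_addr, src, size) from <linux/io.h> instead of memcpy().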
>
> Thanks very much.
>
> -----Original Message-----
> From: Christophe Leroy [mailto:christophe.leroy@....fr]
> Sent: 31 October 2019 19:13
> To: Wangshaobo (bobo) <bobo.shaobowang@...wei.com>
> Cc: chengjian (D) <cj.chengjian@...wei.com>; Libin (Huawei) <huawei.libin@...wei.com>; Xiexiuqi <xiexiuqi@...wei.com>; zhangyi (F) <yi.zhang@...wei.com>
> Subject: Re: RE: loop nesting in alignment exception and machine check
>
> Hi,
>
> Did you try? Does it work?
>
> Christophe
>
> On 28/10/2019 at 06:57, Wangshaobo (bobo) wrote:
>> Hi, Christophe
>>
>> Thank you for your quick reply. I will try to use memcpy_toio() instead of memcpy().
>>
>> -----Original Message-----
>> From: Christophe Leroy [mailto:christophe.leroy@....fr]
>> Sent: 26 October 2019 19:20
>> To: Wangshaobo (bobo) <bobo.shaobowang@...wei.com>
>> Cc: linux-arch@...r.kernel.org; alistair@...ple.id.au; chengjian (D)
>> <cj.chengjian@...wei.com>; Xiexiuqi <xiexiuqi@...wei.com>;
>> linux-kernel@...r.kernel.org; oss@...error.net; paulus@...ba.org;
>> Libin (Huawei) <huawei.libin@...wei.com>; agust@...x.de;
>> linuxppc-dev@...ts.ozlabs.org
>> Subject: Re: loop nesting in alignment exception and machine check
>>
>> Hi,
>>
>> On 26/10/2019 at 09:23, Wangshaobo (bobo) wrote:
>>> Hi,
>>>
>>> I encountered a problem where an alignment exception raised inside a
>>> machine check leads to a loop of nested exceptions. The trigger background is:
>>>
>>> problem:
>>>
>>> machine check or critical interrupt ->…->kbox_write [for recording
>>> last words] -> memcpy(ioremap_addr, src, size): _GLOBAL(memcpy)…
>>>
>>> When we enter memcpy, the instruction ‘dcbz r11,r6’ causes an alignment
>>> exception. In this situation r11 holds the ioremap address, which
>>> leads to the alignment exception,
>>
>> You can't use memcpy() on anything other than memory.
>>
>> For an ioremapped area, you have to use memcpy_toio().
>>
>> Christophe
>>
>>>
>>> then the instruction cannot complete successfully, as we are still in
>>> the machine check. In the end, it triggers a new machine check irq in the irq
>>> handler function, and a nested loop begins.
>>>
>>> analysis:
>>>
>>> We have analysed this a lot, but still cannot come to a reasonable
>>> explanation. Normally, an alignment exception triggered in machine check
>>> context can still be recorded in the kbox
>>> after the alignment exception has been handled by its handler function. But how
>>> can a machine check be triggered inside that handler function
>>> at all? We print the relevant registers
>>> as follows on first entry to the machine check handler and to the
>>> alignment exception handler function:
>>>
>>> MSR:0x2 MSR:0x0
>>>
>>> SRR1:0x2 SRR1:0x21002
>>>
>>> But the manual says SRR1 should be set to the MSR value (0x2). Why
>>> did that happen?
>>>
>>> Then a branch in the handler function copies SRR1 to
>>> MSR, which enables MSR[ME] and MSR[CE], and the system collapses.
>>>
>>> Conclusion:
>>>
>>> 1) Why can the alignment exception not be handled in
>>> machine check context?
>>>
>>> 2) Besides memcpy, can any other function cause the
>>> alignment exception?
>>>
>>> We can still reproduce it; the call path is as follows:
>>>
>>> Cpu dead lock->watch log->trigger
>>> fiq->kbox_write->memcpy->alignment exception->print last words.
>>>
>>> But for the problem shown below, what the kbox printed is empty.
>>>
>>> ------------------kbox restart:[ 10.147594]----------------
>>>
>>> kbox verify fs magic fail
>>>
>>> kbox mem mabye destroyed, format it
>>>
>>> kbox: load OK
>>>
>>> lock-task: major[249] minor[0]
>>>
>>> -----start show_destroyed_kbox_mem_head----
>>>
>>> 00000000: 00000000 00000000 00000000 00000000 ................
>>>
>>> 00000010: 00000000 00000000 00000000 00000000 ................
>>>
>>> 00000020: 00000000 00000000 00000000 00000000 ................
>>>
>>> 00000030: 00000000 00000000 00000000 00000000 ................
>>>
>>> 00000040: 00000000 00000000 00000000 00000000 ................
>>>
>>> 00000050: 00000000 00000000 00000000 00000000 ................
>>>
>>> 00000060: 00000000 00000000 00000000 00000000 ................
>>>
>>> 00000070: 00000000 00000000 00000000 00000000 ................
>>>
>>> 00000080: 00000000 00000000 00000000 00000000 ................
>>>
>>> 00000090: 00000000 00000000 00000000 00000000 ................
>>>