linux-kernel - Re: WARNING: CPU: 0 PID: 1271 at drivers/mmc/core/core.c:991 mmc_release

Message-ID: <9de143ba-2122-3242-0576-e0c9454ba2d3@rock-chips.com>
Date:	Fri, 12 Aug 2016 16:30:12 +0800
From:	Shawn Lin <shawn.lin@...k-chips.com>
To:	Jaehoon Chung <jh80.chung@...sung.com>,
	John Stultz <john.stultz@...aro.org>,
	Ulf Hansson <ulf.hansson@...aro.org>
Cc:	shawn.lin@...k-chips.com, Guodong Xu <guodong.xu@...aro.org>,
	lkml <linux-kernel@...r.kernel.org>,
	Zhangfei Gao <zhangfei.gao@...aro.org>
Subject: Re: WARNING: CPU: 0 PID: 1271 at drivers/mmc/core/core.c:991
 mmc_release_host+0xa0/0xa8

On 2016/8/12 16:01, Jaehoon Chung wrote:
> On 08/12/2016 04:13 PM, John Stultz wrote:
>> On Thu, Aug 4, 2016 at 9:52 PM, John Stultz <john.stultz@...aro.org> wrote:
>>> Hey Ulf,
>>>   Since moving my HiKey branch to pre-v4.8-rc1 (linus's HEAD), I've
>>> been seeing the following warning occasionally. Usually after seeing
>>> it, the system will refuse to reboot (system does the "Emergency
>>> remount complete" but then just sits there, and if I ctrl-c I can use
>>> the shell fine but many commands will get me stuck).
>>>
>>> Anyway, if you have any ideas...
>>>
>>> thanks
>>> -john
>>>
>>> [   24.154245] ------------[ cut here ]------------
>>> [   24.158903] WARNING: CPU: 2 PID: 1273 at drivers/mmc/core/core.c:991 mmc_release_host+0xa0/0xa8
>>> [   24.167605]
>>> [   24.169104] CPU: 2 PID: 1273 Comm: mmcqd/0 Not tainted 4.7.0-11945-gb30f1d6-dirty #706
>>> [   24.177024] Hardware name: HiKey Development Board (DT)
>>> [   24.182253] task: ffffffc0793d8c80 task.stack: ffffffc078c48000
>>> [   24.188178] PC is at mmc_release_host+0xa0/0xa8
>>> [   24.192725] LR is at mmc_put_card+0x18/0x3c
>>> [   24.196917] pc : [<ffffff80086c2550>] lr : [<ffffff80086c31f4>] pstate: 80000145
>>> [   24.204317] sp : ffffffc078c4bd20
>>> [   24.207636] x29: ffffffc078c4bd20 x28: 0000000000000000
>>> [   24.212975] x27: 0000000000000000 x26: ffffffc077903420
>>> [   24.216220] x25: ffffffc078788028 x24: ffffffc0787e8800
>>> [   24.216232] x23: ffffffc078788000 x22: 0000000000000000
>>> [   24.216243] x21: 0000000000000000 x20: ffffffc078788018
>>> [   24.216254] x19: ffffffc0787e8800 x18: 0000000000000000
>>> [   24.216265] x17: 0000000000000000 x16: 0000000000000000
>>> [   24.216276] x15: 0000000000000000 x14: ffffffc078789430
>>> [   24.216288] x13: 000000000000002f x12: 000000000000b853
>>> [   24.216299] x11: ffffffc077903420 x10: 0000000000000860
>>> [   24.216310] x9 : ffffffc078c48000 x8 : ffffffc0793d9540
>>> [   24.216322] x7 : 0000000000d3f8c7 x6 : 0000000000002bd0
>>> [   24.216333] x5 : 00000000021458fa x4 : 00ffffffffffffff
>>> [   24.216344] x3 : 00000000d0555555 x2 : ffffffc078c4bd5c
>>> [   24.216355] x1 : 0000000000000000 x0 : 0000000000000000
>>> [   24.216366]
>>> [   24.216372] ---[ end trace 74dade4766b71d8d ]---
>>> [   24.216377] Call trace:
>>> [   24.216386] Exception stack(0xffffffc078c4bb50 to 0xffffffc078c4bc80)
>>> [   24.216394] bb40:                                   ffffffc0787e8800 0000008000000000
>>> [   24.216403] bb60: ffffffc078c4bd20 ffffff80086c2550 ffffff8008ca6000 ffffffc0784fb200
>>> [   24.216411] bb80: ffffffc07bf4b7e8 0000000000000008 ffffffc0793d8d00 ffffff8008c82780
>>> [   24.216420] bba0: ffffffc078c4bbe0 ffffff800843576c ffffffc078c4bbf0 ffffff800843576c
>>> [   24.216429] bbc0: ffffffc078c4bcc0 ffffffc078c4bc78 ffffffc078c4bc10 ffffff800843576c
>>> [   24.216437] bbe0: ffffffc078c4bce0 ffffffc078c4bc98 0000000000000000 0000000000000000
>>> [   24.216445] bc00: ffffffc078c4bd5c 00000000d0555555 00ffffffffffffff 00000000021458fa
>>> [   24.216452] bc20: 0000000000002bd0 0000000000d3f8c7 ffffffc0793d9540 ffffffc078c48000
>>> [   24.216460] bc40: 0000000000000860 ffffffc077903420 000000000000b853 000000000000002f
>>> [   24.216467] bc60: ffffffc078789430 0000000000000000 0000000000000000 0000000000000000
>>> [   24.216479] [<ffffff80086c2550>] mmc_release_host+0xa0/0xa8
>>> [   24.216486] [<ffffff80086c31f4>] mmc_put_card+0x18/0x3c
>>> [   24.216497] [<ffffff80086d30e4>] mmc_blk_issue_rq+0x11c/0x4a4
>>> [   24.216506] [<ffffff80086d3e44>] mmc_queue_thread+0x98/0x158
>>> [   24.216517] [<ffffff80080cfd7c>] kthread+0xd0/0xe4
>>> [   24.216527] [<ffffff8008082e90>] ret_from_fork+0x10/0x40
>>
>>
>> Hey Ulf,
>>   I *think* I've narrowed this down to
>> 6024e16654c1e1a2475e848d735963d05a12dba9 ("mmc: dw_mmc: set to
>> MMC_CAP_ERASE by default"). It's fairly sporadic so I may be seeing
>> this as a false positive, but after reverting that patch I've
>> seemingly stopped seeing the issue.
>
> Hmm, I don't think so. I *guess* it's not related to commit 6024e16654.
>
> Before calling mmc_put_card(), was a discard request issued?
>
>         if ((!req && !(mq->flags & MMC_QUEUE_NEW_REQUEST)) ||
>              (cmd_flags & MMC_REQ_SPECIAL_MASK))
>
> Which condition was hit?

If the special-request condition is hit, mrq_pre and mrq_cur are both
NULL after the queue is scheduled, and host->claimed is released for
that special req. For the next req peeked from the block layer, we go
through mmc_get_card() again, which means we should never hit this WARN
when releasing the host. So it would be interesting to dig out what is
actually happening there...
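
To make the argument above concrete, here is a rough userspace model of
the claim/release accounting. It is only a sketch, not the kernel code:
struct model_host, model_claim(), model_release() and run_sequence() are
names made up for illustration, and the locking and wait queue handling
of the real mmc_claim_host()/mmc_release_host() are left out. As far as
I can tell, the WARN in question fires when the host is released while
host->claimed is not set, which is what model_release() checks.

```c
/* Userspace model of host claim/release balance; NOT the kernel code. */
#include <stdbool.h>
#include <stdio.h>

struct model_host {          /* hypothetical stand-in for struct mmc_host */
    bool claimed;            /* true while some context owns the host */
    int  claim_cnt;          /* nesting depth of outstanding claims */
    int  warnings;           /* how many times the release check fired */
};

static void model_claim(struct model_host *h)
{
    h->claimed = true;
    h->claim_cnt++;
}

static void model_release(struct model_host *h)
{
    if (!h->claimed) {
        /* Models the warning seen in the trace. Unlike the kernel,
         * the model bails out here to keep its counters sane. */
        h->warnings++;
        fprintf(stderr, "WARN: release without a matching claim\n");
        return;
    }
    if (--h->claim_cnt == 0)
        h->claimed = false;
}

/* Replay a sequence of operations: 'c' = claim, 'r' = release.
 * Returns how many times the release-time warning fired. */
static int run_sequence(const char *ops)
{
    struct model_host h = { false, 0, 0 };

    for (; *ops; ops++) {
        if (*ops == 'c')
            model_claim(&h);
        else if (*ops == 'r')
            model_release(&h);
    }
    return h.warnings;
}
```

With this model, run_sequence("crcr") (claim, release for the special
req, claim again for the next req, release) gives no warning, while
run_sequence("crr") gives one. So the trace suggests that some path
releases the host twice for one claim, or releases it without going
through mmc_get_card() first.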

But at least on dw_mmc-rockchip, we have been using this feature
(ERASE/TRIM/discard) for years, and I have never seen this warning.
Anyway, from the code I have read, I don't think this commit is the cause.
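
If someone wants to exercise the discard path on purpose instead of
waiting for it to happen, one option is to issue the FITRIM ioctl on a
mounted filesystem, which is what the fstrim tool does. A minimal
sketch follows; trim_mountpoint() is a hypothetical helper name, it
needs root and a filesystem that supports discard, and I have not run
it on HiKey.

```c
#include <fcntl.h>
#include <linux/fs.h>     /* FITRIM, struct fstrim_range */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask the filesystem mounted at `mountpoint` to discard its free space.
 * Returns 0 and stores the number of bytes trimmed in *trimmed on
 * success, or -1 on failure (leaving *trimmed untouched). */
static int trim_mountpoint(const char *mountpoint, uint64_t *trimmed)
{
    struct fstrim_range range;
    int fd, ret;

    fd = open(mountpoint, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    memset(&range, 0, sizeof(range));
    range.len = UINT64_MAX;          /* cover the whole filesystem */

    ret = ioctl(fd, FITRIM, &range);
    if (ret < 0)
        perror("ioctl(FITRIM)");
    else if (trimmed)
        *trimmed = range.len;        /* kernel writes back bytes trimmed */

    close(fd);
    return ret < 0 ? -1 : 0;
}
```

Calling it in a loop on the eMMC-backed mount point (for example
"/data", which is just an assumption about the setup) while watching
dmesg should show whether discard requests alone are enough to trigger
the WARN.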

Please look at this regression report I saw.

https://lkml.org/lkml/2016/8/11/130

>
>>
>> Anyway, I'll do some further testing tomorrow w/ that removed. Usually
>> I see the issue 1-2 times an hour, so if I go the day w/o a problem
>> I'll let you know.
>>
>> Zhangfei/Guodong: Any ideas as to why ERASE might cause trouble on HiKey?
>
> Did you try to send the Erase command directly, e.g. with fstrim or something similar?
> Does it occur on every boot?
>
> Best Regards,
> Jaehoon Chung
>
>>
>> thanks
>> -john
>>
>>
>>
>
>
>
>


-- 
Best Regards
Shawn Lin