lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bc65b542-9703-aa04-f1c7-b10584e44391@gmail.com>
Date:   Wed, 21 Feb 2018 23:50:01 +0300
From:   Dmitry Osipenko <digetx@...il.com>
To:     Adrian Hunter <adrian.hunter@...el.com>,
        Ulf Hansson <ulf.hansson@...aro.org>
Cc:     linux-mmc <linux-mmc@...r.kernel.org>,
        linux-block <linux-block@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Bough Chen <haibo.chen@....com>,
        Alex Lemberg <alex.lemberg@...disk.com>,
        Mateusz Nowak <mateusz.nowak@...el.com>,
        Yuliy Izrailov <Yuliy.Izrailov@...disk.com>,
        Jaehoon Chung <jh80.chung@...sung.com>,
        Dong Aisheng <dongas86@...il.com>,
        Das Asutosh <asutoshd@...eaurora.org>,
        Zhangfei Gao <zhangfei.gao@...il.com>,
        Sahitya Tummala <stummala@...eaurora.org>,
        Harjani Ritesh <riteshh@...eaurora.org>,
        Venu Byravarasu <vbyravarasu@...dia.com>,
        Linus Walleij <linus.walleij@...aro.org>,
        Shawn Lin <shawn.lin@...k-chips.com>,
        Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
        Christoph Hellwig <hch@....de>
Subject: Re: [PATCH V15 06/22] mmc: block: Add blk-mq support

On 29.11.2017 16:41, Adrian Hunter wrote:
> Define and use a blk-mq queue. Discards and flushes are processed
> synchronously, but reads and writes asynchronously. In order to support
> slow DMA unmapping, DMA unmapping is not done until after the next request
> is started. That means the request is not completed until then. If there is
> no next request then the completion is done by queued work.

Hello,

I'm using (running linux-next and doing some upstream development for) some old
NVIDIA Tegra tablet that has built-in (internal) and external MMC's and with the
blk-mq being enabled I'm observing a soft lockup. The lockup seems is
reproducible quite reliably by running fsck on any MMC partition, sometimes
kernels lockups on boot during probing partitions table (weirdly only when both
SDHCI's are present, i.e. internal storage enabled in DT and external SD is
inserted/enabled) and it also lockups pretty quickly in a case of just a general
use. Reverting mmc/ commits up to 1bec43a3b18 ("Remove option not to use
blk-mq") and disabling CONFIG_MMC_MQ_DEFAULT makes everything working fine
again. There is also a third SDHCI populated with built-in WiFi/Bluetooth SDIO
and I'm observing odd MMC timeouts with the blk-mq enabled, disabling
CONFIG_MMC_MQ_DEFAULT fixes these timeouts as well.

Any thoughts?

WiFi issue
========================

[   38.247006] mmc2: Timeout waiting for hardware interrupt.
[   38.247027] brcmfmac: brcmf_escan_timeout: timer expired
[   38.247036] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
[   38.247047] mmc2: sdhci: Sys addr:  0x00000000 | Version:  0x00000001
[   38.247055] mmc2: sdhci: Blk size:  0x00007008 | Blk cnt:  0x00000000
[   38.247062] mmc2: sdhci: Argument:  0x21000008 | Trn mode: 0x00000013
[   38.247070] mmc2: sdhci: Present:   0x01d70000 | Host ctl: 0x00000013
[   38.247077] mmc2: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
[   38.247084] mmc2: sdhci: Wake-up:   0x00000000 | Clock:    0x00000007
[   38.247091] mmc2: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
[   38.247098] mmc2: sdhci: Int enab:  0x02ff000b | Sig enab: 0x02fc000b
[   38.247105] mmc2: sdhci: AC12 err:  0x00000000 | Slot int: 0x00000000
[   38.247112] mmc2: sdhci: Caps:      0x61ff30b0 | Caps_1:   0x00000000
[   38.247119] mmc2: sdhci: Cmd:       0x0000353a | Max curr: 0x00000001
[   38.247126] mmc2: sdhci: Resp[0]:   0x00001800 | Resp[1]:  0x08002db5
[   38.247133] mmc2: sdhci: Resp[2]:   0x16da8000 | Resp[3]:  0x00000400
[   38.247139] mmc2: sdhci: Host ctl2: 0x00000000
[   38.247146] mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x17c47200
[   38.247152] mmc2: sdhci: ============================================
[   38.247250] brcmfmac: brcmf_sdio_readframes: read 520 bytes from channel 1
failed: -84
[   38.247274] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
[   40.807019] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[   40.807042] brcmfmac: brcmf_notify_escan_complete: Scan abort failed
[   48.487007] mmc2: Timeout waiting for hardware interrupt.
[   48.487057] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
[   48.487096] mmc2: sdhci: Sys addr:  0x00000000 | Version:  0x00000001
[   48.487128] mmc2: sdhci: Blk size:  0x00007040 | Blk cnt:  0x00000001
[   48.487160] mmc2: sdhci: Argument:  0x21000040 | Trn mode: 0x00000013
[   48.487191] mmc2: sdhci: Present:   0x01d70000 | Host ctl: 0x00000013
[   48.487221] mmc2: sdhci: Power:     0x00000001 | Blk gap:  0x00000000
[   48.487251] mmc2: sdhci: Wake-up:   0x00000000 | Clock:    0x00000007
[   48.487281] mmc2: sdhci: Timeout:   0x0000000e | Int stat: 0x00000000
[   48.487313] mmc2: sdhci: Int enab:  0x02ff000b | Sig enab: 0x02fc000b
[   48.487343] mmc2: sdhci: AC12 err:  0x00000000 | Slot int: 0x00000000
[   48.487374] mmc2: sdhci: Caps:      0x61ff30b0 | Caps_1:   0x00000000
[   48.487404] mmc2: sdhci: Cmd:       0x0000353a | Max curr: 0x00000001
[   48.487435] mmc2: sdhci: Resp[0]:   0x00001000 | Resp[1]:  0x08002db5
[   48.487466] mmc2: sdhci: Resp[2]:   0x16da8000 | Resp[3]:  0x00000400
[   48.487493] mmc2: sdhci: Host ctl2: 0x00000000
[   48.487525] mmc2: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x17c47200
[   48.487552] mmc2: sdhci: ============================================
[   48.487749] brcmfmac: brcmf_sdio_readframes: read 480 bytes from channel 1
failed: -84
[   48.487822] brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK


Soft lockup issue
========================

# fsck -f /dev/disk/by-uuid/6768309f-3545-49d5-9ac7-d5be24d35ef2
fsck из util-linux 2.30.2
e2fsck 1.43.9 (8-Feb-2018)
Проход 1: Проверка inodes, блокs, а также размеров
Pass 2: Checking каталог structure
Pass 3: Checking каталог connectivity
Pass 4: Checking reference counts
Pass 5: Checking группа summary information
....

[  125.925436] INFO: task kworker/0:3H:263 blocked for more than 60 seconds.
[  125.925496]       Not tainted
4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041
[  125.925530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[  125.925562] kworker/0:3H    D    0   263      2 0x00000000
[  125.925653] Workqueue: kblockd mmc_blk_mq_complete_work
[  125.925747] [<c0b991cc>] (__schedule) from [<c0b998f0>] (schedule+0x60/0xcc)
[  125.925805] [<c0b998f0>] (schedule) from [<c086c870>]
(__mmc_claim_host+0xdc/0x24c)
[  125.925849] [<c086c870>] (__mmc_claim_host) from [<c08750bc>]
(mmc_start_bkops+0x4c/0x190)
[  125.925895] [<c08750bc>] (mmc_start_bkops) from [<c087d254>]
(mmc_blk_urgent_bkops+0x48/0x5c)
[  125.925945] [<c087d254>] (mmc_blk_urgent_bkops) from [<c087f2e0>]
(mmc_blk_mq_complete_prev_req.part.5+0x74/0x210)
[  125.925995] [<c087f2e0>] (mmc_blk_mq_complete_prev_req.part.5) from
[<c08813d0>] (mmc_blk_mq_complete_work+0x30/0x34)
[  125.926049] [<c08813d0>] (mmc_blk_mq_complete_work) from [<c014306c>]
(process_one_work+0x1f8/0x584)
[  125.926093] [<c014306c>] (process_one_work) from [<c014402c>]
(worker_thread+0x68/0x5d4)
[  125.926144] [<c014402c>] (worker_thread) from [<c014965c>] (kthread+0x178/0x184)
[  125.926188] [<c014965c>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[  125.926209] Exception stack(0xd579ffb0 to 0xd579fff8)
[  125.926239] ffa0:                                     00000000 00000000
00000000 00000000
[  125.926272] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000
[  125.926301] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  125.926361] INFO: task fsck.ext4:471 blocked for more than 60 seconds.
[  125.926399]       Not tainted
4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041
[  125.926427] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[  125.926455] fsck.ext4       D    0   471    470 0x00000000
[  125.926536] [<c0b991cc>] (__schedule) from [<c0b998f0>] (schedule+0x60/0xcc)
[  125.926593] [<c0b998f0>] (schedule) from [<c015823c>] (io_schedule+0x20/0x40)
[  125.926654] [<c015823c>] (io_schedule) from [<c0220acc>]
(wait_on_page_bit+0x120/0x144)
[  125.926705] [<c0220acc>] (wait_on_page_bit) from [<c0220bc4>]
(__filemap_fdatawait_range+0xd4/0x114)
[  125.926757] [<c0220bc4>] (__filemap_fdatawait_range) from [<c0223784>]
(file_write_and_wait_range+0x98/0xb4)
[  125.926810] [<c0223784>] (file_write_and_wait_range) from [<c02c0300>]
(blkdev_fsync+0x2c/0x5c)
[  125.926867] [<c02c0300>] (blkdev_fsync) from [<c02b7ce8>]
(vfs_fsync_range+0x4c/0xb0)
[  125.926912] [<c02b7ce8>] (vfs_fsync_range) from [<c02b7dd4>] (do_fsync+0x4c/0x74)
[  125.926954] [<c02b7dd4>] (do_fsync) from [<c02b80a0>] (SyS_fsync+0x1c/0x20)
[  125.926996] [<c02b80a0>] (SyS_fsync) from [<c0101000>]
(ret_fast_syscall+0x0/0x54)
[  125.927016] Exception stack(0xd5631fa8 to 0xd5631ff0)
[  125.927050] 1fa0:                   00480fc8 00481108 00000004 00481108
00000000 00000000
[  125.927086] 1fc0: 00480fc8 00481108 00000000 00000076 00000000 7f2bb750
00483268 bec602e8
[  125.927113] 1fe0: 00000076 bec60288 b6de5e8b b6d67cf6
[  177.015618] random: crng init done
[  187.365434] INFO: task kworker/0:3H:263 blocked for more than 60 seconds.
[  187.365491]       Not tainted
4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041
[  187.365524] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[  187.365556] kworker/0:3H    D    0   263      2 0x00000000
[  187.365647] Workqueue: kblockd mmc_blk_mq_complete_work
[  187.365741] [<c0b991cc>] (__schedule) from [<c0b998f0>] (schedule+0x60/0xcc)
[  187.365798] [<c0b998f0>] (schedule) from [<c086c870>]
(__mmc_claim_host+0xdc/0x24c)
[  187.365842] [<c086c870>] (__mmc_claim_host) from [<c08750bc>]
(mmc_start_bkops+0x4c/0x190)
[  187.365887] [<c08750bc>] (mmc_start_bkops) from [<c087d254>]
(mmc_blk_urgent_bkops+0x48/0x5c)
[  187.365936] [<c087d254>] (mmc_blk_urgent_bkops) from [<c087f2e0>]
(mmc_blk_mq_complete_prev_req.part.5+0x74/0x210)
[  187.365986] [<c087f2e0>] (mmc_blk_mq_complete_prev_req.part.5) from
[<c08813d0>] (mmc_blk_mq_complete_work+0x30/0x34)
[  187.366039] [<c08813d0>] (mmc_blk_mq_complete_work) from [<c014306c>]
(process_one_work+0x1f8/0x584)
[  187.366083] [<c014306c>] (process_one_work) from [<c014402c>]
(worker_thread+0x68/0x5d4)
[  187.366134] [<c014402c>] (worker_thread) from [<c014965c>] (kthread+0x178/0x184)
[  187.366178] [<c014965c>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[  187.366200] Exception stack(0xd579ffb0 to 0xd579fff8)
[  187.366229] ffa0:                                     00000000 00000000
00000000 00000000
[  187.366262] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000
[  187.366291] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  187.366350] INFO: task fsck.ext4:471 blocked for more than 60 seconds.
[  187.366388]       Not tainted
4.16.0-rc2-next-20180220-00101-gaefde91c3955-dirty #2041
[  187.366416] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[  187.366445] fsck.ext4       D    0   471    470 0x00000000
[  187.366526] [<c0b991cc>] (__schedule) from [<c0b998f0>] (schedule+0x60/0xcc)
[  187.366582] [<c0b998f0>] (schedule) from [<c015823c>] (io_schedule+0x20/0x40)
[  187.366644] [<c015823c>] (io_schedule) from [<c0220acc>]
(wait_on_page_bit+0x120/0x144)
[  187.366693] [<c0220acc>] (wait_on_page_bit) from [<c0220bc4>]
(__filemap_fdatawait_range+0xd4/0x114)
[  187.366745] [<c0220bc4>] (__filemap_fdatawait_range) from [<c0223784>]
(file_write_and_wait_range+0x98/0xb4)
[  187.366799] [<c0223784>] (file_write_and_wait_range) from [<c02c0300>]
(blkdev_fsync+0x2c/0x5c)
[  187.366857] [<c02c0300>] (blkdev_fsync) from [<c02b7ce8>]
(vfs_fsync_range+0x4c/0xb0)
[  187.366902] [<c02b7ce8>] (vfs_fsync_range) from [<c02b7dd4>] (do_fsync+0x4c/0x74)
[  187.366944] [<c02b7dd4>] (do_fsync) from [<c02b80a0>] (SyS_fsync+0x1c/0x20)
[  187.366987] [<c02b80a0>] (SyS_fsync) from [<c0101000>]
(ret_fast_syscall+0x0/0x54)
[  187.367008] Exception stack(0xd5631fa8 to 0xd5631ff0)
[  187.367041] 1fa0:                   00480fc8 00481108 00000004 00481108
00000000 00000000
[  187.367078] 1fc0: 00480fc8 00481108 00000000 00000076 00000000 7f2bb750
00483268 bec602e8
[  187.367103] 1fe0: 00000076 bec60288 b6de5e8b b6d67cf6

-- 
Dmitry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ