Message-ID: <9269fbbf-b5dd-6be1-682f-e791847ea00d@samsung.com>
Date: Thu, 21 Feb 2019 11:22:39 +0100
From: Marek Szyprowski <m.szyprowski@...sung.com>
To: Ming Lei <ming.lei@...hat.com>
Cc: Jens Axboe <axboe@...nel.dk>, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Theodore Ts'o <tytso@....edu>, Omar Sandoval <osandov@...com>,
Sagi Grimberg <sagi@...mberg.me>,
Dave Chinner <dchinner@...hat.com>,
Kent Overstreet <kent.overstreet@...il.com>,
Mike Snitzer <snitzer@...hat.com>, dm-devel@...hat.com,
Alexander Viro <viro@...iv.linux.org.uk>,
linux-fsdevel@...r.kernel.org, linux-raid@...r.kernel.org,
David Sterba <dsterba@...e.com>, linux-btrfs@...r.kernel.org,
"Darrick J . Wong" <darrick.wong@...cle.com>,
linux-xfs@...r.kernel.org, Gao Xiang <gaoxiang25@...wei.com>,
Christoph Hellwig <hch@....de>, linux-ext4@...r.kernel.org,
Coly Li <colyli@...e.de>, linux-bcache@...r.kernel.org,
Boaz Harrosh <ooo@...ctrozaur.com>,
Bob Peterson <rpeterso@...hat.com>, cluster-devel@...hat.com,
Ulf Hansson <ulf.hansson@...aro.org>,
"linux-mmc@...r.kernel.org" <linux-mmc@...r.kernel.org>,
'Linux Samsung SOC' <linux-samsung-soc@...r.kernel.org>,
Krzysztof Kozlowski <krzk@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>
Subject: Re: [PATCH V15 14/18] block: enable multipage bvecs

Hi Ming,

On 2019-02-21 11:16, Ming Lei wrote:
> On Thu, Feb 21, 2019 at 11:08:19AM +0100, Marek Szyprowski wrote:
>> On 2019-02-21 10:57, Ming Lei wrote:
>>> On Thu, Feb 21, 2019 at 09:42:59AM +0100, Marek Szyprowski wrote:
>>>> On 2019-02-15 12:13, Ming Lei wrote:
>>>>> This patch pulls the trigger for multi-page bvecs.
>>>>>
>>>>> Reviewed-by: Omar Sandoval <osandov@...com>
>>>>> Signed-off-by: Ming Lei <ming.lei@...hat.com>
>>>> Since Linux next-20190218 I've observed problems with the block layer on
>>>> one of my test devices (Odroid U3 with EXT4 rootfs on SD card). Bisecting
>>>> this issue led me to this change. This is also the first linux-next
>>>> release with this change merged. The issue is fully reproducible and can
>>>> be observed in the following kernel log:
>>>>
>>>> sdhci: Secure Digital Host Controller Interface driver
>>>> sdhci: Copyright(c) Pierre Ossman
>>>> s3c-sdhci 12530000.sdhci: clock source 2: mmc_busclk.2 (100000000 Hz)
>>>> s3c-sdhci 12530000.sdhci: Got CD GPIO
>>>> mmc0: SDHCI controller on samsung-hsmmc [12530000.sdhci] using ADMA
>>>> mmc0: new high speed SDHC card at address aaaa
>>>> mmcblk0: mmc0:aaaa SL16G 14.8 GiB
>>>>
>>>> ...
>>>>
>>>> EXT4-fs (mmcblk0p2): INFO: recovery required on readonly filesystem
>>>> EXT4-fs (mmcblk0p2): write access will be enabled during recovery
>>>> EXT4-fs (mmcblk0p2): recovery complete
>>>> EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null)
>>>> VFS: Mounted root (ext4 filesystem) readonly on device 179:2.
>>>> devtmpfs: mounted
>>>> Freeing unused kernel memory: 1024K
>>>> hub 1-3:1.0: USB hub found
>>>> Run /sbin/init as init process
>>>> hub 1-3:1.0: 3 ports detected
>>>> *** stack smashing detected ***: <unknown> terminated
>>>> Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
>>>> CPU: 1 PID: 1 Comm: init Not tainted 5.0.0-rc6-next-20190218 #1546
>>>> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
>>>> [<c01118d0>] (unwind_backtrace) from [<c010d794>] (show_stack+0x10/0x14)
>>>> [<c010d794>] (show_stack) from [<c09ff8a4>] (dump_stack+0x90/0xc8)
>>>> [<c09ff8a4>] (dump_stack) from [<c0125944>] (panic+0xfc/0x304)
>>>> [<c0125944>] (panic) from [<c012bc98>] (do_exit+0xabc/0xc6c)
>>>> [<c012bc98>] (do_exit) from [<c012c100>] (do_group_exit+0x3c/0xbc)
>>>> [<c012c100>] (do_group_exit) from [<c0138908>] (get_signal+0x130/0xbf4)
>>>> [<c0138908>] (get_signal) from [<c010c7a0>] (do_work_pending+0x130/0x618)
>>>> [<c010c7a0>] (do_work_pending) from [<c0101034>] (slow_work_pending+0xc/0x20)
>>>> Exception stack(0xe88c3fb0 to 0xe88c3ff8)
>>>> 3fa0: 00000000 bea7787c 00000005 b6e8d0b8
>>>> 3fc0: bea77a18 b6f92010 b6e8d0b8 00000001 b6e8d0c8 00000001 b6e8c000 bea77b60
>>>> 3fe0: 00000020 bea77998 ffffffff b6d52368 60000050 ffffffff
>>>> CPU3: stopping
>>>>
>>>> I would like to help debug and fix this issue, but I don't really have
>>>> any idea where to start. Here is some more detailed information about
>>>> my test system:
>>>>
>>>> 1. Board: ARM 32bit Samsung Exynos4412-based Odroid U3 (device tree
>>>> source: arch/arm/boot/dts/exynos4412-odroidu3.dts)
>>>>
>>>> 2. Block device: MMC/SDHCI/SDHCI-S3C with SD card
>>>> (drivers/mmc/host/sdhci-s3c.c driver, sdhci_2 device node in the device
>>>> tree)
>>>>
>>>> 3. Rootfs: Ext4
>>>>
>>>> 4. Kernel config: arch/arm/configs/exynos_defconfig
>>>>
>>>> I can gather more logs if needed, just let me know which kernel options
>>>> to enable. Reverting this commit on top of next-20190218 as well as current
>>>> linux-next (tested with next-20190221) fixes this issue and makes the
>>>> system bootable again.
>>> Could you test the patch at the following link and see if it makes a difference?
>>>
>>> https://marc.info/?l=linux-aio&m=155070355614541&w=2
>> I've tested that patch, but it doesn't make any difference on the test
>> system. In the log I see no warning added by it.
> I guess it might be related to memory corruption. Could you enable the
> following debug options and post the dmesg log?
>
> CONFIG_DEBUG_STACKOVERFLOW=y
> CONFIG_KASAN=y
It won't be that easy, as neither of the above options is available on
32-bit ARM. I will try to apply some of the ARM KASAN patches floating
around the net and let you know the result.
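
For reference, a minimal sketch of how one could confirm which of the
requested options actually ended up in a generated .config. Kconfig
silently drops symbols whose dependencies are not met on a given
architecture, so an option may be missing even after it was requested.
The helper name config_status and the sample text are hypothetical and
for illustration only; they are not taken from the real exynos_defconfig
output.

```python
import re

def config_status(config_text, options):
    """Map each requested CONFIG_ option to its value, or None if it is absent."""
    status = {name: None for name in options}
    for line in config_text.splitlines():
        # Only "CONFIG_FOO=value" lines count; "# CONFIG_FOO is not set" is skipped.
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match and match.group(1) in status:
            status[match.group(1)] = match.group(2)
    return status

# Hypothetical .config excerpt, not real output from the test system.
sample = """\
CONFIG_ARM=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
CONFIG_STACKPROTECTOR=y
"""

wanted = ["CONFIG_DEBUG_STACKOVERFLOW", "CONFIG_KASAN", "CONFIG_STACKPROTECTOR"]
for name, value in config_status(sample, wanted).items():
    print(name, "->", value if value is not None else "not set or unavailable")
```

To run it against a real build, read the .config file from the kernel
build directory and pass its contents instead of the sample string.
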
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland