Message-ID: <5cad2529-8776-687e-58d0-4fb9e2ec59b1@amlogic.com>
Date: Mon, 25 Mar 2019 18:04:17 +0800
From: Liang Yang <liang.yang@...ogic.com>
To: Martin Blumenstingl <martin.blumenstingl@...glemail.com>,
Matthew Wilcox <willy@...radead.org>
CC: <linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>,
<linux-arm-kernel@...ts.infradead.org>,
<akpm@...ux-foundation.org>, <mhocko@...e.com>,
<rppt@...ux.ibm.com>, <linux-amlogic@...ts.infradead.org>,
<linux@...linux.org.uk>, <linux-mtd@...ts.infradead.org>
Subject: Re: 32-bit Amlogic (ARM) SoC: kernel BUG in kfree()
Hi Martin,
On 2019/3/23 5:07, Martin Blumenstingl wrote:
> Hi Matthew,
>
> On Thu, Mar 21, 2019 at 10:44 PM Matthew Wilcox <willy@...radead.org> wrote:
>>
>> On Thu, Mar 21, 2019 at 09:17:34PM +0100, Martin Blumenstingl wrote:
>>> Hello,
>>>
>>> I am experiencing the following crash:
>>> ------------[ cut here ]------------
>>> kernel BUG at mm/slub.c:3950!
>>
>> if (unlikely(!PageSlab(page))) {
>> BUG_ON(!PageCompound(page));
>>
>> You called kfree() on the address of a page which wasn't allocated by slab.
>>
>>> I have traced this crash to the kfree() in meson_nfc_read_buf().
>>> my observation is as follows:
>>> - meson_nfc_read_buf() is called 7 times without any crash, the
>>> kzalloc() call returns 0xe9e6c600 (virtual address) / 0x29e6c600
>>> (physical address)
>>> - the eighth time meson_nfc_read_buf() is called, the kzalloc() call returns
>>> 0xee39a38b (virtual address) / 0x2e39a38b (physical address) and the
>>> final kfree() crashes
>>> - changing the size in the kzalloc() call from PER_INFO_BYTE (= 8) to
>>> PAGE_SIZE works around that crash
>>
>> I suspect you're doing something which corrupts memory. Overrunning
>> the end of your allocation or something similar. Have you tried KASAN
>> or even the various slab debugging (eg redzones)?
> KASAN is not available on 32-bit ARM. there was some progress last
> year [0] but it didn't make it into mainline. I tried to make the
> patches apply again and got it to compile (and my kernel is still
> booting) but I have no idea if it's still working. for anyone
> interested, my patches are here: [1] (I consider this a HACK because I
> don't know anything about the code which is being touched in the
> patches, I only made it compile)
>
> SLAB debugging (redzones) were a great hint, thank you very much for
> that Matthew! I enabled:
> CONFIG_SLUB_DEBUG=y
> CONFIG_SLUB_DEBUG_ON=y
> and with that I now get "BUG kmalloc-64 (Not tainted): Redzone
> overwritten" (a larger kernel log extract is attached).
>
> I'm starting to wonder if the NAND controller (hardware) writes more
> than 8 bytes.
> some context: the "info" buffer allocated in meson_nfc_read_buf is
> then passed to the NAND controller IP (after using dma_map_single).
>
> Liang, how does the NAND controller know that it only has to send
> PER_INFO_BYTE (= 8) bytes when called from meson_nfc_read_buf? all
> other callers of meson_nfc_dma_buffer_setup (which passes the info
> buffer to the hardware) are using (nand->ecc.steps * PER_INFO_BYTE)
> bytes?
>
NFC_CMD_N2M and CMDRWGEN are different commands. CMDRWGEN needs the ECC
page size (1KB or 512B) and the number of pages (2, 4, 8, ...) to be set,
so it uses PER_INFO_BYTE (= 8) bytes for each ECC page.
I have never used NFC_CMD_N2M to transfer data before, because it is
very inefficient. I did an experiment with the attachment and found
no overwrite on my meson axg platform.
Martin, I would appreciate it very much if you would try the attachment
on your meson m8b platform.
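
To illustrate the sizing rule above and the kind of overrun a redzone
catches, here is a minimal user-space sketch. It is not driver code:
info_len(), guarded_alloc(), redzone_damage(), fake_device_write() and
demo_overrun() are hypothetical helpers, REDZONE_BYTES and REDZONE_FILL
are arbitrary values, and the "device" is just a memset() standing in
for a DMA write. Only PER_INFO_BYTE (= 8) is taken from this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PER_INFO_BYTE 8    /* value quoted in this thread */
#define REDZONE_BYTES 16   /* hypothetical guard size */
#define REDZONE_FILL  0xcc /* hypothetical poison value */

/* Info buffer size when each ECC step needs PER_INFO_BYTE bytes. */
static size_t info_len(unsigned int ecc_steps)
{
	return (size_t)ecc_steps * PER_INFO_BYTE;
}

/* Allocate len zeroed bytes followed by a poisoned guard region. */
static uint8_t *guarded_alloc(size_t len)
{
	uint8_t *buf = malloc(len + REDZONE_BYTES);

	if (!buf)
		return NULL;
	memset(buf, 0, len);
	memset(buf + len, REDZONE_FILL, REDZONE_BYTES);
	return buf;
}

/* Count guard bytes that no longer hold the poison value. */
static size_t redzone_damage(const uint8_t *buf, size_t len)
{
	size_t i, bad = 0;

	for (i = 0; i < REDZONE_BYTES; i++)
		if (buf[len + i] != REDZONE_FILL)
			bad++;
	return bad;
}

/* Stand-in for the hardware: writes `written` bytes, ignoring buffer size. */
static void fake_device_write(uint8_t *buf, size_t written)
{
	memset(buf, 0x5a, written);
}

/* Allocate alloc_len bytes, let the fake device write, report the damage. */
static size_t demo_overrun(size_t alloc_len, size_t written)
{
	uint8_t *buf;
	size_t bad;

	/* Keep the simulated overrun inside the guard region. */
	assert(written <= alloc_len + REDZONE_BYTES);
	buf = guarded_alloc(alloc_len);
	assert(buf != NULL);
	fake_device_write(buf, written);
	bad = redzone_damage(buf, alloc_len);
	free(buf);
	return bad;
}
```

With these helpers, demo_overrun(info_len(1), 8) reports no damaged
guard bytes, while a simulated 12-byte write into the same 8-byte
buffer damages 4 of them. This is the same idea as the "Redzone
overwritten" report, only in user space.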
>
> Regards
> Martin
>
>
> [0] https://lore.kernel.org/patchwork/cover/913212/
> [1] https://github.com/xdarklight/linux/tree/arm-kasan-hack-v5.1-rc1
>
View attachment "nand_debug.diff" of type "text/plain" (1104 bytes)