Message-ID: <e095c3dd-42e0-fc4f-5670-7db6393e96a7@quicinc.com>
Date: Mon, 22 Jan 2024 21:58:58 +0530
From: Charan Teja Kalla <quic_charante@...cinc.com>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Al Viro <viro@...iv.linux.org.uk>
CC: <akpm@...ux-foundation.org>, <willy@...radead.org>, <brauner@...nel.org>,
        <jack@...e.cz>, <linux-mm@...ck.org>, <linux-fsdevel@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] fs: improve dump_mapping() robustness



On 1/22/2024 12:47 PM, Baolin Wang wrote:
>>
>> We have also seen the below crash while printing the dentry name.
>>
>> aops:shmem_aops ino:5e029 dentry name:"dev/zero"
>> flags: 0x8000000000080006(referenced|uptodate|swapbacked|zone=2|kasantag=0x0)
>> raw: 8000000000080006 ffffffc033b1bb60 ffffffc033b1bb60 ffffff8862537600
>> raw: 0000000000000001 0000000000000000 00000003ffffffff ffffff807fe64000
>> page dumped because: migration failure
>> migrating pfn aef223 failed ret:1
>> page:000000009e72a120 refcount:3 mapcount:0 mapping:000000003325dda1 index:0x1 pfn:0xaef223
>> memcg:ffffff807fe64000
>> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>> Mem abort info:
>>    ESR = 0x0000000096000005
>>    EC = 0x25: DABT (current EL), IL = 32 bits
>>    SET = 0, FnV = 0
>>    EA = 0, S1PTW = 0
>>    FSC = 0x05: level 1 translation fault
>> Data abort info:
>>    ISV = 0, ISS = 0x00000005
>>    CM = 0, WnR = 0
>> user pgtable: 4k pages, 39-bit VAs, pgdp=000000090c12d000
>> [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
>> Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
>>
>> dentry_name+0x1f8/0x3a8
>> pointer+0x3b0/0x6b8
>> vsnprintf+0x4a4/0x65c
>> vprintk_store+0x168/0x4a8
>> vprintk_emit+0x98/0x218
>> vprintk_default+0x44/0x70
>> vprintk+0xf0/0x138
>> _printk+0x54/0x80
>> dump_mapping+0x17c/0x188
>> dump_page+0x1d0/0x2e8
>> offline_pages+0x67c/0x898
>>
>>
>>
>> I am not very comfortable with the block layer internals; to my knowledge,
>> the below is what is happening in my case:
>> memoffline                          dput()
>> (offline_pages)                     (as part of closing of the shmem file)
>> ----------------                    ----------------------------------------
>>                                     .......
>>                                     1) dentry_unlink_inode():
>>                                          hlist_del_init(&dentry->d_u.d_alias);
>>
>>                                     2) iput():
>>                                        a) inode->i_state |= I_FREEING
>>                                           .....
>>                                        b) evict_inode() -> ... ->
>>                                           shmem_undo_range():
>>                                           1) takes the folios with an
>>                                              elevated refcount
>> 3) do_migrate_range():
>>    a) Because of the elevated
>>       refcount taken in 2.b.1,
>>       migration of this page
>>       fails.
>>
>>                                           2) truncate_inode_folio() ->
>>                                              filemap_remove_folio():
>>                                              (deletes the folio from the
>>                                               page cache, sets
>>                                               page->mapping = NULL and
>>                                               drops the folio refcount)
>>
>>    b) Call dump_page():
>>       1) mapping = page_mapping(page);
>>       2) dump_mapping(mapping):
>>          a) The dentry was unlinked in 1), so the
>>             dentry_ptr taken from host->i_dentry.first
>>             is no longer a valid dentry.
>>          b) Printing the dentry name with %pd then
>>             results in the mentioned crash.
>>
>>
>> At least in this case, I think __this patchset in its current form can
>> help us__.
> 
> This looks like another case of a NULL pointer access. Thanks for the
> detailed analysis. Could you provide a Tested-by or Reviewed-by tag if it
> solves your problem?
We have seen this issue a couple of times, over 3 months back, and I am not
sure whether we will ever hit it again. Still, I will pick this patch up and,
after thorough testing, let you know about any side effects.
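
For reference, here is a minimal sketch of the guarded-copy approach I
understand the patch to take (this is only my illustration: the function
name, the exact checks and the message formats below are assumed, not taken
from the patch). Every pointer reachable from the mapping is probed with
get_kernel_nofault() before it is used, and the dentry name is copied with
strncpy_from_kernel_nofault() instead of being handed to %pd, so a stale or
unlinked dentry cannot fault inside printk:

#include <linux/dcache.h>
#include <linux/fs.h>
#include <linux/printk.h>
#include <linux/string.h>
#include <linux/uaccess.h>

/* Sketch only: names, checks and messages are illustrative. */
static void dump_mapping_sketch(const struct address_space *mapping)
{
	struct hlist_node *dentry_first;
	struct dentry *dentry_ptr;
	struct dentry dentry;
	struct inode *host;
	unsigned long ino;
	char fname[64];

	/* The mapping pointer itself may be stale, so probe it carefully. */
	if (get_kernel_nofault(host, &mapping->host) || !host) {
		pr_warn("invalid mapping:%px\n", mapping);
		return;
	}

	if (get_kernel_nofault(dentry_first, &host->i_dentry.first) ||
	    get_kernel_nofault(ino, &host->i_ino)) {
		pr_warn("invalid inode:%px\n", host);
		return;
	}

	/* The dentry may already have been unlinked from i_dentry. */
	if (!dentry_first) {
		pr_warn("ino:%lx\n", ino);
		return;
	}

	dentry_ptr = container_of(dentry_first, struct dentry, d_u.d_alias);
	if (get_kernel_nofault(dentry, dentry_ptr) || !dentry.d_name.name) {
		pr_warn("ino:%lx invalid dentry:%px\n", ino, dentry_ptr);
		return;
	}

	/*
	 * Copy the name defensively instead of passing the stack copy to
	 * %pd: the copied struct dentry can still hold stale pointers.
	 */
	if (strncpy_from_kernel_nofault(fname, dentry.d_name.name,
					sizeof(fname)) < 0)
		strscpy(fname, "<invalid>", sizeof(fname));

	pr_warn("ino:%lx dentry name(?):\"%s\"\n", ino, fname);
}

With something along these lines, the worst case in the race above should be
an "invalid dentry" style message instead of the oops.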

Thanks.
