Message-ID: <d689e8bf-6628-499e-8a11-c74ce1b1fd8b@redhat.com>
Date: Mon, 25 Mar 2024 20:06:25 +0800
From: Xiubo Li <xiubli@...hat.com>
To: David Hildenbrand <david@...hat.com>, linux-mm@...ck.org
Cc: linux-kernel@...r.kernel.org,
Ceph Development <ceph-devel@...r.kernel.org>, linux-fsdevel@...r.kernel.org
Subject: Re: kernel BUG at mm/usercopy.c:102 -- pc : usercopy_abort
On 3/25/24 18:14, David Hildenbrand wrote:
> On 25.03.24 08:45, Xiubo Li wrote:
>> Hi guys,
>>
>> We have recently been hitting the same crash frequently with the latest
>> kernel when testing kceph, and the call trace looks something like this:
>>
>> [ 1580.034891] usercopy: Kernel memory exposure attempt detected from
>> SLUB object 'kmalloc-192' (offset 82, size 499712)!
>> [ 1580.045866] ------------[ cut here ]------------
>> [ 1580.050551] kernel BUG at mm/usercopy.c:102!
>>
>> Entering kdb (current=0xffff8881211f5500, pid 172901) on processor 4
>> Oops: (null)
>> due to oops @ 0xffffffff8138cabd
>> CPU: 4 PID: 172901 Comm: fsstress Tainted: G S 6.6.0-g623393c9d50c #1
>> Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 1.0c 09/07/2015
>> RIP: 0010:usercopy_abort+0x6d/0x80
>> Code: 4c 0f 44 d0 41 53 48 c7 c0 1c e9 13 82 48 c7 c6 71 62 13 82 48 0f
>> 45 f0 48 89 f9 48 c7 c7 f0 6b 1b 82 4c 89 d2 e8 63 2b df ff <0f> 0b 49
>> c7 c1 44 c8 14 82 4d 89 cb 4d 89 c8 eb a5 66 90 f3 0f 1e
>> RSP: 0018:ffffc90006dfba88 EFLAGS: 00010246
>> RAX: 000000000000006a RBX: 000000000007a000 RCX: 0000000000000000
>> RDX: 0000000000000000 RSI: ffff88885fd1d880 RDI: ffff88885fd1d880
>> RBP: 000000000007a000 R08: 0000000000000000 R09: c0000000ffffdfff
>> R10: 0000000000000001 R11: ffffc90006dfb930 R12: 0000000000000001
>> R13: ffff8882b7bbed12 R14: ffff88827a375830 R15: ffff8882b7b44d12
>> FS: 00007fb24c859500(0000) GS:ffff88885fd00000(0000)
>> knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 000055c2bcf9eb00 CR3: 000000028956c005 CR4: 00000000001706e0
>> Call Trace:
>> <TASK>
>> ? kdb_main_loop+0x32c/0xa10
>> ? kdb_stub+0x216/0x420
>> more>
>>
>> You can see more detail in ceph tracker
>> https://tracker.ceph.com/issues/64471.
>
> Where is the full backtrace? Above contains only the backtrace of kdb.
>
Hi David,
The bad news is that there is no more backtrace. All the failures we hit
look similar to the following logs:
> That link also contains:
>
> Entering kdb (current=0xffff9115d14fb980, pid 61925) on processor 5
> Oops: (null)
> due to oops @ 0xfffffffface3a1d2
> CPU: 5 PID: 61925 Comm: ld Kdump: loaded Not tainted
> 5.14.0-421.el9.x86_64 #1
> Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 2.0 12/17/2015
> RIP: 0010:usercopy_abort+0x74/0x76
> Code: 14 74 ad 51 48 0f 44 d6 49 c7 c3 cb 9f 73 ad 4c 89 d1 57 48 c7
> c6 60 83 75 ad 48 c7 c7 00 83 75 ad 49 0f 44 f3 e8 1b 3b ff ff <0f> 0b
> 0f b6 d3 4d 89 e0 48 89 e9 31 f6 48 c7 c7 7f 83 75 ad e8 73
> RSP: 0018:ffffbb97c16af8d0 EFLAGS: 00010246
> RAX: 0000000000000072 RBX: 0000000000000112 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffff911d1fd60840 RDI: ffff911d1fd60840
> RBP: 0000000000004000 R08: 80000000ffff84b4 R09: 0000000000ffff0a
> R10: 0000000000000004 R11: 0000000000000076 R12: ffff9115c0be8b00
> R13: 0000000000000001 R14: ffff911665df9f68 R15: ffff9115d16be112
> FS: 00007ff20442eb80(0000) GS:ffff911d1fd40000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007ff20446142d CR3: 00000001215ec003 CR4: 00000000003706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> ? show_trace_log_lvl+0x1c4/0x2df
> more>
>
>
> Don't we have more information about the calltrace somewhere? (or a
> reproducer?)
There is no reproducer, and the failing test cases are different each
time, so the failure seems to be random.
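
For reference, the numbers in the first usercopy message can be
sanity-checked with a small userspace sketch. This is only a simplified
model of a "does the copy stay inside the slab object" test, not the
actual code in mm/usercopy.c or the slab allocator; the helper name
copy_fits_in_object is made up for illustration, and the real check may
take more into account than the object size.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): returns true if copying
 * n bytes starting at `offset` inside an object of `object_size` bytes
 * stays within that object. Written so that offset + n cannot overflow.
 */
static bool copy_fits_in_object(size_t object_size, size_t offset, size_t n)
{
	if (offset > object_size)
		return false;
	return n <= object_size - offset;
}
```

With the values from the first report, copy_fits_in_object(192, 82,
499712) returns false: only 110 bytes remain in a 192-byte object after
offset 82. Note that 499712 is 0x7a000, which matches RBX and RBP in
that register dump.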
Thanks
- Xiubo