linux-kernel - Re: [PATCH 00/16] The new slab memory controller

Message-ID: <20191209180418.GA15797@localhost.localdomain>
Date:   Mon, 9 Dec 2019 18:04:22 +0000
From:   Roman Gushchin <guro@...com>
To:     Bharata B Rao <bharata@...ux.ibm.com>
CC:     "mhocko@...nel.org" <mhocko@...nel.org>,
        "hannes@...xchg.org" <hannes@...xchg.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Kernel Team <Kernel-team@...com>,
        "shakeelb@...gle.com" <shakeelb@...gle.com>,
        "vdavydov.dev@...il.com" <vdavydov.dev@...il.com>,
        "longman@...hat.com" <longman@...hat.com>
Subject: Re: [PATCH 00/16] The new slab memory controller

On Mon, Dec 09, 2019 at 05:26:49PM +0530, Bharata B Rao wrote:
> On Mon, Dec 09, 2019 at 02:47:52PM +0530, Bharata B Rao wrote:
> > Hi,
> > 
> > I see the below crash during early boot when I try this patchset on
> > PowerPC host. I am on new_slab.rfc.v5.3 branch.
> > 
> > BUG: Unable to handle kernel data access at 0x81030236d1814578
> > Faulting instruction address: 0xc0000000002cc314
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
> > Modules linked in: ip_tables x_tables autofs4 sr_mod cdrom usbhid bnx2x crct10dif_vpmsum crct10dif_common mdio libcrc32c crc32c_vpmsum
> > CPU: 31 PID: 1752 Comm: keyboard-setup. Not tainted 5.3.0-g9bd85fd72a0c #155
> > NIP:  c0000000002cc314 LR: c0000000002cc2e8 CTR: 0000000000000000
> > REGS: c000001e40f378b0 TRAP: 0380   Not tainted  (5.3.0-g9bd85fd72a0c)
> > MSR:  900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 44022224  XER: 00000000
> > CFAR: c0000000002c6ad4 IRQMASK: 1 
> > GPR00: c0000000000b8a40 c000001e40f37b40 c000000000ed9600 0000000000000000 
> > GPR04: 0000000000000023 0000000000000010 c000001e40f37b24 c000001e3cba3400 
> > GPR08: 0000000000000020 81030218815f4578 0000001e50220000 0000000000000030 
> > GPR12: 0000000000002200 c000001fff774d80 0000000000000000 00000001072600d8 
> > GPR16: 0000000000000000 c0000000000bbaac 0000000000000000 0000000000000000 
> > GPR20: c000001e40f37c48 0000000000000001 0000000000000000 c000001e3cba3400 
> > GPR24: c000001e40f37dd8 0000000000000000 c000000000fa0d58 0000000000000000 
> > GPR28: c000001e3a080080 c000001e32da0100 0000000000000118 0000000000000010 
> > NIP [c0000000002cc314] __mod_memcg_state+0x58/0xd0
> > LR [c0000000002cc2e8] __mod_memcg_state+0x2c/0xd0
> > Call Trace:
> > [c000001e40f37b90] [c0000000000b8a40] account_kernel_stack+0xa4/0xe4
> > [c000001e40f37bd0] [c0000000000ba4a4] copy_process+0x2b4/0x16f0
> > [c000001e40f37cf0] [c0000000000bbaac] _do_fork+0x9c/0x3e4
> > [c000001e40f37db0] [c0000000000bc030] sys_clone+0x74/0xa8
> > [c000001e40f37e20] [c00000000000bb34] ppc_clone+0x8/0xc
> > Instruction dump:
> > 4bffa7e9 2fa30000 409e007c 395efffb 3d000020 2b8a0001 409d0008 39000020 
> > e93d0718 e94d0028 7bde1f24 7d29f214 <7ca9502a> 7fff2a14 7fe9fe76 7d27fa78 
> > 
> > Looks like page->mem_cgroup_vec is allocated but not yet initialized
> > with memcg pointers when we try to access them.
> > 
> > I did get past the crash by initializing the pointers like this
> > in account_kernel_stack(),
> 
> The above is not an accurate description of the hack I showed below.
> Essentially I am making sure that I get to the memcg corresponding
> to task_struct_cachep object in the page.

Hello, Bharata!

Thank you very much for the report and the patch. It's a good catch,
and the code looks good to me. I'll include the fix in the next
version of the patchset (I can't keep it as a separate fix because of the
massive renamings/rewrites).

> 
> But that still doesn't explain why we don't hit this problem on x86.

On x86 (and arm64) we're using vmap-based stacks, so the underlying memory is
allocated directly by the page allocator, bypassing the slab allocator.
Whether that path is taken depends on CONFIG_VMAP_STACK.
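
To make the difference concrete, here is a rough user-space sketch of how
the backing allocator for a kernel stack gets chosen. It is a model, not
kernel code: struct stack_config, enum stack_backing and stack_backing_for()
are hypothetical names made up for illustration. The THREAD_SIZE vs
PAGE_SIZE comparison is an assumption based on my reading of kernel/fork.c
and is not stated elsewhere in this thread; it would be consistent with the
PAGE_SIZE=64K shown in the oops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of where a task's kernel stack memory comes from. */
enum stack_backing {
	STACK_FROM_PAGE_ALLOCATOR,	/* pages not owned by a slab cache */
	STACK_FROM_SLAB,		/* object carved out of a slab page */
};

/* Illustrative stand-ins for the build-time configuration. */
struct stack_config {
	bool vmap_stack;	/* models CONFIG_VMAP_STACK */
	size_t thread_size;	/* models THREAD_SIZE, in bytes */
	size_t page_size;	/* models PAGE_SIZE, in bytes */
};

static enum stack_backing stack_backing_for(const struct stack_config *cfg)
{
	assert(cfg != NULL);

	/* vmap-based stacks take their pages from the page allocator. */
	if (cfg->vmap_stack)
		return STACK_FROM_PAGE_ALLOCATOR;

	/*
	 * Assumption: a stack that spans at least one whole page is also
	 * allocated as whole pages rather than as a slab object.
	 */
	if (cfg->thread_size >= cfg->page_size)
		return STACK_FROM_PAGE_ALLOCATOR;

	/* A stack smaller than a page is a slab object. */
	return STACK_FROM_SLAB;
}
```

With vmap_stack set, the model always reports the page allocator. With it
cleared and a 16K stack on 64K pages, it reports the slab, which is the
case where accounting has to find the memcg of a slab object.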

Btw, thank you for looking into the patchset and trying it on powerpc.
Would you mind sharing some results?

Thank you!

Roman