lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 11 Jun 2021 23:29:59 +0800
From:   Oliver Sang <oliver.sang@...el.com>
To:     Rasmus Villemoes <linux@...musvillemoes.dk>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Luis Chamberlain <mcgrof@...nel.org>,
        Jessica Yu <jeyu@...nel.org>, Borislav Petkov <bp@...en8.de>,
        Jonathan Corbet <corbet@....net>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Takashi Iwai <tiwai@...e.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com
Subject: Re: [init/initramfs.c] e7cb072eb9: invoked_oom-killer:gfp_mask=0x

hi Rasmus,

On Fri, Jun 11, 2021 at 11:20:18AM +0200, Rasmus Villemoes wrote:
> On 11/06/2021 10.48, Oliver Sang wrote:
> > hi Rasmus,
> > 
> > On Tue, Jun 08, 2021 at 09:42:58AM +0200, Rasmus Villemoes wrote:
> >> On 07/06/2021 16.44, kernel test robot wrote:
> >>>
> 
> 
> >> Also, I don't have 16G to give to a virtual machine. I tried running the
> >> bzImage with that modules.cgz under qemu with some naive parameters just
> >> to get some output [1], but other than failing because there's no rootfs
> >> to mount (as expected), I only managed to make it fail when providing
> >> too little memory (the .cgz is around 70M, decompressed about 200M -
> >> giving '-m 1G' to qemu works fine). You mention the vmalloc= argument,
> >> but I can't make the decompression fail when passing either vmalloc=128M
> >> or vmalloc=512M or no vmalloc= at all.
> > 
> > sorry about this. we also tried to follow exactly above steps to test on
> > some local machine (8G memory), but cannot reproduce. we are analyzing
> > what's the diference in our automaion run in test cluster, which reproduced
> > the issue consistently. will update you when we have findings.
> 
> OK. It's really odd that providing the VM with _more_ memory makes it
> fail (other then the obvious failure in the other direction when there's
> simply not enough memory for the unpacked initramfs itself). But
> unfortunately that also sounds like I won't be able to reproduce with
> the HW I have.
> 
> >> As an extra data point, what happens if you add initramfs_async=0 to the
> >> command line?
> > 
> > yes, we tested this before sending out the report. the issue gone
> > if initramfs_async=0 is added.
> 
> Hm. Sounds like some initcall after rootfs_initcall time must
> allocate/hog a lot of memory, perhaps with some heuristic depending on
> how much is available.
> 
> Can you try with initcall_debug=1? I think that should produce a lot of
> output, hopefully that would make it possible to see which initcalls
> have been done just prior to (or while) the initramfs unpacking hits ENOMEM.

thanks a lot for guidance!

by several jobs with initcall_debug=1, the oom happens consistently after
"calling  panel_init_module"

below is an example, and detail dmesg.xz attached:

[   15.120470] calling  img_ascii_lcd_driver_init+0x0/0x16 @ 1
[   15.122305] initcall img_ascii_lcd_driver_init+0x0/0x16 returned 0 after 128 usecs
[   15.124660] calling  panel_init_module+0x0/0x1f3 @ 1
[   19.941826] kworker/u4:0 invoked oom-killer: gfp_mask=0x40cd0(GFP_KERNEL|__GFP_COMP|__GFP_RECLAIMABLE), order=0, oom_score_adj=0
[   19.967679] CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.12.0-11533-ge7cb072eb988 #1
[   19.968362] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[   19.968362] Workqueue: events_unbound async_run_entry_fn
[   19.968362] Call Trace:
[   19.968362]  dump_stack+0x69/0x85
[   19.968362]  dump_header+0x57/0x3ff
[   19.968362]  ? lock_release+0xa8/0x2b0
[   19.968362]  ? out_of_memory+0x310/0x4d0
[   19.968362]  out_of_memory.cold+0x2b/0x6f
[   19.968362]  __alloc_pages_slowpath+0xa3a/0xbb0
[   19.968362]  __alloc_pages+0x2a6/0x320
[   19.968362]  allocate_slab+0x242/0x300
[   19.968362]  ___slab_alloc+0x1e3/0x740
[   19.968362]  ? alloc_inode+0x77/0x80
[   19.968362]  ? lock_is_held_type+0x8a/0xf0
[   19.968362]  __slab_alloc+0x27/0x60
[   19.968362]  ? __slab_alloc+0x27/0x60
[   19.968362]  ? alloc_inode+0x77/0x80
[   19.968362]  kmem_cache_alloc+0x24d/0x290
[   19.968362]  ? alloc_inode+0x77/0x80
[   19.968362]  alloc_inode+0x77/0x80
[   19.968362]  new_inode_pseudo+0xf/0x50
[   19.968362]  new_inode+0x17/0x30
[   19.968362]  ramfs_get_inode+0x35/0x150
[   19.968362]  ? _raw_spin_unlock_irqrestore+0x2d/0x60
[   19.968362]  ? trace_hardirqs_on+0x48/0xf0
[   19.968362]  ? _raw_spin_unlock_irqrestore+0x38/0x60
[   19.968362]  ? lock_is_held_type+0x8a/0xf0
[   19.968362]  ramfs_mknod+0x36/0xa0
[   19.968362]  ? ramfs_mknod+0x36/0xa0
[   19.968362]  ? trace_preempt_on+0x36/0xd0
[   19.968362]  ? d_add+0xe4/0x1c0
[   19.968362]  ? ramfs_mkdir+0x50/0x50
[   19.968362]  ramfs_create+0x20/0x30
[   19.968362]  ? ramfs_create+0x20/0x30
[   19.968362]  path_openat+0x594/0xef0
[   19.968362]  do_filp_open+0x84/0xf0
[   19.968362]  ? lock_is_held_type+0x8a/0xf0
[   19.968362]  file_open_name+0xaf/0x1b0
[   19.968362]  filp_open+0x27/0x50
[   19.968362]  do_name+0xbc/0x27c
[   19.968362]  write_buffer+0x22/0x31
[   19.968362]  flush_buffer+0x26/0x74
[   19.968362]  __gunzip+0x204/0x28a
[   19.968362]  ? decompress_method+0x3b/0x3b
[   19.968362]  ? __gunzip+0x28a/0x28a
[   19.968362]  gunzip+0x16/0x18
[   19.968362]  ? write_buffer+0x31/0x31
[   19.968362]  ? initrd_load+0x23d/0x23d
[   19.968362]  unpack_to_rootfs+0x15d/0x273
[   19.968362]  ? write_buffer+0x31/0x31
[   19.968362]  ? initrd_load+0x23d/0x23d
[   19.968362]  do_populate_rootfs+0x5f/0x13f
[   19.968362]  async_run_entry_fn+0x24/0xb0
[   19.968362]  process_one_work+0x1ee/0x540
[   19.968362]  worker_thread+0x15e/0x3c0
[   19.968362]  kthread+0x109/0x140
[   19.968362]  ? process_one_work+0x540/0x540
[   19.968362]  ? kthread_insert_work_sanity_check+0x60/0x60
[   19.968362]  ret_from_fork+0x19/0x30
[   20.217543] Mem-Info:
[   20.218400] active_anon:0 inactive_anon:0 isolated_anon:0
[   20.218400]  active_file:0 inactive_file:0 isolated_file:0
[   20.218400]  unevictable:85038 dirty:0 writeback:0
[   20.218400]  slab_reclaimable:7910 slab_unreclaimable:1402
[   20.218400]  mapped:0 shmem:0 pagetables:4 bounce:0
[   20.218400]  free:3987860 free_pcp:117 free_cma:0
[   20.226846] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:340152kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB writeback_tmp:0kB kernel_stack:464kB pagetables:16kB all_unreclaimable? yes
[   20.233380] DMA free:572kB min:2296kB low:2356kB high:2416kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:4664kB writepending:0kB present:15992kB managed:15916kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[   20.240158] lowmem_reserve[]: 0 48 15946 15946
[   20.241518] Normal free:2816kB min:2816kB low:3008kB high:3200kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:10008kB writepending:0kB present:481272kB managed:49344kB mlocked:0kB bounce:0kB free_pcp:28kB local_pcp:16kB free_cma:0kB
[   20.248350] lowmem_reserve[]: 0 0 127183 127183
[   20.249784] HighMem free:15948052kB min:512kB low:64120kB high:127728kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:325496kB writepending:0kB present:16279432kB managed:16279432kB mlocked:0kB bounce:0kB free_pcp:440kB local_pcp:208kB free_cma:0kB
[   20.257038] lowmem_reserve[]: 0 0 0 0
[   20.258272] DMA: 0*4kB 0*8kB 1*16kB (E) 1*32kB (E) 0*64kB 0*128kB 0*256kB 1*512kB (E) 0*1024kB 0*2048kB 0*4096kB = 560kB
[   20.261437] Normal: 12*4kB (UME) 4*8kB (UM) 3*16kB (UM) 2*32kB (UE) 1*64kB (U) 2*128kB (UM) 1*256kB (M) 2*512kB (ME) 1*1024kB (U) 0*2048kB 0*4096kB = 2816kB
[   20.265294] HighMem: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 0*64kB 2*128kB (UM) 2*256kB (UM) 1*512kB (M) 1*1024kB (M) 2*2048kB (M) 3892*4096kB (M) = 15948052kB
[   20.269081] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[   20.271583] 85050 total pagecache pages
[   20.272767] 4194174 pages RAM
[   20.273839] 4069858 pages HighMem/MovableOnly
[   20.275209] 108001 pages reserved
[   20.276264] Tasks state (memory values in pages):
[   20.277590] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[   20.280179] Out of memory and no killable processes...
[   20.281709] Kernel panic - not syncing: System is deadlocked on memory



> 
> Thanks,
> Rasmus

Download attachment "dmesg.xz" of type "application/x-xz" (24636 bytes)

Powered by blists - more mailing lists