lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAED-sicdkasoUwtbE5kg_UuRBswgKqTxhMR2qi-jdYokxZWJcg@mail.gmail.com>
Date:   Thu, 9 Apr 2020 19:38:51 +0200
From:   Jesper Krogh <jesper.krogh@...il.com>
To:     Jeff Layton <jlayton@...nel.org>, netdev@...r.kernel.org
Subject: Page allocation error - CephFS kernel client hang - 5.4.20 kernel.

Hi.

Jeff suggested me to post it here to se if some of you can figure this out.
The system is VM running 16GB memory 16 v-cores KVM/QEMU  client is 5.4.20
and the workload is the "bacula" filedaemon essentially reading data
off a cephfs
mountpoint only with the purpose of shipping it to the network to the
tape-library.
Concurrency is 3-5 threads reading catalogs sequentially -- nothing
else  (except
 standard distro stuff) on the host.

I have indications from a physical machine that I can reproduce it
there  - although kernel 5.2.8 and a mixed workload but same end
result.

Attached is the full kernel.log from bootup until the "crash" where the CephFS
mountpoint goes stale and cannot be recovered without a hard reboot of the VM.

Thread on Ceph-devel is here:
https://www.spinics.net/lists/ceph-devel/msg48034.html

When this happens the VM has "very high" iowait  - because reading
from Ceph essentially "only" is waiting for data to come from the
network. But it can be as high as 150+ on a machine with 16 cores and
5 active threads reading from CephFS mountpoint - It feels excessive
to me - but if it worked for me I wouldn't care about it.

Apr  5 14:20:45 wombat kernel: [10956.183074] kworker/7:0: page
allocation failure: order:0, mode:0xa20(GFP_ATOMIC),
nodemask=(null),cpuset=/,mems_allowed=0
Apr  5 14:20:45 wombat kernel: [10956.183089] CPU: 7 PID: 7499 Comm:
kworker/7:0 Not tainted 5.4.30 #5
Apr  5 14:20:45 wombat kernel: [10956.183090] Hardware name: Bochs
Bochs, BIOS Bochs 01/01/2011
Apr  5 14:20:45 wombat kernel: [10956.183141] Workqueue: ceph-msgr
ceph_con_workfn [libceph]
Apr  5 14:20:45 wombat kernel: [10956.183143] Call Trace:
Apr  5 14:20:45 wombat kernel: [10956.183146]  <IRQ>
Apr  5 14:20:45 wombat kernel: [10956.183279]  dump_stack+0x6d/0x95
Apr  5 14:20:45 wombat kernel: [10956.183284]  warn_alloc+0x10c/0x170
Apr  5 14:20:45 wombat kernel: [10956.183288]
__alloc_pages_slowpath+0xe6c/0xef0
Apr  5 14:20:45 wombat kernel: [10956.183290]
__alloc_pages_nodemask+0x2f3/0x360
Apr  5 14:20:45 wombat kernel: [10956.183295]  alloc_pages_current+0x6a/0xe0
Apr  5 14:20:45 wombat kernel: [10956.183301]  skb_page_frag_refill+0xda/0x100
Apr  5 14:20:45 wombat kernel: [10956.183310]
try_fill_recv+0x285/0x6f0 [virtio_net]
Apr  5 14:20:45 wombat kernel: [10956.183314]
virtnet_poll+0x32d/0x364 [virtio_net]
Apr  5 14:20:45 wombat kernel: [10956.183321]  net_rx_action+0x265/0x3e0
Apr  5 14:20:45 wombat kernel: [10956.183328]  __do_softirq+0xf9/0x2aa
Apr  5 14:20:45 wombat kernel: [10956.183332]  irq_exit+0xae/0xb0
Apr  5 14:20:45 wombat kernel: [10956.183335]  do_IRQ+0x59/0xe0
Apr  5 14:20:45 wombat kernel: [10956.183337]  common_interrupt+0xf/0xf
Apr  5 14:20:45 wombat kernel: [10956.183338]  </IRQ>
Apr  5 14:20:45 wombat kernel: [10956.183341] RIP:
0010:kvm_clock_get_cycles+0x1/0x20
Apr  5 14:20:45 wombat kernel: [10956.183343] Code: 89 04 10 75 d6 b8
f4 ff ff ff 5d c3 48 c1 e1 06 31 c0 48 81 c1 00 10 60 89 49 89 0c 10
eb be 66 2e 0f 1f 84 00 00 00 00 00 55 <48> 89 e5 65 48 8b 3d e4 8f 7a
78 e8 ff 0f 00 00 5d c3 0f 1f 00 66
Apr  5 14:20:45 wombat kernel: [10956.183344] RSP:
0018:ffffaf6881cb7ab8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdc
Apr  5 14:20:45 wombat kernel: [10956.183346] RAX: ffffffff8786e040
RBX: 000009f6c5a66e6c RCX: 0000000000001000
Apr  5 14:20:45 wombat kernel: [10956.183347] RDX: 0000000000000000
RSI: 0000000000001000 RDI: ffffffff88e3c0e0
Apr  5 14:20:45 wombat kernel: [10956.183348] RBP: ffffaf6881cb7ae0
R08: 000000000002241a R09: 0000000000001000
Apr  5 14:20:45 wombat kernel: [10956.183348] R10: 0000000000000000
R11: 0000000000001000 R12: 0000000000000000
Apr  5 14:20:45 wombat kernel: [10956.183349] R13: 0000000000176864
R14: ffff9fa900931fb4 R15: 0000000000000000
Apr  5 14:20:45 wombat kernel: [10956.183351]  ? kvmclock_setup_percpu+0x80/0x80
Apr  5 14:20:45 wombat kernel: [10956.183354]  ? ktime_get+0x3e/0xa0
Apr  5 14:20:45 wombat kernel: [10956.183365]  tcp_mstamp_refresh+0x12/0x40
Apr  5 14:20:45 wombat kernel: [10956.183368]  tcp_rcv_space_adjust+0x22/0x1d0
Apr  5 14:20:45 wombat kernel: [10956.183369]  tcp_recvmsg+0x28b/0xbc0
Apr  5 14:20:45 wombat kernel: [10956.183374]  ? aa_sk_perm+0x43/0x190
Apr  5 14:20:45 wombat kernel: [10956.183380]  inet_recvmsg+0x64/0xf0
Apr  5 14:20:45 wombat kernel: [10956.183384]  sock_recvmsg+0x66/0x70
Apr  5 14:20:45 wombat kernel: [10956.183392]
ceph_tcp_recvpage+0x79/0xb0 [libceph]
Apr  5 14:20:45 wombat kernel: [10956.183400]
read_partial_message+0x3c3/0x7c0 [libceph]
Apr  5 14:20:45 wombat kernel: [10956.183407]
ceph_con_workfn+0xa6a/0x23d0 [libceph]
Apr  5 14:20:45 wombat kernel: [10956.183409]  ? __switch_to_asm+0x40/0x70
Apr  5 14:20:45 wombat kernel: [10956.183410]  ? __switch_to_asm+0x34/0x70
Apr  5 14:20:45 wombat kernel: [10956.183411]  ? __switch_to_asm+0x40/0x70
Apr  5 14:20:45 wombat kernel: [10956.183413]  ? __switch_to_asm+0x34/0x70
Apr  5 14:20:45 wombat kernel: [10956.183415]  process_one_work+0x167/0x400
Apr  5 14:20:45 wombat kernel: [10956.183416]  worker_thread+0x4d/0x460
Apr  5 14:20:45 wombat kernel: [10956.183419]  kthread+0x105/0x140
Apr  5 14:20:45 wombat kernel: [10956.183420]  ? rescuer_thread+0x370/0x370
Apr  5 14:20:45 wombat kernel: [10956.183421]  ?
kthread_destroy_worker+0x50/0x50
Apr  5 14:20:45 wombat kernel: [10956.183423]  ret_from_fork+0x35/0x40
Apr  5 14:20:45 wombat kernel: [10956.183425] Mem-Info:
Apr  5 14:20:45 wombat kernel: [10956.183430] active_anon:39163
inactive_anon:39973 isolated_anon:0
Apr  5 14:20:45 wombat kernel: [10956.183430]  active_file:85386
inactive_file:3756171 isolated_file:32
Apr  5 14:21:02 wombat kernel: [10956.183430]  unevictable:0 dirty:0
writeback:0 unstable:0
Apr  5 14:21:02 wombat kernel: [10956.183430]  slab_reclaimable:81555
slab_unreclaimable:45634
Apr  5 14:21:02 wombat kernel: [10956.183430]  mapped:4433 shmem:560
pagetables:980 bounce:0
Apr  5 14:21:02 wombat kernel: [10956.183430]  free:33869 free_pcp:103
free_cma:0
Apr  5 14:21:02 wombat kernel: [10956.183433] Node 0
active_anon:156652kB inactive_anon:159892kB active_file:341544kB
inactive_file:15024684kB unevictable:0kB isolated(anon):0kB
isolated(file):128kB mapped:17732kB dirty:0kB writeback:0kB
shmem:2240kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB
writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Apr  5 14:21:02 wombat kernel: [10956.183434] Node 0 DMA free:15908kB
min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB
active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB
pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Apr  5 14:21:02 wombat kernel: [10956.183437] lowmem_reserve[]: 0 3444
15930 15930 15930
Apr  5 14:21:02 wombat kernel: [10956.183439] Node 0 DMA32
free:55284kB min:14596kB low:18244kB high:21892kB active_anon:4452kB
inactive_anon:9072kB active_file:5560kB inactive_file:3333524kB
unevictable:0kB writepending:0kB present:3653608kB managed:3588072kB
mlocked:0kB kernel_stack:740kB pagetables:32kB bounce:0kB
free_pcp:164kB local_pcp:164kB free_cma:0kB
Apr  5 14:21:02 wombat kernel: [10956.183442] lowmem_reserve[]: 0 0
12485 12485 12485
Apr  5 14:21:02 wombat kernel: [10956.183444] Node 0 Normal
free:64284kB min:171972kB low:185200kB high:198428kB
active_anon:152200kB inactive_anon:150820kB active_file:335984kB
inactive_file:11691288kB unevictable:0kB writepending:0kB
present:13107200kB managed:12793540kB mlocked:0kB kernel_stack:6684kB
pagetables:3888kB bounce:0kB free_pcp:248kB local_pcp:248kB
free_cma:0kB
Apr  5 14:21:02 wombat kernel: [10956.183447] lowmem_reserve[]: 0 0 0 0 0
Apr  5 14:21:02 wombat kernel: [10956.183448] Node 0 DMA: 1*4kB (U)
0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB
1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
Apr  5 14:21:02 wombat kernel: [10956.183455] Node 0 DMA32: 72*4kB
(MEH) 172*8kB (UMEH) 310*16kB (UH) 155*32kB (UEH) 265*64kB (UMH)
113*128kB (UMEH) 34*256kB (UE) 3*512kB (UM) 2*1024kB (M) 0*2048kB
0*4096kB = 55296kB
Apr  5 14:21:02 wombat kernel: [10956.183462] Node 0 Normal: 392*4kB
(UMEH) 792*8kB (UMEH) 258*16kB (UMH) 318*32kB (MEH) 648*64kB (MH)
10*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 64960kB
Apr  5 14:21:02 wombat kernel: [10956.183469] Node 0 hugepages_total=0
hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Apr  5 14:21:02 wombat kernel: [10956.183470] Node 0 hugepages_total=0
hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Apr  5 14:21:02 wombat kernel: [10956.183471] 3843456 total pagecache pages
Apr  5 14:21:02 wombat kernel: [10956.183473] 1281 pages in swap cache
Apr  5 14:21:02 wombat kernel: [10956.183475] Swap cache stats: add
13816, delete 12535, find 3296/6045
Apr  5 14:21:02 wombat kernel: [10956.183475] Free swap  = 4004652kB
Apr  5 14:21:02 wombat kernel: [10956.183476] Total swap = 4038652kB
Apr  5 14:21:02 wombat kernel: [10956.183476] 4194200 pages RAM
Apr  5 14:21:02 wombat kernel: [10956.183477] 0 pages HighMem/MovableOnly
Apr  5 14:21:02 wombat kernel: [10956.183477] 94820 pages reserved
Apr  5 14:21:02 wombat kernel: [10956.183478] 0 pages cma reserved
Apr  5 14:21:02 wombat kernel: [10956.183478] 0 pages hwpoisoned
Apr  5 14:25:15 wombat kernel: [11225.534971] warn_alloc: 43 callbacks
suppressed

Download attachment "kern.log" of type "application/octet-stream" (674761 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ