lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <d337f6f52a64227599ed90455c6940f455f176ad.camel@hammerspace.com>
Date: Wed, 4 Sep 2024 19:32:51 +0000
From: Trond Myklebust <trondmy@...merspace.com>
To: "stephen@...workplumber.org" <stephen@...workplumber.org>,
	"linux-nfs@...r.kernel.org" <linux-nfs@...r.kernel.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Fw: [Bug 219228] New: KVM guest boot up with call trace with
 mounted image due to NFS server connection is not stable

On Wed, 2024-09-04 at 10:18 -0700, Stephen Hemminger wrote:
> Looks more like NFS bug than a networking bug per se.

Maybe, but there is no information in the bug report to allow anyone to
conclude either way. All we can see is that something appears to be
slower when they use NFS.
It's not possible to tell if it is an environment change, a networking
layer change, a NFS/RPC client change or even a NFS server change.

Needs more data...

> 
> Begin forwarded message:
> 
> Date: Wed, 04 Sep 2024 02:32:26 +0000
> From: bugzilla-daemon@...nel.org
> To: stephen@...workplumber.org
> Subject: [Bug 219228] New: KVM guest boot up with call trace with
> mounted image due to NFS server connection is not stable
> 
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=219228
> 
>             Bug ID: 219228
>            Summary: KVM guest boot up with call trace with mounted
> image
>                     due to NFS server connection is not stable
>            Product: Networking
>            Version: 2.5
>           Hardware: All
>                 OS: Linux
>             Status: NEW
>           Severity: normal
>           Priority: P3
>          Component: IPV4
>           Assignee: stephen@...workplumber.org
>           Reporter: xuelian.guo@...el.com
>         Regression: No
> 
> Environment:
> host kernel : 6.11.0-rc3 
> guest kernel :6.11.0-rc5
> qemu version : QEMU emulator version 9.0.92 (v9.1.0-rc2-28-
> g2eefd4fcec)
> 
> 
> Bug detail description: 
> try to boot KVM guest , there is call trace during kernel loading
> 
> Reproduce steps: 
> 1. set up nfs-server, and put guest image "centos9.img" into NFS-
> server's
> /images folder.
> 
> 2. mount NFS-server folder to local:
>    <NfS-server>:/images /share/guest_imgs nfs defaults,tcp,nolock 0 0
> 
> 3. create qcow2 image based on remote NFS-server
>     qemu-img created -b /share/guest_imgs/centos9.img -f qcow2
> centos9.qcow2 -F
> raw
> 
> 4. boot KVM guest with :
> qemu-system-x86_64 -accel kvm \
> -m 4096 \
> -smp 4 \
> -cpu host \
> -drive file=centos9.qcow2,if=none,id=virtio-disk0 \
> -device virtio-blk-pci,drive=virtio-disk0,bootindex=0 \
> -device virtio-net-pci,netdev=nic0 \
> -netdev user,id=nic0,hostfwd=tcp::10022-:22 \
> -nographic
> 
> Error log: 
> During KVM guest booting, there are call trace and Guest could boot
> up in the
> end, but boot time is long.
>  INFO: task kworker/u8:1:29 blocked for more than 122 seconds.
> [  246.987403]       Not tainted 6.11.0-rc5 #2
> [  246.989674] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this
> message.
> [  246.992986] task:kworker/u8:1    state:D stack:0     pid:29   
> tgid:29   
> ppid:2      flags:0x00004000
> [  246.996842] Workqueue: xfs-cil/dm-0 xlog_cil_push_work [xfs]
> [  247.000029] Call Trace:
> [  247.001707]  <TASK>
> [  247.003263]  __schedule+0x276/0x710
> [  247.005381]  schedule+0x27/0xa0
> [  247.007300]  xlog_state_get_iclog_space+0x102/0x2a0 [xfs]
> [  247.010298]  ? __pfx_default_wake_function+0x10/0x10
> [  247.012800]  xlog_write+0x7b/0x260 [xfs]
> [  247.015379]  ? _raw_spin_unlock+0xe/0x30
> [  *** ] (5 of 5) A start job is running for…ernel arming (3min 39s /
> no limit)
> [  247.023496]  xlog_cil_push_work+0x50e/0x6f0 [xfs]
> [  247.026329]  process_one_work+0x158/0x360
> [  247.028584]  worker_thread+0x235/0x340
> [  247.030710]  ? __pfx_worker_thread+0x10/0x10
> [  247.032997]  kthread+0xd0/0x100
> [  247.034937]  ? __pfx_kthread+0x10/0x10
> [  247.037063]  ret_from_fork+0x31/0x50
> [  247.039161]  ? __pfx_kthread+0x10/0x10
> [  247.041302]  ret_from_fork_asm+0x1a/0x30
> [  247.043479]  </TASK>
> [  247.045019] INFO: task kworker/0:3:255 blocked for more than 122
> seconds.
> [  247.047775]       Not tainted 6.11.0-rc5 #2
> [  247.049802] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this
> message.
> [  247.052903] task:kworker/0:3     state:D stack:0     pid:255  
> tgid:255  
> ppid:2      flags:0x00004000
> [  247.056403] Workqueue: xfs-sync/dm-0 xfs_log_worker [xfs]
> [  247.058880] Call Trace:
> [  247.060293]  <TASK>
> [  247.061638]  __schedule+0x276/0x710
> [  247.063332]  schedule+0x27/0xa0
> [  247.064964]  schedule_timeout+0x14e/0x160
> [  247.066778]  __wait_for_common+0x8f/0x1d0
> [  247.068569]  ? __pfx_schedule_timeout+0x10/0x10
> [  247.070332]  __flush_workqueue+0x13e/0x3f0
> [  247.072011]  xlog_cil_push_now.isra.0+0x25/0x90 [xfs]
> [  247.074185]  xlog_cil_force_seq+0x79/0x240 [xfs]
> [  247.076253]  ? rpm_suspend+0x1ad/0x5e0
> [  247.077831]  ? _raw_spin_unlock+0xe/0x30
> [  247.079442]  ? rpm_suspend+0x5b6/0x5e0
> [  247.080987]  xfs_log_force+0x7a/0x230 [xfs]
> [  247.082764]  xfs_log_worker+0x3d/0xc0 [xfs]
> [  247.084564]  process_one_work+0x158/0x360
> [  247.086075]  worker_thread+0x235/0x340
> [  247.087462]  ? __pfx_worker_thread+0x10/0x10
> [  247.088895]  kthread+0xd0/0x100
> [  247.090108]  ? __pfx_kthread+0x10/0x10
> [  247.091410]  ret_from_fork+0x31/0x50
> [  247.092665]  ? __pfx_kthread+0x10/0x10
> [  247.093932]  ret_from_fork_asm+0x1a/0x30
> [  247.095251]  </TASK>
> [  247.096232] INFO: task xfsaild/dm-0:543 blocked for more than 122
> seconds.
> [  247.099379]       Not tainted 6.11.0-rc5 #2
> [  247.101574] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this
> message.
> [  247.104908] task:xfsaild/dm-0    state:D stack:0     pid:543  
> tgid:543  
> ppid:2      flags:0x00004000
> [  247.108707] Call Trace:
> [  247.110363]  <TASK>
> [  247.111914]  __schedule+0x276/0x710
> [  247.113892]  schedule+0x27/0xa0
> [  247.115756]  schedule_timeout+0x14e/0x160
> [  247.117899]  __wait_for_common+0x8f/0x1d0
> [  247.119998]  ? __pfx_schedule_timeout+0x10/0x10
> [  247.122276]  __flush_workqueue+0x13e/0x3f0
> [  247.124457]  ? ttwu_queue_wakelist+0xd0/0xf0
> [  247.126684]  xlog_cil_push_now.isra.0+0x25/0x90 [xfs]
> [  247.129559]  xlog_cil_force_seq+0x79/0x240 [xfs]
> [  247.132238]  ? _raw_spin_unlock+0xe/0x30
> [  247.134300]  ? finish_task_switch.isra.0+0x97/0x2a0
> [  247.136675]  xfs_log_force+0x7a/0x230 [xfs]
> [  247.139251]  xfsaild_push+0x2d7/0x850 [xfs]
> [  247.141753]  ? __timer_delete_sync+0x2b/0x40
> [  247.143931]  ? schedule_timeout+0x99/0x160
> [  247.146023]  ? __pfx_process_timeout+0x10/0x10
> [  247.148238]  xfsaild+0xaf/0x140 [xfs]
> [  247.150566]  ? __pfx_xfsaild+0x10/0x10 [xfs]
> [  247.153062]  kthread+0xd0/0x100
> [  247.154785]  ? __pfx_kthread+0x10/0x10
> [  247.156715]  ret_from_fork+0x31/0x50
> [  247.158404]  ? __pfx_kthread+0x10/0x10
> [  247.160147]  ret_from_fork_asm+0x1a/0x30
> [  247.161914]  </TASK>
> [  247.163238] INFO: task in:imjournal:787 blocked for more than 123
> seconds.
> [  247.165818]       Not tainted 6.11.0-rc5 #2
> [  247.167697] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this
> message.
> [  247.170408] task:in:imjournal    state:D stack:0     pid:787  
> tgid:745  
> ppid:1      flags:0x00000002
> [  247.173404] Call Trace:
> [  247.174673]  <TASK>
> [  247.175844]  __schedule+0x276/0x710
> [  247.177441]  schedule+0x27/0xa0
> [  247.178888]  schedule_timeout+0x14e/0x160
> [  247.180562]  ? update_sd_lb_stats.constprop.0+0x66/0x270
> [  247.182469]  ___down_common+0x110/0x170
> [  247.183983]  __down_common+0x1e/0xc0
> [  247.185435]  down+0x47/0x60
> [  247.186684]  xfs_buf_lock+0x31/0xe0 [xfs]
> [  247.188553]  xfs_buf_find_lock+0x59/0x110 [xfs]
> [  247.190532]  xfs_buf_lookup.constprop.0+0xb9/0x180 [xfs]
> [  247.192666]  xfs_buf_get_map+0xea/0x5f0 [xfs]
> [  247.194510]  xfs_buf_read_map+0x58/0x2a0 [xfs]
> [  247.196315]  ? xfs_btree_read_buf_block+0xa2/0x120 [xfs]
> [  247.198282]  ? xfs_trans_add_item+0x37/0xb0 [xfs]
> [  247.200066]  xfs_trans_read_buf_map+0x119/0x300 [xfs]
> [  247.201813]  ? xfs_btree_read_buf_block+0xa2/0x120 [xfs]
> [  247.203618]  xfs_btree_read_buf_block+0xa2/0x120 [xfs]
> [  247.205404]  xfs_btree_lookup_get_block.part.0+0x81/0x1b0 [xfs]
> [  247.207331]  xfs_btree_lookup+0x102/0x550 [xfs]
> [  247.208969]  ? kmem_cache_alloc_noprof+0xdc/0x2b0
> [  247.210404]  xfs_dialloc_ag_update_inobt+0x47/0x170 [xfs]
> [  247.212187]  ? xfs_inobt_init_cursor+0x67/0xa0 [xfs]
> [  247.213808]  xfs_dialloc_ag+0x1ad/0x2e0 [xfs]
> [  247.215343]  xfs_dialloc+0x24b/0x3b0 [xfs]
> [  247.216798]  xfs_create+0x167/0x490 [xfs]
> [  247.218193]  ? xfs_dir2_format+0x4a/0x130 [xfs]
> [  247.219644]  xfs_generic_create+0x30d/0x360 [xfs]
> [  247.221129]  lookup_open.isra.0+0x4e1/0x600
> [  247.222312]  open_last_lookups+0x139/0x440
> [  247.223461]  path_openat+0x88/0x290
> [  247.224451]  do_filp_open+0xae/0x150
> [  247.225454]  do_sys_openat2+0x96/0xd0
> [  247.226472]  __x64_sys_openat+0x57/0xa0
> [  247.227515]  do_syscall_64+0x64/0x170
> [  247.228524]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  247.229778] RIP: 0033:0x7fbcacb3e8d4
> [  247.230763] RSP: 002b:00007fbcab7fea40 EFLAGS: 00000293 ORIG_RAX:
> 0000000000000101
> [  247.232427] RAX: ffffffffffffffda RBX: 00007fbc9c000c10 RCX:
> 00007fbcacb3e8d4
> [  247.234030] RDX: 0000000000000241 RSI: 00007fbcab7feb80 RDI:
> 00000000ffffff9c
> [  247.235626] RBP: 00007fbcab7feb80 R08: 0000000000000000 R09:
> 0000000000000001
> [  247.237174] R10: 00000000000001b6 R11: 0000000000000293 R12:
> 0000000000000241
> [  247.238682] R13: 00007fbc9c000c10 R14: 0000000000000001 R15:
> 00007fbc9c015c40
> [  247.240204]  </TASK>
> [  247.242323] INFO: task updatedb:769 blocked for more than 123
> seconds.
> [  247.243697]       Not tainted 6.11.0-rc5 #2
> [  247.244694] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this
> message.
> [  247.246246] task:updatedb        state:D stack:0     pid:769  
> tgid:769  
> ppid:756    flags:0x00000002
> [  247.247995] Call Trace:
> [  247.248689]  <TASK>
> [  247.249342]  __schedule+0x276/0x710
> [  247.250192]  ? vp_notify+0x16/0x20 [virtio_pci]
> [  247.251208]  schedule+0x27/0xa0
> [  247.252006]  schedule_timeout+0x14e/0x160
> [  247.252926]  ? __blk_flush_plug+0xf5/0x150
> [  247.253854]  __wait_for_common+0x8f/0x1d0
> [  247.254726]  ? __pfx_schedule_timeout+0x10/0x10
> [  247.255678]  xfs_buf_iowait+0x1c/0xc0 [xfs]
> [  247.256797]  __xfs_buf_submit+0x131/0x1e0 [xfs]
> [  247.257933]  xfs_buf_read_map+0x11e/0x2a0 [xfs]
> [  247.259044]  ? xfs_da_read_buf+0xee/0x170 [xfs]
> [  247.260178]  xfs_trans_read_buf_map+0x119/0x300 [xfs]
> [  247.261306]  ? xfs_da_read_buf+0xee/0x170 [xfs]
> [  247.262375]  xfs_da_read_buf+0xee/0x170 [xfs]
> [  247.263430]  xfs_dir3_block_read+0x3c/0xf0 [xfs]
> [  247.264506]  xfs_dir2_block_lookup_int+0x4c/0x1d0 [xfs]
> [  247.265658]  ? xfs_bmap_last_offset+0x6b/0x110 [xfs]
> [  247.266737]  xfs_dir2_block_lookup+0x3b/0x130 [xfs]
> [  247.267795]  xfs_dir_lookup_args+0x3e/0x90 [xfs]
> [  247.268810]  xfs_dir_lookup+0x139/0x160 [xfs]
> [  247.269798]  xfs_lookup+0x94/0x160 [xfs]
> [  247.270722]  ? avc_has_perm_noaudit+0x6b/0xf0
> [  247.271570]  xfs_vn_lookup+0x78/0xb0 [xfs]
> [  247.272504]  __lookup_slow+0x81/0x130
> [  247.273242]  walk_component+0xe5/0x160
> [  247.273969]  path_lookupat+0x6e/0x1c0
> [  247.274685]  filename_lookup+0xcd/0x1c0
> [  247.275428]  ? selinux_inode_getattr+0x9a/0xc0
> [  247.276249]  ? terminate_walk+0x21/0x100
> [  247.277002]  ? xfs_vn_getattr+0x66/0x340 [xfs]
> [  247.277968]  ? cp_new_stat+0x14f/0x180
> [  247.278682]  vfs_statx+0x75/0xe0
> [  247.279332]  vfs_fstatat+0x6c/0xb0
> [  247.279989]  __do_sys_newfstatat+0x26/0x60
> [  247.280722]  ? __seccomp_filter+0x3d8/0x5b0
> [  247.281473]  ? syscall_trace_enter+0xff/0x190
> [  247.282270]  do_syscall_64+0x64/0x170
> [  247.282958]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  247.283805] RIP: 0033:0x7f57f593e13e
> [  247.284479] RSP: 002b:00007ffccb8f2bc8 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000106
> [  247.285541] RAX: ffffffffffffffda RBX: 0000564a28d7a9f0 RCX:
> 00007f57f593e13e
> [  247.286569] RDX: 00007ffccb8f2c30 RSI: 0000564a28d88fb9 RDI:
> 00000000ffffff9c
> [  247.287595] RBP: 0000564a28d88fb9 R08: 0000000000000000 R09:
> 00005649ecdb34e0
> [  247.288601] R10: 0000000000000100 R11: 0000000000000246 R12:
> 0000000000000006
> [  247.289599] R13: 00007ffccb8f2e00 R14: 0000000000000006 R15:
> 0000000000000006
> 
> Note :
> it was verified this issue was not reproduced if host kernel returned
> to
> 6.10.0-rc7.
> 

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@...merspace.com


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ