Date: Mon, 03 Jun 2024 09:08:49 -0400
From: Genes Lists <lists@...ience.com>
To: linux-kernel@...r.kernel.org, linux-ext4@...r.kernel.org
Cc: tytso@....edu, adilger.kernel@...ger.ca, snitzer@...nel.org,
 song@...nel.org,  yukuai3@...wei.com, msakai@...hat.com, axboe@...nel.dk,
 mpatocka@...hat.com,  linan122@...wei.com
Subject: 6.9.3 stable :  filesystem tasks stalled

The machine had been up for 3 days running 6.9.3.

The stalls happened while an rsync server was writing to /mnt/lv_data,
which is ext4 on md raid6 with lvmcache. The rsync data is pushed over
the network from another machine.

The same rsync job runs twice a day and worked until the 3rd day.

logs attached: dmesg, journalctl -k, ver_linux, lsblk and lspci.

Aside: This looks different from the 6.9.3 phy deadlock I experienced
on a different machine [1].

A git bisect is not practical given how infrequently this occurs.

The first 3 log entries begin with these lines:

  <TASK> 
  ? __schedule+0x3cf/0x1510
  ? sysvec_apic_timer_interrupt+0xe/0x90
  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
  ? xas_descend+0x2f/0xa0
  ? xas_load+0x41/0x50
  ? filemap_get_entry+0x72/0x140
  ? __filemap_get_folio+0x37/0x2e0
  ? __find_get_block+0x84/0x330
  ? bdev_getblk+0x22/0x70
  ? __ext4_get_inode_loc+0x132/0x4d0 [ext4 9f75ec11db44bef3511c7e45e58aac1eb2f9252d]
...

  <TASK>
  __schedule+0x3c7/0x1510
  rt_mutex_schedule+0x20/0x40
  rt_mutex_slowlock_block.constprop.0+0x40/0x170
  __rt_mutex_slowlock_locked.constprop.0+0xbd/0x2f0
  rt_mutex_lock+0x49/0x60
  rcu_boost_kthread+0xca/0x2f0
  ? __pfx_rcu_boost_kthread+0x10/0x10
...

  <TASK>
  __schedule+0x3c7/0x1510
  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
  ? update_load_avg+0x7e/0x7b0
  schedule+0x27/0xf0
  percpu_ref_switch_to_atomic_sync+0x9b/0xf0
  ? __pfx_autoremove_wake_function+0x10/0x10
  set_in_sync+0x5c/0xc0 [md_mod dee80e622cfc358bebfe77522144fac96dc4812e]
  md_check_recovery+0x26b/0x3e0 [md_mod dee80e622cfc358bebfe77522144fac96dc4812e]
  raid5d+0x59/0x710 [raid456 d2a0fd36840a461fec669ba17b3965c20926b921]
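
For anyone sifting through the attached logs, here is a minimal sketch of
how frames like the ones above could be split into fields. It assumes each
frame sits on a single line (frames wrapped by a mail client need to be
joined first) and that a leading "?" marks an address the unwinder found
on the stack but could not confirm as part of the call chain. The names
FRAME_RE, parse_frame and reliable_symbols are made up for this example
and are not part of any kernel tooling.

```python
import re

# One call-trace frame, e.g.
#   "? __schedule+0x3cf/0x1510"
#   "raid5d+0x59/0x710 [raid456 d2a0fd36...]"
FRAME_RE = re.compile(
    r"^\s*(?P<unreliable>\?\s+)?"                    # optional "?" marker
    r"(?P<symbol>[A-Za-z_][\w.]*)"                   # symbol, may contain dots
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)(?:\s+(?P<build_id>[0-9a-f]+))?\])?\s*$"
)


def parse_frame(line):
    """Return a dict for one frame line, or None if the line is not a frame."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    return {
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),          # None for built-in code
        "reliable": m.group("unreliable") is None,
    }


def reliable_symbols(trace_text):
    """List the symbols of frames that are not marked with '?'."""
    frames = (parse_frame(line) for line in trace_text.splitlines())
    return [f["symbol"] for f in frames if f and f["reliable"]]
```

Run over the third trace, reliable_symbols() keeps only the unmarked
frames, which makes the chain from raid5d down to
percpu_ref_switch_to_atomic_sync easier to read.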



[1] https://lore.kernel.org/lkml/9d189ec329cfe68ed68699f314e191a10d4b5eda.camel@sapience.com/

-- 
Gene


View attachment "lsblk.out" of type "text/plain" (3437 bytes)

View attachment "lspci.out" of type "text/plain" (2161 bytes)

View attachment "s6.dmesg" of type "text/plain" (130541 bytes)

View attachment "s6.journal" of type "text/plain" (830251 bytes)

View attachment "ver_linux.out" of type "text/plain" (2605 bytes)

View attachment "cpu.info" of type "text/plain" (1533 bytes)

Download attachment "signature.asc" of type "application/pgp-signature" (229 bytes)
