lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <afbfd186-1f06-a03c-79b5-c98a12d35c75@metrics.net>
Date:   Fri, 15 Jun 2018 13:49:24 -0400
From:   Anthony DeRobertis <aderobertis@...rics.net>
To:     carmark.dlut@...il.com
Cc:     David Howells <dhowells@...hat.com>, linux-cachefs@...hat.com,
        linux-kernel@...r.kernel.org
Subject: Re: fscache: fix a kernel BUG at fs/fscache/operation.c:69!

On 02/22/2018 02:33 AM, carmark.dlut@...il.com wrote:
> From: Lei Xue <carmark.dlut@...il.com>
>
> There is a potential race in fscache operation enqueuing for reading and
> copying multiple pages from cachefiles to netfs.
> Under some heavy load system, it will happen very often.

Did anything happen with this patch? I still saw it with 4.16.12 (panic 
below, captured via netconsole) when running ffmpeg with a source on a 
cached NFSv3 share. (Actually an overlay on top of that cached NFS 
mount). Applying this patch seems to have fixed the issue.

The only weird thing I've noticed — not sure it's actually new from this 
patch — is that midway through the encode, it stopped using the cached 
version and started transferring it from the server again. But even so, 
that's far better than a panic that takes out the machine (until the 
watchdog forces a reset).

ps: please cc me on replies, not subscribed to the mailing list.

[51593.345396] FS-Cache:
[51593.345406] FS-Cache: Assertion failed
[51593.345409] FS-Cache: 0 > 0 is false
[51593.345421] ------------[ cut here ]------------
[51593.345424] kernel BUG at /build/linux-43CEzF/linux-4.16.12/fs/fscache/operation.c:68!
[51593.345430] invalid opcode: 0000 [#1] SMP PTI
[51593.345433] Modules linked in: netconsole configfs ebtable_filter ebtables devlink ip6table_filter ip6_tables iptable_filter overlay nfsv3 nfs tun cachefiles fscache binfmt_misc snd_usb_audio snd_hda_codec_hdmi snd_usbmidi_lib xts dm_crypt hid_generic usbhid snd_rawmidi snd_seq_device hid pktcdvd snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec bridge snd_hda_core stp llc intel_powerclamp snd_hwdep kvm_intel kvm snd_pcm_oss mxm_wmi snd_mixer_oss snd_pcm snd_timer irqbypass intel_cstate uhci_hcd intel_uncore r8169 pcspkr ehci_pci xhci_pci xhci_hcd wmi snd ehci_hcd i2c_i801 firewire_ohci sr_mod soundcore usbcore firewire_core crc_itu_t cdrom mii sg button asus_atk0110 lpc_ich shpchp usb_common i5500_temp i7core_edac nvidia_drm(PO) drm_kms_helper drm nvidia_modeset(PO) nvidia(PO)
[51593.345468]  ipmi_devintf ipmi_msghandler nfsd coretemp auth_rpcgss nfs_acl lockd grace sunrpc i2c_dev iTCO_wdt iTCO_vendor_support acpi_cpufreq msr loop ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 fscrypto ecb crypto_simd cryptd glue_helper aes_x86_64 dm_mod raid456 libcrc32c crc32c_generic async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid0 multipath linear raid1 raid10 md_mod sd_mod ahci libahci crc32c_intel evdev serio_raw libata scsi_mod
[51593.345493] CPU: 6 PID: 0 Comm: swapper/6 Tainted: P        W IO     4.16.0-2-amd64 #1 Debian 4.16.12-1
[51593.345495] Hardware name: System manufacturer System Product Name/SABERTOOTH X58, BIOS 1402    08/09/2012
[51593.345502] RIP: 0010:fscache_enqueue_operation+0x106/0x150 [fscache]
[51593.345504] RSP: 0018:ffff91a767383c98 EFLAGS: 00010082
[51593.345508] RAX: 0000000000000018 RBX: ffff91a7290d4d80 RCX: 0000000000000006
[51593.345510] RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff91a767396730
[51593.345512] RBP: ffff91a3359f8a20 R08: 000000000000043e R09: 0000000000000000
[51593.345515] R10: ffff91a767383c80 R11: 0000000000000001 R12: ffff91a7290d4d80
[51593.345517] R13: ffff91a7290d4e08 R14: ffff91a703575e20 R15: ffff91a7290d4e08
[51593.345520] FS:  0000000000000000(0000) GS:ffff91a767380000(0000) knlGS:0000000000000000
[51593.345522] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[51593.345524] CR2: 000055627b19b000 CR3: 000000015680a000 CR4: 00000000000006e0
[51593.345528] Call Trace:
[51593.345531]  <IRQ>
[51593.345536]  cachefiles_read_waiter+0xf8/0x150 [cachefiles]
[51593.345541]  __wake_up_common+0x76/0x170
[51593.345544]  wake_up_page_bit+0xac/0x100
[51593.345561]  mpage_end_io+0x30/0x90 [ext4]
[51593.345568]  clone_endio+0x8e/0x140 [dm_mod]
[51593.345572]  raid_end_bio_io+0x2c/0xb0 [raid10]
[51593.345576]  raid10_end_read_request+0x71/0x150 [raid10]
[51593.345581]  blk_update_request+0x78/0x2b0
[51593.345589]  scsi_end_request+0x2c/0x1e0 [scsi_mod]
[51593.345597]  scsi_io_completion+0x477/0x670 [scsi_mod]
[51593.345600]  blk_done_softirq+0xa0/0xd0
[51593.345605]  __do_softirq+0xde/0x2b4
[51593.345609]  irq_exit+0xae/0xb0
[51593.345612]  do_IRQ+0x7d/0xc0
[51593.345615]  common_interrupt+0xf/0xf
[51593.345617]  </IRQ>
[51593.345622] RIP: 0010:cpuidle_enter_state+0xa0/0x2b0
[51593.345624] RSP: 0018:ffffb3f1431c7eb0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdc
[51593.345627] RAX: ffff91a7673a18c0 RBX: 00002eec8305e1d8 RCX: 000000000000001f
[51593.345629] RDX: 00002eec8305e1d8 RSI: 000000002d9c529c RDI: 0000000000000000
[51593.345632] RBP: ffff91a7673a9948 R08: 0000000000000001 R09: 0000000000000135
[51593.345634] R10: ffffb3f1431c7e90 R11: 00000000000000a0 R12: 0000000000000003
[51593.345636] R13: ffffffff9cab2cd8 R14: 0000000000000000 R15: 00002eec83032f50
[51593.345641]  do_idle+0x183/0x1e0
[51593.345644]  cpu_startup_entry+0x6f/0x80
[51593.345649]  start_secondary+0x1a4/0x1f0
[51593.345652]  secondary_startup_64+0xa5/0xb0
[51593.345655] Code: 0b 48 c7 c7 63 d9 7d c1 e8 72 dd 2f da 48 c7 c7 71 d9 7d c1 e8 66 dd 2f da 48 63 73 44 31 d2 48 c7 c7 a0 ce 7d c1 e8 54 dd 2f da <0f> 0b 48 c7 c7 63 d9 7d c1 e8 46 dd 2f da 48 c7 c7 71 d9 7d c1
[51593.345682] RIP: fscache_enqueue_operation+0x106/0x150 [fscache] RSP: ffff91a767383c98
[51593.345688] ---[ end trace fe061122063cafae ]---
[51593.345690] Kernel panic - not syncing: Fatal exception in interrupt
[51593.345782] Kernel Offset: 0x1aa00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[51593.345786] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
[51593.345793] WARNING: CPU: 6 PID: 0 at /build/linux-43CEzF/linux-4.16.12/kernel/sched/core.c:1190 set_task_cpu+0x14d/0x160
[51593.345795] Modules linked in: netconsole configfs ebtable_filter ebtables devlink ip6table_filter ip6_tables iptable_filter overlay nfsv3 nfs tun cachefiles fscache binfmt_misc snd_usb_audio snd_hda_codec_hdmi snd_usbmidi_lib xts dm_crypt hid_generic usbhid snd_rawmidi snd_seq_device hid pktcdvd snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec bridge snd_hda_core stp llc intel_powerclamp snd_hwdep kvm_intel kvm snd_pcm_oss mxm_wmi snd_mixer_oss snd_pcm snd_timer irqbypass intel_cstate uhci_hcd intel_uncore r8169 pcspkr ehci_pci xhci_pci xhci_hcd wmi snd ehci_hcd i2c_i801 firewire_ohci sr_mod soundcore usbcore firewire_core crc_itu_t cdrom mii sg button asus_atk0110 lpc_ich shpchp usb_common i5500_temp i7core_edac nvidia_drm(PO) drm_kms_helper drm nvidia_modeset(PO) nvidia(PO)
[51593.345823]  ipmi_devintf ipmi_msghandler nfsd coretemp auth_rpcgss nfs_acl lockd grace sunrpc i2c_dev iTCO_wdt iTCO_vendor_support acpi_cpufreq msr loop ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 fscrypto ecb crypto_simd cryptd glue_helper aes_x86_64 dm_mod raid456 libcrc32c crc32c_generic async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid0 multipath linear raid1 raid10 md_mod sd_mod ahci libahci crc32c_intel evdev serio_raw libata scsi_mod
[51593.345844] CPU: 6 PID: 0 Comm: swapper/6 Tainted: P      D W IO     4.16.0-2-amd64 #1 Debian 4.16.12-1
[51593.345846] Hardware name: System manufacturer System Product Name/SABERTOOTH X58, BIOS 1402    08/09/2012
[51593.345849] RIP: 0010:set_task_cpu+0x14d/0x160
[51593.345851] RSP: 0018:ffff91a7673837f0 EFLAGS: 00010006
[51593.345853] RAX: 0000000000000200 RBX: ffff91a74e31c380 RCX: 0000000000000000
[51593.345855] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91a74e31c380
[51593.345857] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000006
[51593.345859] R10: 0000000000000007 R11: 0000000000000000 R12: 0000000000000000
[51593.345861] R13: 0000000000000000 R14: 0000000000000046 R15: 00000000000218c0
[51593.345864] FS:  0000000000000000(0000) GS:ffff91a767380000(0000) knlGS:0000000000000000
[51593.345866] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[51593.345868] CR2: 000055627b19b000 CR3: 000000015680a000 CR4: 00000000000006e0
[51593.345870] Call Trace:
[51593.345872]  <IRQ>
[51593.345875]  try_to_wake_up+0x154/0x460
[51593.345879]  autoremove_wake_function+0x11/0x50
[51593.345881]  __wake_up_common+0x76/0x170
[51593.345884]  __wake_up_common_lock+0x7c/0xc0
[51593.345888]  irq_work_run_list+0x4d/0x70
[51593.345891]  smp_irq_work_interrupt+0x32/0xc0
[51593.345894]  irq_work_interrupt+0xf/0x20
[51593.345898] RIP: 0010:panic+0x1fb/0x252
[51593.345900] RSP: 0018:ffff91a767383a48 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff09
[51593.345903] RAX: 0000000000000041 RBX: 0000000000000000 RCX: 0000000000000006
[51593.345905] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff91a767396730
[51593.345908] RBP: ffff91a767383ac0 R08: 0000000000000472 R09: 0000000000000000
[51593.345910] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[51593.345912] R13: 0000000000000000 R14: ffffffff9c80922d R15: 0000000000000004
[51593.345915]  ? irq_work_interrupt+0xa/0x20
[51593.345919]  ? panic+0x1f7/0x252
[51593.345923]  oops_end+0xac/0xc0
[51593.345927]  do_trap+0x132/0x140
[51593.345930]  do_error_trap+0x9d/0x120
[51593.345934]  ? fscache_enqueue_operation+0x106/0x150 [fscache]
[51593.345938]  ? up+0x12/0x60
[51593.345941]  ? console_unlock+0x29d/0x550
[51593.345944]  invalid_op+0x1b/0x40
[51593.345949] RIP: 0010:fscache_enqueue_operation+0x106/0x150 [fscache]

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ