Message-ID: <2a3eb48d-6ca1-61c6-20cf-ba2fbda21f45@nvidia.com>
Date:   Tue, 11 Aug 2020 04:19:47 -0700
From:   John Hubbard <jhubbard@...dia.com>
To:     Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>,
        David Sterba <dsterba@...e.com>, <linux-btrfs@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
CC:     linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: btrfs crash in kobject_del while running xfstest

Somehow the copy-paste of Chris Mason's name failed (user error
on my end), sorry about that Chris!

On 8/11/20 4:17 AM, John Hubbard wrote:
> Hi,
> 
> Here's an early warning of a possible problem.
> 
> I'm seeing a new btrfs crash when running xfstests, as of
> 00e4db51259a5f936fec1424b884f029479d3981 ("Merge tag
> 'perf-tools-2020-08-10' of
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux") in linux.git.
> 
> This doesn't crash in v5.8, so I attempted to bisect, but ended up with
> the net-next merge commit as the offending one: commit
> 47ec5303d73ea344e84f46660fff693c57641386 ("Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next"), which
> doesn't really help because it's 2088 files changed, of course.
> 
> I'm attaching the .config that I used.
> 
> This is easily reproducible via something like (change to match your setup,
> of course):
> 
>      sudo TEST_DEV=/dev/nvme0n1p8 TEST_DIR=/xfstest_btrfs \
>        SCRATCH_DEV=/dev/nvme0n1p9 SCRATCH_MNT=/xfstest_scratch  ./check \
>        btrfs/002
> 
> which leads to:
> 
> [  586.097360] BTRFS info (device nvme0n1p8): disk space caching is enabled
> [  586.103232] BTRFS info (device nvme0n1p8): has skinny extents
> [  586.115169] BTRFS info (device nvme0n1p8): enabling ssd optimizations
> [  586.308264] BTRFS: device fsid 5dfff89d-8f8d-42ac-8538-acb95164d0be devid 1 transid 5 /dev/nvme0n1p9 scanned by mkfs.btrfs (6374)
> [  586.342776] BTRFS info (device nvme0n1p9): disk space caching is enabled
> [  586.348585] BTRFS info (device nvme0n1p9): has skinny extents
> [  586.353413] BTRFS info (device nvme0n1p9): flagging fs with big metadata feature
> [  586.368129] BTRFS info (device nvme0n1p9): enabling ssd optimizations
> [  586.373996] BTRFS info (device nvme0n1p9): checking UUID tree
> [  586.387449] BUG: kernel NULL pointer dereference, address: 0000000000000018
> [  586.393485] #PF: supervisor read access in kernel mode
> [  586.397623] #PF: error_code(0x0000) - not-present page
> [  586.401763] PGD 0 P4D 0
> [  586.403219] Oops: 0000 [#1] SMP PTI
> [  586.405650] CPU: 1 PID: 6405 Comm: umount Not tainted 5.8.0-hubbard-github+ #171
> [  586.412118] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./X99-UD3P-CF, BIOS F1 02/10/2015
> [  586.421360] RIP: 0010:kobject_del+0x1/0x20
> [  586.424427] Code: 48 c7 43 18 00 00 00 00 5b 5d c3 c3 be 01 00 00 00 48 89 df e8 60 1b 00 00 eb c9 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 55 <48> 8b 6f 18 e8 86 ff ff ff 48 89 ef 5d e9 cd fe ff ff 66 66 2e 0f
> [  586.442644] RSP: 0018:ffffc90009ef7e08 EFLAGS: 00010246
> [  586.446914] RAX: 0000000000000000 RBX: ffff888896080000 RCX: 0000000000000006
> [  586.453149] RDX: ffff88888ee4b000 RSI: ffffffff82669a00 RDI: 0000000000000000
> [  586.459390] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
> [  586.465631] R10: 0000000000000001 R11: 0000000000000000 R12: ffff888896080000
> [  586.471866] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [  586.478106] FS:  00007f5595739c80(0000) GS:ffff88889fc40000(0000) knlGS:0000000000000000
> [  586.485325] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  586.490129] CR2: 0000000000000018 CR3: 0000000896d5a006 CR4: 00000000001706e0
> [  586.496372] Call Trace:
> [  586.497807]  btrfs_sysfs_del_qgroups+0xa5/0xe0 [btrfs]
> [  586.502017]  close_ctree+0x1c5/0x2b6 [btrfs]
> [  586.505307]  ? fsnotify_destroy_marks+0x24/0x124
> [  586.508948]  generic_shutdown_super+0x67/0x100
> [  586.512408]  kill_anon_super+0x14/0x30
> [  586.515159]  btrfs_kill_super+0x12/0x20 [btrfs]
> [  586.518704]  deactivate_locked_super+0x36/0x90
> [  586.522159]  cleanup_mnt+0x12d/0x190
> [  586.524720]  task_work_run+0x5c/0xa0
> [  586.527285]  exit_to_user_mode_loop+0xb9/0xc0
> [  586.530648]  exit_to_user_mode_prepare+0xab/0xe0
> [  586.534276]  syscall_exit_to_user_mode+0x17/0x50
> [  586.537908]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  586.541984] RIP: 0033:0x7f55959896fb
> [  586.544531] Code: 07 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 5d 07 0c 00 f7 d8 64 89 01 48
> [  586.562775] RSP: 002b:00007fffcc431228 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> [  586.569485] RAX: 0000000000000000 RBX: 00007f5595ab31e4 RCX: 00007f55959896fb
> [  586.575753] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005601fb16bb80
> [  586.582020] RBP: 00005601fb16b970 R08: 0000000000000000 R09: 00007fffcc42ffa0
> [  586.588278] R10: 00005601fb16c930 R11: 0000000000000246 R12: 00005601fb16bb80
> [  586.594534] R13: 0000000000000000 R14: 00005601fb16ba68 R15: 0000000000000000
> [  586.600805] Modules linked in: xfs rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache bpfilter dm_mirror dm_region_hash dm_log dm_mod iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul btrfs ghash_clmulni_intel aesni_intel blake2b_generic crypto_simd xor cryptd zstd_compress glue_helper input_leds raid6_pq libcrc32c lpc_ich i2c_i801 mfd_core mei_me i2c_smbus mei rpcrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi ib_srpt target_core_mod ib_srp ib_ipoib rdma_ucm ib_uverbs ib_umad sr_mod cdrom sd_mod nouveau ahci libahci nvme crc32c_intel video e1000e led_class nvme_core libata t10_pi ttm mxm_wmi wmi fuse
> [  586.661098] CR2: 0000000000000018
> [  586.663455] ---[ end trace 158f42d646f4715d ]---
> 
> A quick peek shows that this is crashing here:
> 
> void kobject_del(struct kobject *kobj)
> {
>      struct kobject *parent = kobj->parent; <---- CRASHES HERE with NULL kobj
> 
>      __kobject_del(kobj);
>      kobject_put(parent);
> }
> EXPORT_SYMBOL(kobject_del);
> 
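> The faulting address of 0x18 is consistent with a NULL kobj being passed in (RDI is 0
> in the register dump above), because 0x18 is the offset of ->parent within struct kobject,
> and the disassembly confirms that the load from 0x18(%rdi) happens right at kobject_del+0x1: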
> 
> Dump of assembler code for function kobject_del:
>     0xffffffff81534ec0 <+0>:     push   %rbp
>     0xffffffff81534ec1 <+1>:     mov    0x18(%rdi),%rbp
>     0xffffffff81534ec5 <+5>:     callq  0xffffffff81534e50 <__kobject_del>
>     0xffffffff81534eca <+10>:    mov    %rbp,%rdi
>     0xffffffff81534ecd <+13>:    pop    %rbp
>     0xffffffff81534ece <+14>:    jmpq   0xffffffff81534da0 <kobject_put>
> End of assembler dump.
> 
> But as for how we ended up with a null kobj here, that's actually hard to see, at least
> for a non-btrfs person, which is why I hoped git bisect would help more than it did here.
> 
> 
> thanks,

thanks,
-- 
John Hubbard
NVIDIA
