Message-ID: <17bf8f85-9a9c-4d7d-add7-cd92313f73f1@suse.com>
Date: Wed, 31 Dec 2025 14:35:31 +1030
From: Qu Wenruo <wqu@...e.com>
To: Daniel J Blueman <daniel@...ra.org>
Cc: David Sterba <dsterba@...e.com>, Chris Mason <clm@...com>,
Linux BTRFS <linux-btrfs@...r.kernel.org>, linux-crypto@...r.kernel.org,
Linux Kernel <linux-kernel@...r.kernel.org>, kasan-dev@...glegroups.com,
ryabinin.a.a@...il.com
Subject: Soft tag and inline kasan triggering NULL pointer dereference, but
not for hard tag and outline mode (was Re: [6.19-rc3] xxhash invalid access
during BTRFS mount)
On 2025/12/31 13:59, Daniel J Blueman wrote:
> On Tue, 30 Dec 2025 at 17:28, Qu Wenruo <wqu@...e.com> wrote:
>> On 2025/12/30 19:26, Qu Wenruo wrote:
>>> On 2025/12/30 18:02, Daniel J Blueman wrote:
>>>> When mounting a BTRFS filesystem on 6.19-rc3 on ARM64 using xxhash
>>>> checksumming and KASAN, I see invalid access:
>>>
>>> Mind sharing the page size? aarch64 supports 3 different page
>>> sizes (4K, 16K, 64K).
>>>
>>> I'll give it a try on that branch. Although on my rc1 based development
>>> branch it looks OK so far.
>>
>> Tried both 4K and 64K page sizes with KASAN enabled, all on the 6.19-rc3
>> tag, and could not reproduce it on a newly created fs with xxhash.
>>
>> My environment is aarch64 VM on Orion O6 board.
>>
>> The xxhash implementation is the same xxhash64-generic:
>>
>> [ 17.035933] BTRFS: device fsid 260364b9-d059-410c-92de-56243c346d6d
>> devid 1 transid 8 /dev/mapper/test-scratch1 (253:2) scanned by mount (629)
>> [ 17.038033] BTRFS info (device dm-2): first mount of filesystem
>> 260364b9-d059-410c-92de-56243c346d6d
>> [ 17.038645] BTRFS info (device dm-2): using xxhash64
>> (xxhash64-generic) checksum algorithm
>> [ 17.041303] BTRFS info (device dm-2): checking UUID tree
>> [ 17.041390] BTRFS info (device dm-2): turning on async discard
>> [ 17.041393] BTRFS info (device dm-2): enabling free space tree
>> [ 19.032109] BTRFS info (device dm-2): last unmount of filesystem
>> 260364b9-d059-410c-92de-56243c346d6d
>>
>> So there may be something else involved, either related to the fs or the
>> hardware.
>
> Thanks for checking Wenruo!
>
> With KASAN_GENERIC or KASAN_HW_TAGS, I don't see "kasan:
> KernelAddressSanitizer initialized", so please ensure you are using
> KASAN_SW_TAGS, KASAN_OUTLINE and 4KB pages. Full config at
> https://gist.github.com/dblueman/cb4113f2cf880520081cf3f7c8dae13f
Thanks a lot for the detailed config.
Unfortunately, with KASAN_SW_TAGS and KASAN_INLINE the kernel can no
longer boot. It always crashes during boot with the following call trace,
so it never even reaches btrfs:
[ 3.938722]
==================================================================
[ 3.938739] BUG: KASAN: invalid-access in bpf_patch_insn_data+0x178/0x3b0
[ 3.938766] Write of size 6720 at addr 96ff80008024b120 by task systemd/1
[ 3.938772] Pointer tag: [96], memory tag: [08]
[ 3.938775]
[ 3.938791] CPU: 5 UID: 0 PID: 1 Comm: systemd Not tainted
6.19.0-rc3-custom #159 PREEMPT(voluntary)
[ 3.938801] Hardware name: QEMU KVM Virtual Machine, BIOS unknown
2/2/2022
[ 3.938805] Call trace:
[ 3.938808] show_stack+0x20/0x38 (C)
[ 3.938827] dump_stack_lvl+0x60/0x80
[ 3.938846] print_report+0x17c/0x488
[ 3.938860] kasan_report+0xbc/0x108
[ 3.938887] kasan_check_range+0x7c/0xa0
[ 3.938895] __asan_memmove+0x54/0x98
[ 3.938904] bpf_patch_insn_data+0x178/0x3b0
[ 3.938912] bpf_check+0x2720/0x49d8
[ 3.938920] bpf_prog_load+0xbd0/0x13e8
[ 3.938928] __sys_bpf+0xba0/0x2dc8
[ 3.938935] __arm64_sys_bpf+0x50/0x70
[ 3.938943] invoke_syscall.constprop.0+0x88/0x148
[ 3.938957] el0_svc_common.constprop.0+0x7c/0x148
[ 3.938964] do_el0_svc+0x38/0x50
[ 3.938970] el0_svc+0x3c/0x198
[ 3.938984] el0t_64_sync_handler+0xa0/0xe8
[ 3.938993] el0t_64_sync+0x198/0x1a0
[ 3.939001]
[ 3.939003] The buggy address belongs to a 2-page vmalloc region
starting at 0x96ff80008024b000 allocated at bpf_check+0x158/0x49d8
[ 3.939015] The buggy address belongs to the physical page:
[ 3.939026] page: refcount:1 mapcount:0 mapping:0000000000000000
index:0x0 pfn:0x10cede
[ 3.939035] flags: 0x2d600000000000(node=0|zone=2|kasantag=0xd6)
[ 3.939047] raw: 002d600000000000 0000000000000000 dead000000000122
0000000000000000
[ 3.939053] raw: 0000000000000000 0000000000000000 00000001ffffffff
0000000000000000
[ 3.939057] raw: 00000000000fffff 0000000000000000
[ 3.939060] page dumped because: kasan: bad access detected
[ 3.939064]
[ 3.939065] Memory state around the buggy address:
[ 3.939069] ffff80008024c900: 96 96 96 96 96 96 96 96 96 96 96 96 96
96 96 96
[ 3.939073] ffff80008024ca00: 96 96 96 96 96 96 96 96 96 96 96 96 96
96 96 96
[ 3.939076] >ffff80008024cb00: 08 08 08 08 08 08 fe fe fe fe fe fe fe
fe fe fe
[ 3.939079] ^
[ 3.939082] ffff80008024cc00: fe fe fe fe fe fe fe fe fe fe fe fe fe
fe fe fe
[ 3.939086] ffff80008024cd00: fe fe fe fe fe fe fe fe fe fe fe fe fe
fe fe fe
[ 3.939089]
==================================================================
[ 3.939107] Disabling lock debugging due to kernel taint
[ 3.939134] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000020
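As an aid to reading the numbers in the report above, here is a rough Python sketch that splits the reported address into its pointer tag and untagged address, then checks which 16-byte granules in the row marked '>' fall inside the write and carry a different tag. The 16-byte granule size and the helper names (split_tagged, mismatching_granules) are assumptions made for illustration; this is not kernel code.

```python
GRANULE = 16  # bytes of memory covered by one tag byte (assumed for KASAN_SW_TAGS)

def split_tagged(addr):
    """Return (pointer tag, address with the top byte reset to 0xff)."""
    tag = addr >> 56
    untagged = addr | (0xFF << 56)
    return tag, untagged

def mismatching_granules(start, size, row_base, row_tags, ptr_tag):
    """List granule addresses overlapping [start, start+size) whose tag differs."""
    end = start + size
    bad = []
    for i, mem_tag in enumerate(row_tags):
        g = row_base + i * GRANULE
        if start < g + GRANULE and g < end and mem_tag != ptr_tag:
            bad.append(g)
    return bad

# Values copied from the report: address, write size, and the '>' row.
tag, start = split_tagged(0x96FF80008024B120)
row = [0x08] * 6 + [0xFE] * 10
bad = mismatching_granules(start, 6720, 0xFFFF80008024CB00, row, tag)
print(hex(tag), hex(start + 6720), [hex(g) for g in bad])
```

Under these assumptions the write covers ffff80008024b120 up to (but not including) ffff80008024cb60, and the six granules tagged 08 are exactly its last 96 bytes; the fe granules start right where the write ends.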
Considering this only shows up with KASAN_SW_TAGS, not with HW_TAGS or the
default generic mode, I'm wondering if this is a bug in KASAN itself.
Adding the KASAN people to the thread; meanwhile I'll check more KASAN +
hardware combinations, including x86_64 (since it also uses 4K pages).
Thanks,
Qu
>
> Also ensure your mount options resolve similar to
> "rw,relatime,compress=zstd:3,ssd,discard=async,space_cache=v2,subvolid=5,subvol=/".
>
> Failing that, let me know of any significant filesystem differences from:
> # btrfs inspect-internal dump-super /dev/nvme0n1p5
> superblock: bytenr=65536, device=/dev/nvme0n1p5
> ---------------------------------------------------------
> csum_type 1 (xxhash64)
> csum_size 8
> csum 0x97ec1a3695ae35d0 [match]
> bytenr 65536
> flags 0x1
> ( WRITTEN )
> magic _BHRfS_M [match]
> fsid f99f2753-0283-4f93-8f5d-7a9f59f148cc
> metadata_uuid 00000000-0000-0000-0000-000000000000
> label
> generation 34305
> root 586579968
> sys_array_size 129
> chunk_root_generation 33351
> root_level 0
> chunk_root 19357892608
> chunk_root_level 0
> log_root 0
> log_root_transid (deprecated) 0
> log_root_level 0
> total_bytes 83886080000
> bytes_used 14462930944
> sectorsize 4096
> nodesize 16384
> leafsize (deprecated) 16384
> stripesize 4096
> root_dir 6
> num_devices 1
> compat_flags 0x0
> compat_ro_flags 0x3
> ( FREE_SPACE_TREE |
> FREE_SPACE_TREE_VALID )
> incompat_flags 0x361
> ( MIXED_BACKREF |
> BIG_METADATA |
> EXTENDED_IREF |
> SKINNY_METADATA |
> NO_HOLES )
> cache_generation 0
> uuid_tree_generation 34305
> dev_item.uuid 86166b5f-2258-4ab9-aac6-0d0e37ffbdb6
> dev_item.fsid f99f2753-0283-4f93-8f5d-7a9f59f148cc [match]
> dev_item.type 0
> dev_item.total_bytes 83886080000
> dev_item.bytes_used 22624075776
> dev_item.io_align 4096
> dev_item.io_width 4096
> dev_item.sector_size 4096
> dev_item.devid 1
> dev_item.dev_group 0
> dev_item.seek_speed 0
> dev_item.bandwidth 0
> dev_item.generation 0
>
> Thanks,
> Dan
> --
> Daniel J Blueman