Message-ID: <0672f0b7-36f5-4322-80e6-2da0f24c101b@redhat.com>
Date: Wed, 8 May 2024 19:45:23 +0200
From: David Hildenbrand <david@...hat.com>
To: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>,
 Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
 Linux Memory Management List <linux-mm@...ck.org>
Subject: Re: 6.9/BUG: Bad page state in process kswapd0 pfn:d6e840

On 08.05.24 12:16, Mikhail Gavrilov wrote:
> On Mon, Mar 18, 2024 at 2:55 PM Mikhail Gavrilov
> <mikhail.v.gavrilov@...il.com> wrote:
>>
>> Hi,
>> Today I  saw for the first time "BUG: Bad page state in process
>> kswapd0  pfn:d6e840"
>>
>> Trace:
>> BUG: Bad page state in process kswapd0  pfn:d6e840
>> page: refcount:0 mapcount:0 mapping:000000007512f4f2 index:0x2796c2c7c
>> pfn:0xd6e840
>> aops:btree_aops ino:1
>> flags: 0x17ffffe0000008(uptodate|node=0|zone=2|lastcpupid=0x3fffff)
>> page_type: 0xffffffff()
>> raw: 0017ffffe0000008 dead000000000100 dead000000000122 ffff88826d0be4c0
>> raw: 00000002796c2c7c 0000000000000000 00000000ffffffff 0000000000000000
>> page dumped because: non-NULL mapping
>> Modules linked in: uvcvideo uvc videobuf2_vmalloc videobuf2_memops
>> videobuf2_v4l2 videobuf2_common videodev rndis_host uas cdc_ether
>> usbnet usb_storage mii overlay tun uinput snd_seq_dummy snd_hrtimer
>> rfcomm nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet
>> nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
>> nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
>> nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables qrtr bnep sunrpc
>> binfmt_misc snd_usb_audio snd_usbmidi_lib snd_ump snd_rawmidi mc
>> amd_atl intel_rapl_msr intel_rapl_common snd_hda_codec_hdmi mt76x2u
>> mt7921e edac_mce_amd snd_hda_intel mt76x2_common mt7921_common
>> snd_intel_dspcfg mt76x02_usb snd_intel_sdw_acpi mt76_usb mt792x_lib
>> snd_hda_codec mt76x02_lib mt76_connac_lib btusb btrtl mt76
>> snd_hda_core btintel kvm_amd btbcm btmtk snd_hwdep mac80211 snd_seq
>> kvm vfat snd_seq_device bluetooth libarc4 fat irqbypass snd_pcm rapl
>> cfg80211 snd_timer wmi_bmof pcspkr snd i2c_piix4 k10temp rfkill
>> soundcore joydev
>>   apple_mfi_fastcharge gpio_amdpt gpio_generic loop nfnetlink zram
>> amdgpu hid_apple crct10dif_pclmul crc32_pclmul crc32c_intel
>> polyval_clmulni polyval_generic amdxcp i2c_algo_bit drm_ttm_helper ttm
>> ghash_clmulni_intel drm_exec gpu_sched drm_suballoc_helper
>> sha512_ssse3 nvme drm_buddy sha256_ssse3 sha1_ssse3 drm_display_helper
>> nvme_core sp5100_tco r8169 ccp cec realtek nvme_auth video wmi
>> ip6_tables ip_tables fuse
>> CPU: 17 PID: 268 Comm: kswapd0 Tainted: G        W    L    -------
>> ---  6.9.0-0.rc0.20240315gite5eb28f6d1af.8.fc41.x86_64+debug #1
>> Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I
>> EDGE WIFI (MS-7D73), BIOS 1.82 01/24/2024
>> Call Trace:
>>   <TASK>
>>   dump_stack_lvl+0xce/0xf0
>>   bad_page+0xd4/0x230
>>   ? __pfx_bad_page+0x10/0x10
>>   ? page_bad_reason+0x9d/0x1f0
>>   free_unref_page_prepare+0x80e/0xe00
>>   ? __pfx___mem_cgroup_uncharge_folios+0x10/0x10
>>   ? __pfx_lock_release+0x10/0x10
>>   free_unref_folios+0x26e/0x9c0
>>   ? _raw_spin_unlock_irq+0x28/0x60
>>   move_folios_to_lru+0xc0e/0xe80
>>   ? __pfx_move_folios_to_lru+0x10/0x10
>>   evict_folios+0xe5c/0x1610
>>   ? evict_folios+0x5f3/0x1610
>>   ? __pfx_lock_acquire+0x10/0x10
>>   ? __pfx_evict_folios+0x10/0x10
>>   ? rcu_is_watching+0x15/0xb0
>>   ? rcu_is_watching+0x15/0xb0
>>   ? __pfx_lock_acquire+0x10/0x10
>>   ? __pfx___might_resched+0x10/0x10
>>   ? mem_cgroup_get_nr_swap_pages+0x25/0x120
>>   try_to_shrink_lruvec+0x4d8/0x800
>>   ? rcu_is_watching+0x15/0xb0
>>   ? __pfx_try_to_shrink_lruvec+0x10/0x10
>>   ? lock_release+0x581/0xc60
>>   ? __pfx_lock_release+0x10/0x10
>>   shrink_one+0x37c/0x6f0
>>   shrink_node+0x1d60/0x3080
>>   ? shrink_node+0x1d47/0x3080
>>   ? shrink_node+0x1afa/0x3080
>>   ? __pfx_shrink_node+0x10/0x10
>>   ? pgdat_balanced+0x7b/0x1a0
>>   balance_pgdat+0x88b/0x1480
>>   ? rcu_is_watching+0x15/0xb0
>>   ? __pfx_balance_pgdat+0x10/0x10
>>   ? __switch_to+0x409/0xdd0
>>   ? __switch_to_asm+0x37/0x70
>>   ? __schedule+0x10cd/0x61d0
>>   ? __pfx_debug_object_free+0x10/0x10
>>   ? __try_to_del_timer_sync+0xe5/0x140
>>   ? __pfx_lock_release+0x10/0x10
>>   ? __pfx___might_resched+0x10/0x10
>>   ? set_pgdat_percpu_threshold+0x1c4/0x2f0
>>   ? __pfx_calculate_pressure_threshold+0x10/0x10
>>   kswapd+0x51d/0x910
>>   ? __pfx_kswapd+0x10/0x10
>>   ? __pfx_autoremove_wake_function+0x10/0x10
>>   ? lockdep_hardirqs_on+0x80/0x110
>>   ? __kthread_parkme+0xba/0x1f0
>>   ? __pfx_kswapd+0x10/0x10
>>   kthread+0x2ed/0x3c0
>>   ? _raw_spin_unlock_irq+0x28/0x60
>>   ? __pfx_kthread+0x10/0x10
>>   ret_from_fork+0x31/0x70
>>   ? __pfx_kthread+0x10/0x10
>>   ret_from_fork_asm+0x1a/0x30
>>   </TASK>
>>
>> Quick googling doesn't give a reassuring answer.
>> If it really is a hardware problem, it is unclear what the culprit is.
>> The memory was checked a year ago with memtest86 and no errors were found.
>> Considering the absolute randomness of the appearance of this bug
>> message, it may be worth ignoring it, but an unpleasant aftertaste
>> remains.
>>
>> Machine spec: https://linux-hardware.org/?probe=24b7696f8a
>> I attached below full kernel log and build config.
>>
>> --
>> Best Regards,
>> Mike Gavrilov.
> 
> Sorry for writing again, but no one answered me.
> Of course, I checked my memory again and ensured that it was fine.
> I even changed three motherboards, but this error appears once a week.
> Every time the common phrase is "BUG: Bad page state in process", but
> the process differs: kcompactd, kswapd, kworker, btrfs-transacti.
> I don’t know if this will help, but usually this happens when I run
> the “docker pull” command and a huge container is being updated.

"page dumped because: non-NULL mapping"

is the relevant bit. We are freeing a page, but page->mapping is not
NULL. IIUC, this might happen under memory pressure when reclaiming memory.
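
To illustrate what that message means, here is a simplified userspace sketch of a free-time sanity check. It is loosely modeled on the bad_page / page_bad_reason step visible in the call trace, not copied from the kernel: struct toy_page, toy_bad_reason() and toy_self_check() are hypothetical names, and the fields follow the values as printed in the dump (refcount:0 mapcount:0 mapping:...) rather than the real struct page layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct page; NOT the kernel's layout. */
struct toy_page {
    int refcount;        /* as printed in the dump: refcount:0 */
    int mapcount;        /* as printed in the dump: mapcount:0 */
    const void *mapping; /* owning address space, expected NULL at free time */
};

/*
 * Return a reason string if the page is not in a state that is safe to
 * free, or NULL if everything looks fine. The last failing check wins.
 */
const char *toy_bad_reason(const struct toy_page *p)
{
    const char *reason = NULL;

    if (p->mapcount != 0)
        reason = "nonzero mapcount";
    if (p->mapping != NULL)
        reason = "non-NULL mapping";
    if (p->refcount != 0)
        reason = "nonzero _refcount";
    return reason;
}

/* Reproduce the reported state: counts are zero, mapping is still set. */
void toy_self_check(void)
{
    static const int fake_mapping = 0; /* any non-NULL address will do */
    struct toy_page reported = {
        .refcount = 0, .mapcount = 0, .mapping = &fake_mapping,
    };

    assert(strcmp(toy_bad_reason(&reported), "non-NULL mapping") == 0);

    /* Once the mapping is cleared, the page passes the check. */
    reported.mapping = NULL;
    assert(toy_bad_reason(&reported) == NULL);
}
```

In this model the reported page fails only the mapping check, which matches the dump: refcount and mapcount are both zero, so something dropped the last reference without clearing the mapping first.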

It's weird that only you are seeing that; if it were something
"obvious", I would expect multiple reports :/

-- 
Cheers,

David / dhildenb

