lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aIgUUhKWesDpM0BJ@xsang-OptiPlex-9020>
Date: Tue, 29 Jul 2025 08:22:42 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: Peter Zijlstra <peterz@...radead.org>, <oe-lkp@...ts.linux.dev>,
	<lkp@...el.com>, <linux-kernel@...r.kernel.org>, <x86@...nel.org>, "Sebastian
 Andrzej Siewior" <bigeasy@...utronix.de>, <linux-mm@...ck.org>,
	<ltp@...ts.linux.it>, <oliver.sang@...el.com>
Subject: Re: [tip:locking/futex] [futex]  56180dd20c:
 BUG:sleeping_function_called_from_invalid_context_at_kernel/nsproxy.c

hi, Thomas Gleixner,

On Fri, Jul 25, 2025 at 05:45:04PM +0200, Thomas Gleixner wrote:
> Oliver!
> 
> On Fri, Jul 25 2025 at 17:10, Oliver Sang wrote:
> > On Wed, Jul 23, 2025 at 07:22:43PM +0200, Thomas Gleixner wrote:
> >> > [  286.673775][   C97] BUG: sleeping function called from invalid context at kernel/nsproxy.c:233   <---- (1)
> >> > [  286.673784][   C97] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6748, name: oom03
> >> > [  286.673787][   C97] preempt_count: 7ffffffe, expected: 0
> >> 
> >> Ooops. That's a corrupted preempt counter, which has underflown twice.
> >> 
> >> Can you please enable CONFIG_DEBUG_PREEMPT, so we can see where this
> >> happens?
> >
> > after enable CONFIG_DEBUG_PREEMPT, the config is as attached
> > config-6.16.0-rc5-00002-g56180dd20c19
> 
> Thank you for trying, but I just realized that it tells us only when it
> underflows, but we don't see where the actual extra decrement happens
> before that.
> 
> Can you please enable CONFIG_PREEMPT_TRACER and add
> 
> 'trace_event=preemptirq:preempt_disable,preemptirq:preempt_enable ftrace_dump_on_oops'
> 
> to the kernel command line. The latter will dump the recorded
> preempt_enable/disable events and we can pinpoint the function which is
> responsible for that.

the config with CONFIG_PREEMPT_TRACER enabled is attached as
config-6.16.0-rc5-00002-g56180dd20c19

by this config, we still see random
WARNING:at_kernel/sched/core.c:#preempt_count_sub

=========================================================================================
compiler/kconfig/rootfs/tbox_group/test/testcase:
  gcc-12/x86_64-rhel-9.4-ltp_+CONFIG_PREEMPT_TRACER/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/mm-oom/ltp

a255b78d14324f8a 56180dd20c19e5b0fa34822997a
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :20          60%          12:20    dmesg.RIP:preempt_count_sub
           :20          60%          12:20    dmesg.WARNING:at_kernel/sched/core.c:#preempt_count_sub

but bot seems not be able to capture other new dmesg stats.
I attached one dmesg as example dmesg-CONFIG_PREEMPT_TRACER.xz

since there are other 11 instances, if there should be some pattern for the dump
you mentioned, could you tell me? I will search in other instances if nothing
in the dmesg I attached. thanks


Decompressing Linux... No EFI environment detected.
Parsing ELF... No relocation needed... done.
Booting the kernel (entry_offset: 0x0000000003103444).
[    0.000000][    T0] Linux version 6.16.0-rc5-00002-g56180dd20c19 (kbuild@...d51ef74a4) (gcc-12 (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Mon Jul 28 14:54:17 CST 2025
[    0.000000][    T0] Command line: ip=::::lkp-skl-fpga01::dhcp root=/dev/ram0 RESULT_ROOT=/result/ltp/mm-oom/lkp-skl-fpga01/debian-12-x86_64-20240206.cgz/x86_64-rhel-9.4-ltp_+CONFIG_PREEMPT_TRACER/gcc-12/56180dd20c19e5b0fa34822997a9ac66b517e7b3/16 BOOT_IMAGE=/pkg/linux/x86_64-rhel-9.4-ltp_+CONFIG_PREEMPT_TRACER/gcc-12/56180dd20c19e5b0fa34822997a9ac66b517e7b3/vmlinuz-6.16.0-rc5-00002-g56180dd20c19 branch=tip/locking/futex job=/lkp/jobs/scheduled/lkp-skl-fpga01/ltp-mm-oom-debian-12-x86_64-20240206.cgz-56180dd20c19-20250728-4057047-16v2vnz-7.yaml user=lkp ARCH=x86_64 kconfig=x86_64-rhel-9.4-ltp_+CONFIG_PREEMPT_TRACER commit=56180dd20c19e5b0fa34822997a9ac66b517e7b3 trace_event=preemptirq:preempt_disable,preemptirq:preempt_enable ftrace_dump_on_oops ....


....


[  409.832569][ T7668] ------------[ cut here ]------------
[  409.855676][ T7668] DEBUG_LOCKS_WARN_ON(val > preempt_count())
[  409.855687][ T7668] WARNING: CPU: 54 PID: 7668 at kernel/sched/core.c:5903 preempt_count_sub+0xdf/0x1b0
[  409.871916][ T7668] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common skx_edac skx_edac_common nfit libnvdimm btrfs blake2b_generic xor zstd_compress raid6_pq x86_pkg_temp_thermal intel_powerclamp sd_mod sg coretemp kvm_intel irdma kvm ice irqbypass snd_pcm ghash_clmulni_intel gnss snd_timer ast rapl ib_uverbs snd ahci acpi_power_meter drm_client_lib intel_cstate ipmi_ssif libahci nvme mei_me soundcore drm_shmem_helper i2c_i801 intel_uncore ioatdma ib_core nvme_core pcspkr libata mei drm_kms_helper i2c_smbus lpc_ich intel_pch_thermal wmi dca ipmi_si acpi_ipmi ipmi_devintf ipmi_msghandler acpi_pad joydev binfmt_misc drm fuse loop dm_mod ip_tables
[  409.936794][ T7668] CPU: 54 UID: 0 PID: 7668 Comm: oom05 Not tainted 6.16.0-rc5-00002-g56180dd20c19 #1 PREEMPT(voluntary) 
[  409.948414][ T7668] RIP: 0010:preempt_count_sub+0xdf/0x1b0
[  409.954478][ T7668] Code: 11 38 d0 7c 08 84 d2 0f 85 c4 00 00 00 8b 15 08 b6 e2 04 85 d2 75 b6 48 c7 c6 60 1e 2d 84 48 c7 c7 a0 1e 2d 84 e8 a1 f1 f2 ff <0f> 0b eb 9f 4c 8b 65 08 4c 89 e7 e8 11 4f 0e 00 85 c0 74 0a 31 ff
[  409.975094][ T7668] RSP: 0018:ffffc9003733f938 EFLAGS: 00010286
[  409.981634][ T7668] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[  409.990088][ T7668] RDX: 0000000000000002 RSI: 0000000000000004 RDI: 0000000000000001
[  409.998542][ T7668] RBP: ffffc9003733f948 R08: 0000000000000001 R09: ffffed12f53c5839
[  410.006983][ T7668] R10: ffff8897a9e2c1cb R11: 0000000000000001 R12: ffff888187f5a1c0
[  410.015428][ T7668] R13: ffffc9003733fbd0 R14: 00000000003d0f00 R15: 0000000000001df4
[  410.023855][ T7668] FS:  00007f33a7dbd740(0000) GS:ffff889822a6e000(0000) knlGS:0000000000000000
[  410.033269][ T7668] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  410.040336][ T7668] CR2: 00007f279d5ffd58 CR3: 000000016ccaa001 CR4: 00000000007726f0
[  410.048798][ T7668] PKRU: 55555554
[  410.052841][ T7668] Call Trace:
[  410.056636][ T7668]  <TASK>
[  410.060076][ T7668]  _raw_spin_unlock+0x19/0x70
[  410.065274][ T7668]  copy_process+0x4244/0x4ab0
[  410.070464][ T7668]  ? 0xffffffff81000000
[  410.075139][ T7668]  ? __pfx_copy_process+0x10/0x10
[  410.080681][ T7668]  ? _inline_copy_from_user+0x4f/0xb0
[  410.086572][ T7668]  ? copy_clone_args_from_user+0xff/0x670
[  410.092816][ T7668]  ? preempt_count_sub+0x109/0x1b0
[  410.098458][ T7668]  kernel_clone+0xb6/0x7b0
[  410.103406][ T7668]  ? __pfx_kernel_clone+0x10/0x10
[  410.108959][ T7668]  ? _copy_to_user+0x5c/0x70
[  410.114081][ T7668]  ? __x64_sys_rt_sigprocmask+0x183/0x230
[  410.120346][ T7668]  __do_sys_clone3+0x150/0x1b0
[  410.125648][ T7668]  ? __pfx___do_sys_clone3+0x10/0x10
[  410.131479][ T7668]  ? trace_event_buffer_commit+0x191/0x470
[  410.137824][ T7668]  ? trace_event_buffer_reserve+0x200/0x330
[  410.144248][ T7668]  ? trace_event_buffer_commit+0x191/0x470
[  410.150566][ T7668]  ? trace_event_raw_event_preemptirq_template+0x12c/0x1b0
[  410.158279][ T7668]  ? __pfx_trace_event_raw_event_preemptirq_template+0x10/0x10
[  410.166337][ T7668]  ? trace_event_raw_event_preemptirq_template+0x12c/0x1b0
[  410.174029][ T7668]  do_syscall_64+0x7f/0x2f0
[  410.179022][ T7668]  ? count_memcg_events+0x2a7/0x470
[  410.184693][ T7668]  ? count_memcg_events+0x2a7/0x470
[  410.190351][ T7668]  ? trace_preempt_on+0x8c/0xb0
[  410.195655][ T7668]  ? count_memcg_events+0x2a7/0x470
[  410.201305][ T7668]  ? preempt_count_sub+0x109/0x1b0
[  410.206857][ T7668]  ? count_memcg_events+0x2a7/0x470
[  410.212493][ T7668]  ? handle_mm_fault+0x3ff/0x6f0
[  410.217843][ T7668]  ? do_user_addr_fault+0x91f/0xef0
[  410.223455][ T7668]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  410.229747][ T7668] RIP: 0033:0x7f33a7ec9889
[  410.234550][ T7668] Code: 31 ed e9 44 ff ff ff e8 25 e9 00 00 0f 1f 44 00 00 b8 ea ff ff ff 48 85 ff 74 2c 48 85 d2 74 27 49 89 c8 b8 b3 01 00 00 0f 05 <48> 85 c0 7c 18 74 01 c3 31 ed 48 83 e4 f0 4c 89 c7 ff d2 48 89 c7
[  410.255034][ T7668] RSP: 002b:00007ffc670703f8 EFLAGS: 00000206 ORIG_RAX: 00000000000001b3
[  410.263824][ T7668] RAX: ffffffffffffffda RBX: 00007f33a7e48ef0 RCX: 00007f33a7ec9889
[  410.272185][ T7668] RDX: 00007f33a7e48ef0 RSI: 0000000000000058 RDI: 00007ffc67070440
[  410.280503][ T7668] RBP: 00007f279d5ff6c0 R08: 00007f279d5ff6c0 R09: 00007ffc67070537
[  410.288823][ T7668] R10: 0000000000000008 R11: 0000000000000206 R12: ffffffffffffff78
[  410.297149][ T7668] R13: 0000000000000000 R14: 00007ffc67070440 R15: 00007f279cdff000
[  410.305468][ T7668]  </TASK>
[  410.308836][ T7668] ---[ end trace 0000000000000000 ]---
[  410.337739][ T1354] oom.h:181: TPASS: victim signalled: (9) SIGKILL
[  410.337753][ T1354] 




> 
> Thanks,
> 
>         tglx
> 
> 
> 

View attachment "config-6.16.0-rc5-00002-g56180dd20c19" of type "text/plain" (247348 bytes)

Download attachment "dmesg-CONFIG_PREEMPT_TRACER.xz" of type "application/x-xz" (103904 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ