lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABXGCsP-e5CjqCFztiym=zjQ=Z00uxYcQPFACEbTwjV=BHEdJQ@mail.gmail.com>
Date: Tue, 30 Apr 2024 04:00:07 +0500
From: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
To: Takashi Iwai <tiwai@...e.de>
Cc: Linux List Kernel Mailing <linux-kernel@...r.kernel.org>, linux-sound@...r.kernel.org
Subject: Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592
 decrease 30% of gaming performance

On Wed, Apr 24, 2024 at 6:42 PM Takashi Iwai <tiwai@...e.de> wrote:
>
> That said, maybe the first thing you can try would be to check who is
> actually calling the corresponding function
> (snd_timer_close_locked()).  Put a debug print or a tracing hook to
> watch out for figuring out.  If the commit was really relevant, it
> must be called very frequently and concurrently, and I don't know
> really who does it except for dmix/dsnoop.
>

It's madness.
Yes I added printk to snd_timer_close_locked and saw that function
invoked only 3 times when the system booted.
And never invoked during benchmark.

> git diff
diff --git a/Makefile b/Makefile
index 40fb2ca6fe4c..3ecff79a23b8 100644
--- a/Makefile
+++ b/Makefile
@@ -2,7 +2,7 @@
 VERSION = 6
 PATCHLEVEL = 9
 SUBLEVEL = 0
-EXTRAVERSION = -rc6
+EXTRAVERSION = -rc6-test-build
 NAME = Hurr durr I'ma ninja sloth

 # *DOCUMENTATION*
diff --git a/sound/core/timer.c b/sound/core/timer.c
index 4d2ee99c12a3..59d8e4698b0b 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -407,7 +407,7 @@ static void snd_timer_close_locked(struct
snd_timer_instance *timeri,
                                   struct device **card_devp_to_put)
 {
        struct snd_timer *timer = timeri->timer;
-
+       printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
        if (timer) {
                guard(spinlock_irq)(&timer->lock);
                timeri->flags |= SNDRV_TIMER_IFLG_DEAD;

[Tue Apr 30 03:33:25 2024] igc 0000:0a:00.0 eno1: NIC Link is Up 1000
Mbps Full Duplex, Flow Control: RX/TX
[Tue Apr 30 03:33:33 2024] rfkill: input handler disabled
[Tue Apr 30 03:33:36 2024] DEBUG: Passed snd_timer_close_locked 410
[Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM TTY layer initialized
[Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM socket layer initialized
[Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM ver 1.11
[Tue Apr 30 03:33:51 2024] systemd-journald[949]:
/var/log/journal/1b6a399fb0874de8b095a739fe2ff323/user-1000.journal:
Journal file uses a different sequence number ID, rotating.
[Tue Apr 30 03:33:52 2024] rfkill: input handler enabled
[Tue Apr 30 03:33:55 2024] DEBUG: Passed snd_timer_close_locked 410
[Tue Apr 30 03:33:55 2024] rfkill: input handler disabled
[Tue Apr 30 03:33:57 2024] input: solaar-keyboard as
/devices/virtual/input/input18
[Tue Apr 30 03:34:08 2024] DEBUG: Passed snd_timer_close_locked 410
[Tue Apr 30 03:34:11 2024] input: Noble FoKus Mystique (AVRCP) as
/devices/virtual/input/input19
[Tue Apr 30 03:34:23 2024] show_signal: 8 callbacks suppressed
[Tue Apr 30 03:34:23 2024] traps: gldriverquery[4464] general
protection fault ip:7ff83958c76f sp:7ffc464e0e00 error:0 in
libLLVM.so.18.1[7ff83920c000+3afd000]
[Tue Apr 30 03:34:25 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 5 times, consider switching to WQ_UNBOUND
[Tue Apr 30 03:34:26 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 7 times, consider switching to WQ_UNBOUND
[Tue Apr 30 03:35:27 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 11 times, consider switching to WQ_UNBOUND
[Tue Apr 30 03:36:28 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 19 times, consider switching to WQ_UNBOUND
[Tue Apr 30 03:39:33 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 35 times, consider switching to WQ_UNBOUND

I saw a similar picture with ftrace.
I had no experience with ftrace, so I am laying out everything here
for you to see if I am doing something wrong or not.

# echo "snd_timer_close_locked" > /sys/kernel/tracing/set_ftrace_filter

# echo "function" > /sys/kernel/tracing/current_tracer

# echo 1 > /sys/kernel/tracing/tracing_on

/*  Benchmark the game here  */

# echo 0 > /sys/kernel/tracing/tracing_on

# cat /sys/kernel/tracing/trace
# tracer: function
#
# entries-in-buffer/entries-written: 0/0   #P:32
#
#                                _-----=> irqs-off/BH-disabled
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |

# cat /sys/kernel/tracing/set_ftrace_filter
snd_timer_close_locked [snd_timer]

Now I am tormented by the question: how can reverting code that is not
invoked directly during a benchmark affect the benchmark result so
much?
I'm afraid that I won't be able to figure this out on my own.

-- 
Best Regards,
Mike Gavrilov.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ