linux-kernel - Re: A Comparison of printk between upstream and linux-rt-devel

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZsjY0puWXKOoq2ed@debarbos-thinkpadt14sgen2i.remote.csb>
Date: Fri, 23 Aug 2024 14:45:38 -0400
From: Derek Barbosa <debarbos@...hat.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: pmaldek@...e.com, williams@...hat.com, john.ogness@...utronix.de,
	tglx@...utronix.de, linux-rt-users@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: A Comparison of printk between upstream and linux-rt-devel

Hi,

> > For the stock kernel, we observed an average of about 14,263 messages printed
> > within 30 seconds. The standard deviation here hovers at around 1.52 messages,
> > with a variance of 2.3 messages.
> > 
> > For the "new printk" kernel, we observed an average of about 12,693 messages
> > printed within 30 seconds. The standard devation here hovers at about 131.2  
> > messages, with a variance of 17211 messages. 
> > 
> > We observe a rough 12-13 percent overall delta between new printk and the
> > stock kernel in this test case, with a larger disparity in standard
> > deviation/variance.
> 
> I am a bit confused. You write here ~14k messages within 30secs for
> "stock" aka 6.11.0-0.rc3 and then ~12k messages for "new printk" aka
> from the latest RT tree. At the top you write "6.11.0-0.rc3 is slower on
> flush". This means something else I guess.

You are correct.

In the aforementioned test, I am purely measuring the amount of messages that
are printed out to the serial console (and are visible from monitoring said
ttyS0 console) within these parameters:

---- BEGIN test_script.sh ----

echo "Inserted at $(date +%m:%d:%T)" | tee -a /dev/ttyS0 && insmod ./kmod/kthread-noise.ko
sleep 30
echo "Removed at $(date +%m:%d:%T)" | tee - /dev/ttyS0 && rmmod kthread-noise

---- END test_script.sh ----

For the "stock" 6.11.0-0.rc3 kernel, we see 14k messages printed.

For the "new printk" aka Latest RT Tree kernel, we see 12k messages printed.

This measurement is different from the time elapsed from invoking rmmod, and
confirmation from the serial console that the worker test has finished with the
"Finishing worker test" message. I erroneously refered to this as "flushing."

To clarify, Test 3: "Pending Message Speed", is a measurement of total time
elapsed from insertion to confirmation, additionally, there is the quantitative
analysis of how long we take to print pending messages, and print the worker
confirmation message from the invocation of rmmod. 

This is where the stock (aka 6.11.0-0.rc3) kernel is slower, as mentioned in the
TLDR. While we do print around ~12-15% more messages in sheer throughput during
that inital 30 second window, the removal overhead, which comes from printing 
any pending messages, is much higher.

If you'd like, you can try the test and potentially garner a better
understanding of the behavior observed on the serial console. You will notice
that after fully saturating the buffers (which takes approximately 30 seconds),
invoking rmmod does not immediately stop the workers from printing. In the case
of the stock (aka 6.11.0-0.rc3) kernel, the command line will hang, and wait for
the remaining messages to be printed.

> Regarding your RCU stall: You stuff a _lot_ into the printk buffer. And
> then printk is forced print everything out in a single sitting. The
> whole printing job is done within a preempt_disable region so RCU has to
> wait until it is done. If printing of the whole buffer takes >60sec then
> you see your RCU stall.
> The current RT tree will do the printing from a kthread which remains
> preemptible because your printing threads don't print any errors/ panics
> for 30seconds. So RCU gets its turn.
> 
> Sebastian

This makes a lot more sense to me now. I figured I would raise this concern out
of sheer curiosity. Thanks for answering that question!

I hope my ramblings made some sense.

Cheers,