Message-ID: <8c9eb87b-5623-730a-5cf6-72d831ef797a@alu.unizg.hr>
Date:   Mon, 21 Nov 2022 05:04:36 +0100
From:   Mirsad Goran Todorovac <mirsad.todorovac@....unizg.hr>
To:     paulmck@...nel.org
Cc:     Phillip Lougher <phillip@...ashfs.org.uk>,
        LKML <linux-kernel@...r.kernel.org>, phillip.lougher@...il.com,
        Thorsten Leemhuis <regressions@...mhuis.info>, elliott@....com
Subject: Re: BUG: BISECTED: in squashfs_xz_uncompress() (Was: RCU stalls in
 squashfs_readahead())

On 20. 11. 2022. 20:21, Paul E. McKenney wrote:

>> And what about Mr. Robert Elliott's observation about calling cond_resched()?
>>
>>> How big can these readahead sizes be? Should one of the loops include
>>> cond_resched() calls?
>>
>> That is IMHO better than allowing 21000 millisecond stalls on a core (or on several of them).
>>
>> I don't think it is correct to stay in kernel mode for more than a timer unit
>> without yielding the CPU. It creates stalls in multimedia and audio (chirps like on scratched
>> CD-ROMs). This is especially noticeable with a KASAN build.
>>
>> Since Firefox and most snaps use squashfs as a compressed ROFS, Firefox appears
>> to have performed worse than Chrome since snaps were introduced.
>>
>> IMHO, if we want something like realtime and multimedia processing (which is the specific
>> area of my research), it seems that anything trying to hold a processor for 21000 ms (21 secs)
>> is either buggy or deliberately malicious. 20 ms is quite enough work for a thread
>> in one allotted timeslot.
>>
>> I do not agree with Mr. Lougher's observation that I am thrashing my laptop. I think that
>> a system has to endure stress and torture testing. I was raised on Digital MicroVAX systems
>> on Ultrix which compiled lab at a time in memory that would today sound funny. :)
> 
> I personally think that it would be great if you were to work to decrease
> the Linux kernel's latency.  Doing so would not be fixing a regression,
> but I personally would welcome it.  Others might have different opinions,
> but please do CC me on any resulting patches.
> 
> And I will see your MicroVAX and raise you a videogame written on a
> PDP-12 whose fastest instruction executed in 1.6 microseconds (-not-
> nanoseconds!).  ;-)

I'm afraid that I would lose in Far Cry miserably if my cores
decided to all lock up for 21 secs. :-(

> You can see a couple of people playing the game on a PDP-12 in
> a computer museum: https://www.rcsri.org/collection/pdp-12/
> 
>> Besides, this is the very idea behind the MG-LRU algorithm commit: to test eviction of
>> memory pages in a system under heavy load and low on memory.
>>
>> I will probably test your commits, but now I have to do my own evening ritual, unwinding,
>> and knowledge and memory consolidation (called "sleep").
> 
> And yes, sleep is often one of the best debugging tools available.
> 
>> I appreciate your many commits on kernel.org and I hope I do not sound like
>> I think you are a village idiot :(
>>
>> I am trying to adhere to the Code of Conduct with mutual respect and politeness.
> 
> Skepticism is not necessarily a bad thing, especially given that I
> am not immune from errors and confusion.  Me, I just thought you were
> forcefully reporting the regression, so I forcefully pointed you at the
> fix for that regression.
> 
> Again, I have absolutely no objection to your improving the kernel's
> response time.

This is at present just wishful thinking, as I lack your 30 years of
experience with the kernel and the RCU update system. I am only beginning to realise
why it is more efficient than traditional locking, and IMHO it should
avoid locking up cores instead of increasing the number of complaints.

But even if the Linux kernel source were magically "memory mapped" into my
mind, I still do not see how it could be done. My Linux kernel learning curve
has not yet climbed that high, but I have no doubt that it is designed by
Intelligent Designers who are very witty people, and not village idiots ;-)

>> I know that the Linux kernel is about 30 million lines by now, and according to the security
>> experts we should expect 30,000 bugs in such a solid piece of written code (one per thousand
>> lines). Mr. Thorsten mentioned only 950 unresolved ones in the "open" list.
> 
> At least 30,000 bugs, of which we know of maybe 950.  ;-)

So I see no point in banning the kernel from screaming to the logs that it had
core stalls that needed a physical NMI to recover from, or they would potentially
last much longer.

>> Knowing all of this is difficult, but I still believe in open source and open systems
>> interconnected.
> 
> If it was easy, where would be the challenge?

AFAIK, the point I was taught in life was obedience, not overcoming challenges.

>> Of course, I always remember a proverb "Who hath despised the day of the small beginnings?"
>>
>> Hope this helps. My $0.02.
> 
> I think we are good.   ;-)

Yes, you guys do an amazing job of keeping 30 million lines of code organised
and making some sense. I will cut the small talk as I know you are a busy man.
If I make enough progress to actually produce any patches fixing these lockups and
stalls, I will be sure to include you in CC: as you requested.

Have a nice day!

Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
-- 
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union
