Message-ID: <e8f5b9b3-383a-c267-9ee3-f1e0da5466fc@alu.unizg.hr>
Date: Wed, 23 Nov 2022 10:14:07 +0100
From: Mirsad Goran Todorovac <mirsad.todorovac@....unizg.hr>
To: paulmck@...nel.org
Cc: Phillip Lougher <phillip@...ashfs.org.uk>,
LKML <linux-kernel@...r.kernel.org>, phillip.lougher@...il.com,
Thorsten Leemhuis <regressions@...mhuis.info>, elliott@....com
Subject: Re: BUG: BISECTED: in squashfs_xz_uncompress() (Was: RCU stalls in
squashfs_readahead())
On 22. 11. 2022. 03:07, Paul E. McKenney wrote:
>> I'm afraid that I would lose in Far Cry miserably if my cores
>> decided to all lock up for 21 secs. :-(
>
> Agreed, 21 seconds is an improvement over the earlier 60 seconds, but
> still a very long time. Me, I come from DYNIX/ptx, where the equivalent
> to the RCU CPU stall warning was 1.5 seconds. On the other hand, it
> is also the case that DYNIX/ptx had nowhere near the variety of drivers
> and subsystems, nor did it scale anywhere near as far as Linux does today.
>
> But you only need one CPU to lock up for 21 seconds to get an RCU CPU
> stall warning, not all of them. ;-)
I can recall one or two occasions when the entire X Window System was
unresponsive for so many seconds that I ended up resetting the Ubuntu box.
I have good news: the patches did not apply because they are already present
in the mainline tree:
mtodorov@...ac:~/linux/kernel/linux_stable_build_b$ git log | grep -C5 28b3ae426598
Signed-off-by: Borislav Petkov <bp@...e.de>
Reviewed-by: Kees Cook <keescook@...omium.org>
Acked-by: Florian Weimer <fweimer@...hat.com>
Link: https://lore.kernel.org/r/898932fe61db6a9d61bc2458fa2f6049f1ca9f5c.1652290558.git.luto@kernel.org
commit 28b3ae426598e722cf5d5ab9cc7038791b955a56
Author: Uladzislau Rezki <uladzislau.rezki@...y.com>
Date: Wed Feb 16 14:52:09 2022 +0100
rcu: Introduce CONFIG_RCU_EXP_CPU_STALL_TIMEOUT
mtodorov@...ac:~/linux/kernel/linux_stable_build_b$ git log | grep -C5 1045a06724f3
Somehow kernel-doc complains here about strong markup, but
we really don't need the [] so just remove that.
Signed-off-by: Johannes Berg <johannes.berg@...el.com>
commit 1045a06724f322ed61f1ffb994427c7bdbe64647
Author: Christoph Hellwig <hch@....de>
Date: Wed Jun 29 17:01:02 2022 +0200
remove CONFIG_ANDROID
mtodorov@...ac:~/linux/kernel/linux_stable_build_b$
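A side note on the check above: `git log | grep` matches the abbreviated hash anywhere in the log, including inside other commits' messages (for example in a "Fixes:" tag). A minimal sketch of a stricter check, assuming default `git log` output where header lines start with `commit` at column 0 and message bodies are indented; the function name `commits_present` is hypothetical:

```python
import re

# Header lines in default `git log` output look like "commit <40 hex digits>".
# Message bodies are indented, so anchoring at column 0 skips "Fixes:" tags.
COMMIT_RE = re.compile(r"^commit ([0-9a-f]{40})\b", re.MULTILINE)

def commits_present(log_text, prefixes):
    """Map each abbreviated hash to the full hash found in log_text, or None."""
    full_hashes = COMMIT_RE.findall(log_text)
    found = {}
    for prefix in prefixes:
        matches = [h for h in full_hashes if h.startswith(prefix.lower())]
        found[prefix] = matches[0] if matches else None
    return found

# Example input: one header line from the output above, plus an indented
# message line that merely mentions the other hash.
sample_log = (
    "commit 28b3ae426598e722cf5d5ab9cc7038791b955a56\n"
    "Author: Uladzislau Rezki\n"
    "\n"
    "    rcu: Introduce CONFIG_RCU_EXP_CPU_STALL_TIMEOUT\n"
    "    (mentions 1045a06724f3 only in the message body)\n"
)
result = commits_present(sample_log, ["28b3ae426598", "1045a06724f3"])
```

Here only the first hash is reported as present, because the second one appears only in a message body. Git itself also offers `git merge-base --is-ancestor <commit> HEAD`, which exits with status 0 when the commit is reachable from HEAD.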
>> This is at present just the wishful thinking, as I lack your 30 years of
>> experience with the kernel and RCU update system. I am only beginning to realise
>> why it is more efficient than the traditional locking, and IMHO it should
>> avoid locking up cores instead of increasing the number of complaints.
>
> Just to set the record straight, RCU does not normally lock up any of
> the cores. Instead, RCU detects that cores have been locked up.
>
> Give or take the occasional bug in RCU, of course!
Currently, I cannot be the judge of that, for I do not yet understand how the
magic of RCU works or how it is implemented. There's more homework to be done ;-)
>> But even if the Linux kernel source is magically "memory mapped" into my
>> mind, I still do not see how it could be done. My Linux kernel learning curve
>> had not yet got that up, but I have no doubts that it is designed by
>> Intelligent Designers who are very witty people, and not village idiots ;-)
>
> There is the school of thought that claims that the Linux kernel is
> driven by evolutionary forces rather than intelligent design. And as
> we all know, evolutionary forces are driven by random changes, which
> absolutely anyone could make.
Give or take: the odds that a bunch of monkeys typing at random would produce a
working Linux kernel source are about a couple of working sources in a space of
96^30,000,00 (something like 10^300,000,000). That is comparable to the
probability of the first, simplest DNA arising by chance from the amino acid
primordial soup.
(There are not that many atoms in the Universe, only about 10^82, so you would need
an awful lot of wasted multiverses with not even single-cell life and certainly no
working Linux kernels.)
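The conversion between a power of 96 and a power of ten in the estimate above can be checked with a few lines of arithmetic. This is only a sketch: it assumes 96 stands for the number of distinct characters a monkey could type, and it leaves the length of the source as a free parameter rather than guessing the intended figure. The helper names are hypothetical.

```python
import math

LOG10_96 = math.log10(96)  # about 1.982, so 96**N == 10**(1.982 * N)

def decimal_exponent(n_chars, alphabet=96):
    """Return x such that alphabet**n_chars == 10**x."""
    return n_chars * math.log10(alphabet)

def chars_for_decimal_exponent(x, alphabet=96):
    """Return the N for which alphabet**N == 10**x."""
    return x / math.log10(alphabet)

# How many random characters correspond to a search space of 10^300,000,000?
n_needed = chars_for_decimal_exponent(300_000_000)
```

Under these assumptions, a space of 10^300,000,000 corresponds to a source of roughly 151 million characters.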
> And one approach is to take a less aggressive RCU CPU stall timeout,
> say reducing from 21 seconds to (say) 15 seconds instead of all the
> way down to 20 milliseconds. This could allow you to ease into the
> latency-reduction work.
>
> Alternatively, consider that response time is a property of the
> entire system plus the environment that it runs in. So I suspect that
> the Android folks are accompanying that 20-millisecond timeout with
> some restrictions on what the on-phone workloads are permitted to do.
> Maybe ask the Android guys what those restrictions are and loosen them
> slightly, again allowing you to ease into the latency-reduction work.
Good point.
> Sometimes an NMI does get the CPUs back on track. Sometimes the RCU CPU
> stall warning is a symptom of the CPU having gotten too old and failing.
> Most often, though, it is a sign of some sort of lockup, a too-long
> RCU read-side critical section, or as Robert Elliot noted, the lack of
> a cond_resched().
>
> But please keep in mind that cond_resched() helps only in kernels built
> with CONFIG_PREEMPTION=n.
The bad news is that 6.1-rc6 is still affected by the squashfs_xz_uncompress() bug, despite having both of your fixes
(as visible in the output of the commands above -- double-checked):
[ 91.065659] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 3-.... } 6 jiffies s: 621 root: 0x8/.
[ 91.065694] rcu: blocking rcu_node structures (internal RCU debug):
[ 91.065704] Sending NMI from CPU 5 to CPUs 3:
[ 91.065721] NMI backtrace for cpu 3
[ 91.065730] CPU: 3 PID: 2829 Comm: snap-store Not tainted 6.1.0-rc6 #1
[ 91.065741] Hardware name: LENOVO 82H8/LNVNB161216, BIOS GGCN49WW 07/21/2022
[ 91.065746] RIP: 0010:__asan_load4+0x0/0xa0
[ 91.065764] Code: 9e c0 84 c0 75 e1 5d c3 cc cc cc cc 48 c1 e8 03 80 3c 10 00 75 e9 5d c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 <55> 48 89 e5 48 8b 4d 08 48 83 ff fb 77 64 eb 0f 0f 1f 00 48 b8 00
[ 91.065771] RSP: 0000:ffff8881388ef140 EFLAGS: 00000246
[ 91.065779] RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffffffff9be992fd
[ 91.065785] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffff888125500004
[ 91.065789] RBP: ffff8881388ef1e0 R08: 0000000000000001 R09: ffffed1024aa0de8
[ 91.065794] R10: ffff888125506f39 R11: ffffed1024aa0de7 R12: 0000000001067db0
[ 91.065799] R13: ffff888125500000 R14: 00000000014fe803 R15: ffff888125502112
[ 91.065804] FS: 00007fdec50ab180(0000) GS:ffff888257180000(0000) knlGS:0000000000000000
[ 91.065810] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 91.065815] CR2: 00007fdeb7cb6260 CR3: 000000011a436005 CR4: 0000000000770ee0
[ 91.065820] PKRU: 55555554
[ 91.065823] Call Trace:
[ 91.065826] <TASK>
[ 91.065829] ? lzma_main+0x37a/0x1260
[ 91.065845] lzma2_lzma+0x2b9/0x430
[ 91.065857] xz_dec_lzma2_run+0x11f/0xb90
[ 91.065867] ? __asan_load4+0x55/0xa0
[ 91.065880] xz_dec_run+0x346/0x11f0
[ 91.065892] squashfs_xz_uncompress+0x196/0x370
[ 91.065905] ? lzo_uncompress+0x400/0x400
[ 91.065913] squashfs_decompress+0x88/0xd0
[ 91.065923] squashfs_read_data+0x1e5/0x900
[ 91.065930] ? __create_object+0x4ae/0x560
[ 91.065942] ? squashfs_bio_read.isra.3+0x230/0x230
[ 91.065951] ? __kasan_kmalloc+0xb6/0xc0
[ 91.065961] ? squashfs_page_actor_init_special+0x1a6/0x210
[ 91.065972] squashfs_readahead+0xaa3/0xe80
[ 91.065985] ? squashfs_fill_page+0x190/0x190
[ 91.065993] ? __filemap_add_folio+0x3a1/0x680
[ 91.066003] ? dio_warn_stale_pagecache.part.67+0x90/0x90
[ 91.066012] read_pages+0x122/0x540
[ 91.066023] ? file_ra_state_init+0x60/0x60
[ 91.066032] ? filemap_add_folio+0xd4/0x140
[ 91.066040] ? folio_alloc+0x1b/0x50
[ 91.066051] page_cache_ra_unbounded+0x1e6/0x280
[ 91.066064] do_page_cache_ra+0x7c/0x90
[ 91.066074] page_cache_ra_order+0x393/0x400
[ 91.066087] ondemand_readahead+0x2f1/0x4e0
[ 91.066098] page_cache_async_ra+0x8b/0xa0
[ 91.066106] filemap_fault+0x742/0x1490
[ 91.066113] ? __folio_memcg_unlock+0x35/0x80
[ 91.066124] ? read_cache_page_gfp+0x90/0x90
[ 91.066132] ? filemap_map_pages+0x28e/0xc60
[ 91.066145] __do_fault+0x76/0x1b0
[ 91.066154] do_fault+0x1c6/0x680
[ 91.066163] __handle_mm_fault+0x89a/0x1310
[ 91.066173] ? copy_page_range+0x1b20/0x1b20
[ 91.066181] ? mt_find+0x189/0x330
[ 91.066190] ? mas_next_entry+0xa80/0xa80
[ 91.066204] handle_mm_fault+0x11b/0x390
[ 91.066213] do_user_addr_fault+0x258/0x860
[ 91.066225] exc_page_fault+0x64/0xf0
[ 91.066235] asm_exc_page_fault+0x27/0x30
[ 91.066245] RIP: 0033:0x7fdeb7a1e541
[ 91.066252] Code: 11 44 b8 e0 0f 11 44 b8 f0 0f 11 04 b8 48 83 c7 40 48 83 c6 f8 75 92 48 85 d2 74 2d 4c 01 d7 49 8d 04 b9 48 83 c0 10 48 f7 da <0f> 28 05 18 7d 29 00 0f 1f 84 00 00 00 00 00 0f 11 40 f0 0f 11 00
[ 91.066259] RSP: 002b:00007fff46f77b60 EFLAGS: 00010293
[ 91.066265] RAX: 000055fbe1a57c30 RBX: 00007fff46f77d18 RCX: 000055fbe1a57c20
[ 91.066270] RDX: fffffffffffffffb RSI: 0000000000000005 RDI: 0000000000000000
[ 91.066274] RBP: 000055fbe16563b0 R08: 0000000000000028 R09: 000055fbe1a57c20
[ 91.066279] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000002b
[ 91.066283] R13: 000055fbe1656408 R14: 00007fff46f77b80 R15: 0000000000000000
[ 91.066292] </TASK>
(This is apparently only visible in a CONFIG_KASAN=y build.)
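For anyone sifting through many such reports, the first line of the stall message carries the stalled CPU and the elapsed jiffies. A minimal sketch of pulling those fields out of a dmesg line follows; the regular expression is inferred from the single line in this log, not from the kernel's format strings, and the `hz` value needed to convert jiffies to milliseconds is an assumption that must come from the kernel's CONFIG_HZ (it is not stated in this report).

```python
import re

# Pattern inferred from the one stall line quoted above (an assumption).
STALL_RE = re.compile(
    r"rcu: INFO: (?P<flavor>\w+) detected expedited stalls on CPUs/tasks: "
    r"\{(?P<body>[^}]*)\}\s+(?P<jiffies>\d+) jiffies"
)

def parse_expedited_stall(line, hz=None):
    """Extract RCU flavor, stalled CPUs and elapsed jiffies from a dmesg line."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    # CPU entries look like "3-...."; keep only the leading CPU number.
    cpus = [int(n) for n in re.findall(r"\b(\d+)-", m.group("body"))]
    jiffies = int(m.group("jiffies"))
    info = {"flavor": m.group("flavor"), "cpus": cpus, "jiffies": jiffies}
    if hz:
        # hz is supplied by the caller; it is not part of the log line.
        info["milliseconds"] = jiffies * 1000 / hz
    return info

line = ("[ 91.065659] rcu: INFO: rcu_preempt detected expedited stalls on "
        "CPUs/tasks: { 3-.... } 6 jiffies s: 621 root: 0x8/.")
stall = parse_expedited_stall(line)
```

For the line above this yields flavor `rcu_preempt`, CPU 3 and 6 jiffies. Passing an illustrative `hz=250` would translate that to 24 ms, but the real value depends on how the kernel was configured.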
>> Yes, you guys do an amazing job of keeping 30 million lines of code organised
>> and making some sense. I will cut the smalltalk as I know you are a busy man.
>> If I make a progress to actually produce any patches fixing these lockups and
>> stalls, I will be sure to include you into CC: as you requested.
>
> Looking forward to seeing what you come up with!
I have a lot of homework to catch up on before I will be able to do anything
sensible. :)
Thanks,
Mirsad
--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union