Date: Tue, 16 Aug 2022 13:25:35 +0200
From: Stefan Wahren <stefan.wahren@...e.com>
To: Jan Kara <jack@...e.cz>
Cc: linux-ext4@...r.kernel.org, Ojaswin Mujoo <ojaswin@...ux.ibm.com>,
    Harshad Shirwadkar <harshadshirwadkar@...il.com>,
    Theodore Ts'o <tytso@....edu>, Ritesh Harjani <riteshh@...ux.ibm.com>,
    linux-fsdevel@...r.kernel.org,
    Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
    Geetika.Moolchandani1@....com, regressions@...ts.linux.dev,
    Florian Fainelli <f.fainelli@...il.com>
Subject: Re: [Regression] ext4: changes to mb_optimize_scan cause issues on Raspberry Pi

Hi Jan,

On 16.08.22 at 11:34, Jan Kara wrote:
> Hi Stefan!
>
> On Sat 06-08-22 11:50:28, Stefan Wahren wrote:
>> On 28.07.22 at 12:00, Jan Kara wrote:
>>> Hello!
>>>
>>> On Mon 18-07-22 15:29:47, Stefan Wahren wrote:
>>>> i noticed that since Linux 5.18 (Linux 5.19-rc6 is still affected) i'm
>>>> unable to run "rpi-update" without massive performance regression on my
>>>> Raspberry Pi 4 (multi_v7_defconfig + CONFIG_ARM_LPAE). Using Linux 5.17 this
>>>> tool successfully downloads the latest firmware (> 100 MB) on my development
>>>> micro SD card (Kingston 16 GB Industrial) with a ext4 filesystem within ~ 1
>>>> min. The same scenario on Linux 5.18 shows the following symptoms:
>>> Thanks for report and the bisection!
>>>> - download takes endlessly much time and leads to an abort by userspace in
>>>> most cases because of the poor performance
>>>> - massive system load during download even after download has been aborted
>>>> (heartbeat LED goes wild)
>>> OK, is it that the CPU is busy or are we waiting on the storage card?
>>> Observing top(1) for a while should be enough to get the idea. (sorry, I'm
>>> not very familiar with RPi so I'm not sure what heartbeat LED shows).
>> My description wasn't precise. I mean the green ACT LED, which uses the LED
>> heartbeat trigger:
>>
>> "This allows LEDs to be controlled by a CPU load average. The flash
>> frequency is a hyperbolic function of the 1-minute load average."
>>
>> I'm not sure if it's CPU or IO driven load, here the top output in bad case:
>>
>> top - 08:44:17 up 43 min, 2 users, load average: 5,02, 5,45, 5,17
>> Tasks: 142 total, 1 running, 141 sleeping, 0 stopped, 0 zombie
>> %Cpu(s): 0,4 us, 0,4 sy, 0,0 ni, 49,0 id, 50,2 wa, 0,0 hi, 0,0 si, 0,0 st
>> MiB Mem : 7941,7 total, 4563,1 free, 312,7 used, 3066,0 buff/cache
>> MiB Swap: 100,0 total, 100,0 free, 0,0 used. 7359,6 avail Mem
> OK, there's plenty of memory available, CPUs are mostly idle, the load is
> likely created by tasks waiting for IO (which also contribute to load
> despite not consuming CPU). Not much surprising here.
>
>>> Can you run "iostat -x 1" while the download is running so that we can see
>>> roughly how the IO pattern looks?
>>>
>> Here the output during download:
>>
>> Device   r/s   w/s   rkB/s  wkB/s  rrqm/s  wrqm/s  %rrqm
>> %wrqm  r_await  w_await   aqu-sz  rareq-sz  wareq-sz  svctm   %util
>> mmcblk1  0,00  2,00  0,00   36,00  0,00    0,00    0,00
>> 0,00   0,00     23189,50  46,38   0,00      18,00     500,00  100,00
>>
>> avg-cpu: %user %nice %system %iowait %steal %idle
>>          0,25  0,00  0,00    49,62   0,00   50,13
>>
>> Device   r/s   w/s   rkB/s  wkB/s  rrqm/s  wrqm/s  %rrqm
>> %wrqm  r_await  w_await   aqu-sz  rareq-sz  wareq-sz  svctm   %util
>> mmcblk1  0,00  2,00  0,00   76,00  0,00    0,00    0,00
>> 0,00   0,00     46208,50  92,42   0,00      38,00     500,00  100,00
>>
>> avg-cpu: %user %nice %system %iowait %steal %idle
>>          0,25  0,00  0,00    49,62   0,00   50,13
>>
>> Device   r/s   w/s   rkB/s  wkB/s  rrqm/s  wrqm/s  %rrqm
>> %wrqm  r_await  w_await   aqu-sz  rareq-sz  wareq-sz  svctm   %util
>> mmcblk1  0,00  3,00  0,00   76,00  0,00    0,00    0,00
>> 0,00   0,00     48521,67  145,56  0,00      25,33     333,33  100,00
>>
>> avg-cpu: %user %nice %system %iowait %steal %idle
>>          0,25  0,00  0,00    49,62   0,00   50,13
> So this is interesting. We can see the card is 100% busy. The IO submitted
> to the card is formed by small requests - 18-38 KB per request - and each
> request takes 0.3-0.5s to complete. So the resulting throughput is horrible
> - only tens of KB/s. Also we can see there are many IOs queued for the
> device in parallel (aqu-sz column). This does not look like load I would
> expect to be generated by download of a large file from the web.
>
> You have mentioned in previous emails that with dd(1) you can do couple
> MB/s writing to this card which is far more than these tens of KB/s. So the
> file download must be doing something which really destroys the IO pattern
> (and with mb_optimize_scan=0 ext4 happened to be better dealing with it and
> generating better IO pattern). Can you perhaps strace the process doing the
> download (or perhaps strace -f the whole rpi-update process) so that we can
> see how does the load generated on the filesystem look like? Thanks!

I can do that. But maybe the source of rpi-update would be more helpful?

https://github.com/raspberrypi/rpi-update/blob/master/rpi-update

>
> Honza
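
As a rough cross-check of the iostat samples quoted above (a sketch added for illustration, not something posted in the thread), the columns can be tied together with simple arithmetic: write throughput should be about w/s times wareq-sz, the average queue depth should be about w/s times w_await (Little's law), and w/s times svctm should give the fraction of time the device was busy. The function and variable names below are made up for this example; the numbers are copied from the three samples with decimal commas converted to points.

```python
# Illustrative sanity check of the quoted iostat samples.
# Names are hypothetical; only the numbers come from the mail.

# (w/s, wkB/s, w_await [ms], aqu-sz, wareq-sz [kB], svctm [ms])
SAMPLES = [
    (2.00, 36.00, 23189.50, 46.38, 18.00, 500.00),
    (2.00, 76.00, 46208.50, 92.42, 38.00, 500.00),
    (3.00, 76.00, 48521.67, 145.56, 25.33, 333.33),
]


def derive(w_per_s, w_await_ms, wareq_kb, svctm_ms):
    """Recompute throughput, queue depth and busy fraction from raw columns."""
    throughput_kb_s = w_per_s * wareq_kb          # requests/s * kB/request
    queue_depth = w_per_s * w_await_ms / 1000.0   # Little's law: rate * wait time
    busy_fraction = w_per_s * svctm_ms / 1000.0   # rate * service time
    return throughput_kb_s, queue_depth, busy_fraction


if __name__ == "__main__":
    for w_s, wkb_s, w_await, aqu_sz, wareq, svctm in SAMPLES:
        tput, depth, busy = derive(w_s, w_await, wareq, svctm)
        print(f"throughput {tput:6.2f} kB/s (reported {wkb_s:6.2f}), "
              f"queue {depth:7.2f} (reported {aqu_sz:7.2f}), "
              f"busy {busy * 100:5.1f}%")
```

The recomputed values should line up with the reported wkB/s, aqu-sz and %util columns to within rounding, which is consistent with the reading given in the quoted analysis: a saturated card, a few small writes per second, and a long queue of requests waiting behind them.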