Message-ID: <bug-217965-13602-0pfvbRXyV2@https.bugzilla.kernel.org/>
Date: Mon, 18 Mar 2024 04:45:26 +0000
From: bugzilla-daemon@...nel.org
To: linux-ext4@...r.kernel.org
Subject: [Bug 217965] ext4(?) regression since 6.5.0 on sata hdd
https://bugzilla.kernel.org/show_bug.cgi?id=217965
--- Comment #69 from Eyal Lebedinsky (bugzilla@...l.emu.id.au) ---
Seeing this last comment (#68), I wondered if the fix is included in my latest
kernel. Or do I have a different problem?
After updating my Fedora 38 system to kernel 6.7.9 I did NOT do the remount and
repeated my original problematic test.
$ uname -a
Linux e7.eyal.emu.id.au 6.7.9-100.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Mar  6 19:31:16 UTC 2024 x86_64 GNU/Linux
$ mount|grep data1
/dev/md127 on /data1 type ext4 (rw,noatime,stripe=640)
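The stripe=640 in the mount options is ext4's stripe-width in filesystem blocks. As a sanity check, that value is consistent with a 512 KiB md chunk, 4 KiB ext4 blocks and 5 data disks (a 7-disk RAID6); note the chunk size and block size here are assumptions, not taken from the report:

```shell
# Hypothetical geometry: 512 KiB chunk, 4 KiB blocks, 7-disk RAID6
# (5 data disks).  These values are assumed, not from the report.
chunk_kb=512; block_kb=4; data_disks=5
stride=$((chunk_kb / block_kb))        # ext4 stride: blocks per md chunk
stripe_width=$((stride * data_disks))  # blocks per full RAID stripe
echo "stride=$stride stripe_width=$stripe_width"
```

stripe_width comes out at 640, matching the mount option; other chunk/disk combinations can of course produce the same value.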
I rsync a directory (262 GB, 8,911,139 files) from an SSD to a RAID6 array
(7x12 TB => 60 TB fs). Both filesystems are ext4.
$ sudo time rsync -aHSK --stats --progress /data/no-backup/old-backups/tapes /data1/no-backup/really-not/old-backups/
What I see is that once the Dirty value in meminfo reaches the limit (about
4 GB), the kworker kicks in at 100% CPU.
At that point iostat shows activity on the array dropping from about 160 MB/s
to very little, with an occasional burst of a few MB/s.
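That ~4 GB ceiling on Dirty is consistent with the kernel's writeback thresholds. A quick way to inspect them (a sketch; the sysctl names are standard, but the computed limit is only an approximation of the kernel's internal dirtyable-memory calculation):

```shell
# Show the writeback knobs.  The *_bytes values override the *_ratio
# values when non-zero; dirty_ratio is a percentage of dirtyable memory.
for f in dirty_ratio dirty_background_ratio dirty_bytes dirty_background_bytes; do
    printf '%-24s %s\n' "$f" "$(cat /proc/sys/vm/$f)"
done

# Rough estimate of the absolute limit implied by dirty_ratio.
awk -v ratio="$(cat /proc/sys/vm/dirty_ratio)" \
    '/^MemAvailable:/ { printf "approx dirty limit: %d kB\n", $2 * ratio / 100 }' \
    /proc/meminfo
```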
```
10:12:33 2024-03-18
10:12:33 Device        tps   kB_read/s   kB_wrtn/s   kB_dscd/s   kB_read   kB_wrtn   kB_dscd
[trimmed]
10:17:13 md127        0.00        0.00        0.00        0.00         0         0         0
10:17:23 md127       34.40      137.60        0.00        0.00      1376         0         0   <<< rsync starts
10:17:33 md127     2346.30      747.20     8638.00        0.00      7472     86380         0
10:17:43 md127     7067.10      431.60   133644.40        0.00      4316   1336444         0
10:17:53 md127     1692.80      578.80     7015.20        0.00      5788     70152         0
10:18:03 md127     2439.20      169.60    32071.20        0.00      1696    320712         0
10:18:13 md127      274.00        4.00     2242.00        0.00        40     22420         0
10:18:23 md127     3172.70       17.60    56828.00        0.00       176    568280         0
10:18:33 md127      416.20        0.80     1664.80        0.00         8     16648         0
10:18:43 md127       18.70        0.40       76.40        0.00         4       764         0
10:18:53 md127        6.50        0.00       30.80        0.00         0       308         0
10:19:03 md127        4.80        0.00       40.00        0.00         0       400         0
10:19:13 md127        5.70        0.00       63.60        0.00         0       636         0
10:19:23 md127        2.60        0.00       54.80        0.00         0       548         0
10:19:33 md127        7.40        0.00      243.20        0.00         0      2432         0
10:19:43 md127        5.20        0.00       75.60        0.00         0       756         0
10:19:53 md127        3.80        0.00       20.40        0.00         0       204         0
10:20:03 md127        2.00        0.00       13.20        0.00         0       132         0
10:20:13 md127        3.90        0.00       29.20        0.00         0       292         0
10:20:23 md127        3.80        0.00       19.60        0.00         0       196         0
At the same time meminfo shows:
2024-03-18 10:17:04  Dirty:   11220 kB   Buffers:  829988 kB   MemFree:  670576 kB
2024-03-18 10:17:14  Dirty:   10784 kB   Buffers:  830016 kB   MemFree:  631500 kB
2024-03-18 10:17:24  Dirty:  750616 kB   Buffers:  875592 kB   MemFree:  654236 kB
2024-03-18 10:17:34  Dirty: 2757048 kB   Buffers:  972948 kB   MemFree:  600636 kB
2024-03-18 10:17:44  Dirty: 2855196 kB   Buffers: 1046736 kB   MemFree:  551940 kB
2024-03-18 10:17:54  Dirty: 4104524 kB   Buffers: 1127200 kB   MemFree:  538136 kB
2024-03-18 10:18:04  Dirty: 4390504 kB   Buffers: 1155588 kB   MemFree:  600828 kB
2024-03-18 10:18:14  Dirty: 4518280 kB   Buffers: 1161916 kB   MemFree:  580176 kB
2024-03-18 10:18:24  Dirty: 4356952 kB   Buffers: 1185872 kB   MemFree:  543072 kB
2024-03-18 10:18:34  Dirty: 4559504 kB   Buffers: 1196396 kB   MemFree:  518872 kB
2024-03-18 10:18:44  Dirty: 4567212 kB   Buffers: 1197060 kB   MemFree:  606572 kB
2024-03-18 10:18:54  Dirty: 4567592 kB   Buffers: 1197084 kB   MemFree:  611440 kB
... and stays there until ...
I then killed the copy (14,296 MB copied). Writing to the array remained very
low, the kernel thread stayed at 100% CPU, and the dirty memory drained very
slowly. Access to the array is now slow, with occasional hiccups.
2024-03-18 10:35:24  Dirty: 4484720 kB   Buffers: 4984308 kB   MemFree:  820532 kB   <<< rsync killed
2024-03-18 10:35:34  Dirty: 4484436 kB   Buffers: 4984348 kB   MemFree:  851288 kB
2024-03-18 10:35:44  Dirty: 4483992 kB   Buffers: 4984368 kB   MemFree:  817516 kB
2024-03-18 10:35:54  Dirty: 4483780 kB   Buffers: 4984400 kB   MemFree:  803156 kB
2024-03-18 10:36:04  Dirty: 4483704 kB   Buffers: 4984460 kB   MemFree:  809956 kB
2024-03-18 10:36:14  Dirty: 4479416 kB   Buffers: 4984496 kB   MemFree:  832980 kB
2024-03-18 10:36:24  Dirty: 4474312 kB   Buffers: 4984528 kB   MemFree:  881464 kB
2024-03-18 10:36:34  Dirty: 4474260 kB   Buffers: 4984568 kB   MemFree:  840444 kB
2024-03-18 10:36:44  Dirty: 4474132 kB   Buffers: 4984600 kB   MemFree:  843524 kB
2024-03-18 10:36:54  Dirty: 4474292 kB   Buffers: 4984640 kB   MemFree:  841004 kB
2024-03-18 10:37:04  Dirty: 4474052 kB   Buffers: 4984680 kB   MemFree:  834148 kB
2024-03-18 10:37:14  Dirty: 4473688 kB   Buffers: 4984712 kB   MemFree:  853200 kB
2024-03-18 10:37:24  Dirty: 4473448 kB   Buffers: 4984752 kB   MemFree:  782540 kB
2024-03-18 10:37:34  Dirty: 4473288 kB   Buffers: 4984776 kB   MemFree:  786100 kB
2024-03-18 10:37:44  Dirty: 3871768 kB   Buffers: 4984972 kB   MemFree:  846020 kB
2024-03-18 10:37:54  Dirty: 3871612 kB   Buffers: 4985020 kB   MemFree:  826664 kB
2024-03-18 10:38:04  Dirty: 3871736 kB   Buffers: 4985052 kB   MemFree:  826084 kB
2024-03-18 10:38:14  Dirty: 3871184 kB   Buffers: 4985100 kB   MemFree:  876572 kB
2024-03-18 10:38:24  Dirty: 3870936 kB   Buffers: 4985140 kB   MemFree:  918944 kB
2024-03-18 10:38:34  Dirty: 3648080 kB   Buffers: 4985256 kB   MemFree:  901336 kB
2024-03-18 10:38:44  Dirty: 3556612 kB   Buffers: 4985316 kB   MemFree:  902532 kB
2024-03-18 10:38:54  Dirty: 3551636 kB   Buffers: 4985364 kB   MemFree:  837816 kB
2024-03-18 10:39:04  Dirty: 3551968 kB   Buffers: 4985468 kB   MemFree:  823392 kB
2024-03-18 10:39:14  Dirty: 2835648 kB   Buffers: 4985656 kB   MemFree:  629428 kB
...
2024-03-18 11:05:25  Dirty: 2737096 kB   Buffers: 4993860 kB   MemFree:  599424 kB   <<< 30m later
2024-03-18 11:35:25  Dirty: 2573748 kB   Buffers: 5001184 kB   MemFree:  612288 kB   <<< again
2024-03-18 12:05:26  Dirty: 2432572 kB   Buffers: 5007704 kB   MemFree:  663928 kB   <<< again
2024-03-18 12:35:27  Dirty: 2145348 kB   Buffers: 3707492 kB   MemFree:  588464 kB   <<< again
2024-03-18 13:05:27  Dirty: 2017848 kB   Buffers: 3718936 kB   MemFree:  585500 kB   <<< again
2024-03-18 13:35:28  Dirty: 1822436 kB   Buffers: 3746824 kB   MemFree:  565560 kB   <<< again
2024-03-18 14:05:29  Dirty: 1595088 kB   Buffers: 3799124 kB   MemFree:  544504 kB   <<< again
2024-03-18 14:35:29  Dirty: 1498416 kB   Buffers: 3816868 kB   MemFree: 3883524 kB   <<< again
2024-03-18 15:05:30  Dirty: 1387140 kB   Buffers: 3835824 kB   MemFree: 3266060 kB   <<< again
...
2024-03-18 15:32:51  Dirty: 1284940 kB   Buffers: 3850936 kB   MemFree: 3088904 kB   <<< finally
2024-03-18 15:33:01  Dirty:  933268 kB   Buffers: 3851144 kB   MemFree: 3098840 kB
2024-03-18 15:33:11  Dirty:   51956 kB   Buffers: 3851248 kB   MemFree: 3095456 kB
2024-03-18 15:33:21  Dirty:   51968 kB   Buffers: 3851284 kB   MemFree: 3059212 kB
2024-03-18 15:33:31  Dirty:   52032 kB   Buffers: 3851308 kB   MemFree: 3085352 kB
2024-03-18 15:33:41  Dirty:     172 kB   Buffers: 3851336 kB   MemFree: 3090912 kB
2024-03-18 15:33:51  Dirty:      64 kB   Buffers: 3851368 kB   MemFree: 3030584 kB
```
So it took over 5 hours to copy this small part (14 GB of 262 GB) of the data.
Is this expected? Is this a fundamental "feature" of ext4, or of RAID6?
When I did "sudo mount -o remount,stripe=0 /data1", the copy progressed
nicely at a good write speed.
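For scale, the drain rate implied by the timestamps above can be estimated with rough arithmetic (the ~298 minutes is read off the 10:35-15:33 window of the meminfo log):

```shell
# ~4.48 GB of dirty pages drained in roughly 298 minutes after rsync
# was killed (10:35:24 -> ~15:33, per the meminfo log above).
dirty_kb=4484720
seconds=$((298 * 60))
echo "average drain: $((dirty_kb / seconds)) kB/s"   # about 250 kB/s
```

That is three orders of magnitude below the ~160 MB/s the array sustained before the stall.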
I have logs of the progress of this test at 10s intervals.
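For reference, a sampler along these lines would reproduce such logs (a reconstruction, not the exact script used; the device name md127 is taken from this test):

```shell
#!/bin/sh
# Sample Dirty/Buffers/MemFree plus cumulative I/O counters for one
# device.  Not the original logging script -- just a plausible sketch.
DEV=md127

sample() {
    printf '%s  ' "$(date '+%Y-%m-%d %H:%M:%S')"
    grep -E '^(Dirty|Buffers|MemFree):' /proc/meminfo | tr '\n' ' '
    # /proc/diskstats: $3=device, $4=reads completed, $10=sectors written
    awk -v dev="$DEV" '$3 == dev { printf " reads:%s wr_sectors:%s", $4, $10 }' /proc/diskstats
    echo
}

# e.g.: while sleep 10; do sample; done >> array-test.log
```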
Regards
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.