Message-ID: <MN2PR11MB45667C6E534F7944BFA77684DB550@MN2PR11MB4566.namprd11.prod.outlook.com>
Date: Thu, 27 Aug 2020 12:09:21 +0000
From: "James Scriven (jamscriv)" <jamscriv@...co.com>
To: "linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Performance issue with recently_deleted() /no_journal with huge directories
Hi, I'm working on migrating a workload from kernel 2.6 to 4.18 (RHEL6 to RHEL8).
The use case is a build farm that has a basic workflow of:
1) rm -rf a large directory tree (about 2M files ~ 200GB) to free some space
2) download and extract a large tarball (about 2M files ~ 200GB)
3) perform a build in the extracted directory tree
Repeat...
We've been using an ext4 filesystem with no journal for maximum performance, with great success. We're not very concerned about losing data, but do want some persistence, which is why we don't just use tmpfs for this. We keep a number of these large workspaces around as long as space permits, and delete the oldest (step 1) just before starting a new one (step 2).
After migrating to the newer kernel, we are seeing performance degradation when we expand the tar, which I suspect is caused by inode allocation trying to find an unused inode that was not deleted too recently. Since we have 2M inodes that *have* been recently deleted, every one of the 2M new inodes has to search through all 2M of the deleted ones (or something close to that; my understanding of the ext4 code is limited).
The simple testcase below shows the issue. My questions are: is this edge case already understood, and is there a good way to regain the lost performance? Adding a "sync + drop_caches", or a sufficiently long sleep, between steps 1 and 2 works around the issue, but is not ideal.
# With each iteration of the test case, the number of recently deleted inodes increases and performance degrades.
$ uname -a
Linux sjc-asr-bm-470 4.18.0-147.3.1.el8_1.x86_64 #1 SMP Wed Nov 27 01:11:44 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ sync; echo 3 | sudo tee /proc/sys/vm/drop_caches; for x in {1..10}; do rm -rf dirtree; mkdir dirtree; time mkdir dirtree/{1..50000}; done
3
real 0m1.796s
user 0m0.041s
sys 0m1.528s
real 0m3.280s
user 0m0.035s
sys 0m3.235s
real 0m4.329s
user 0m0.035s
sys 0m4.279s
real 0m6.033s
user 0m0.032s
sys 0m5.988s
real 0m7.303s
user 0m0.041s
sys 0m7.246s
real 0m7.874s
user 0m0.032s
sys 0m7.826s
real 0m9.376s
user 0m0.036s
sys 0m9.323s
real 0m9.979s
user 0m0.052s
sys 0m9.910s
real 0m9.808s
user 0m0.037s
sys 0m9.749s
real 0m9.067s
user 0m0.038s
sys 0m9.011s
$ uname -a
Linux sjc-asr-bm-100 2.6.32-754.17.1.el6.x86_64 #1 SMP Thu Jun 20 11:47:12 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux
$ sync; echo 3 | sudo tee /proc/sys/vm/drop_caches; for x in {1..10}; do rm -rf dirtree; mkdir dirtree; time mkdir dirtree/{1..50000}; done
3
real 0m0.724s
user 0m0.031s
sys 0m0.693s
real 0m0.762s
user 0m0.041s
sys 0m0.721s
real 0m0.717s
user 0m0.043s
sys 0m0.674s
real 0m0.712s
user 0m0.037s
sys 0m0.675s
real 0m0.749s
user 0m0.036s
sys 0m0.712s
real 0m0.710s
user 0m0.040s
sys 0m0.670s
real 0m0.746s
user 0m0.038s
sys 0m0.707s
real 0m0.715s
user 0m0.034s
sys 0m0.680s
real 0m0.747s
user 0m0.040s
sys 0m0.707s
real 0m0.732s
user 0m0.042s
sys 0m0.690s
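For reference, the "sync + drop_caches" workaround mentioned above could be folded into the test loop roughly as follows. This is only a sketch, assuming bash and sudo access; the helper names flush_caches and reset_tree are illustrative and are not part of the original test case.

```shell
#!/usr/bin/env bash

# Hypothetical helper: write back dirty data, then drop the page cache,
# dentries and inodes. Writing to drop_caches requires root.
flush_caches() {
    sync
    echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
}

# Hypothetical helper: delete the old tree (step 1), apply the workaround,
# and leave an empty directory ready for the next run (step 2).
reset_tree() {
    local dir=$1
    rm -rf -- "$dir"
    flush_caches
    mkdir -- "$dir"
}
```

With these defined, the timed loop stays the same as in the test case, so that only the mkdir calls are measured:

$ for x in {1..10}; do reset_tree dirtree; time mkdir dirtree/{1..50000}; done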