Message-ID: <70943eab978e4482b9fa4f68119bc8ea@amazon.com>
Date: Thu, 24 Aug 2023 03:52:04 +0000
From: "Lu, Davina" <davinalu@...zon.com>
To: Theodore Ts'o <tytso@....edu>
CC: "Bhatnagar, Rishabh" <risbhat@...zon.com>, Jan Kara <jack@...e.cz>,
"jack@...e.com" <jack@...e.com>,
"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Park, SeongJae" <sjpark@...zon.com>
Subject: RE: Tasks stuck jbd2 for a long time
> Thanks for the details. This is something that I am interested in potentially merging, since for a sufficiently conversion-heavy workload (assuming the conversion is happening
> across multiple inodes, and not just a huge number of random writes into a single fallocated file), limiting the number of kernel threads to one CPU isn't always going to be the right thing.
> The reason why we had done it this way was because at the time, the only choices that we had were between a single kernel thread, or spawning a kernel thread for every single CPU --
> which for a very high-core-count system, consumed a huge amount of system resources. This is no longer the case with the new Concurrency Managed Workqueue (cmwq), but we never
> did the experiment to make sure cmwq didn't have surprising gotchas.
Thank you for the detailed explanation.
> I won't have time to look at this before the next merge window, but what I'm hoping to look at is your patch at [2], with two changes:
> a) Drop the _WQ_ORDERED flag, since it is an internal flag.
> b) Just pass in 0 for max_active instead of "num_active_cpus() > 1 ?
> num_active_cpus() : 1", for two reasons. Num_active_cpus() doesn't
> take into account CPU hotplugs (for example, if you have a
>    dynamically adjustable VM shape where the number of active CPU's
> might change over time). Is there a reason why we need to set that
> limit?
> Do you see any potential problem with these changes?
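[For reference, a minimal sketch of what the suggested change (a) and (b) would look like at the workqueue allocation site -- the exact call site and flags here are assumptions, not the actual patch:]

```c
/* Hypothetical sketch, not the actual patch. Dropping the internal
 * _WQ_ORDERED flag and passing 0 for max_active lets cmwq use its
 * default concurrency limit, which also adapts to CPU hotplug
 * (unlike a value computed once from num_active_cpus()). */
sbi->rsv_conversion_wq =
	alloc_workqueue("ext4-rsv-conversion",
			WQ_MEM_RECLAIM | WQ_UNBOUND, 0);
```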
Sorry for the late response. After internal discussion, I can continue with this patch. These two points are easy to change; I will also run xfstests for ext4 and run BMS in the RDS environment as a quick verification. We can change num_active_cpus() to 0. The reason I added it was that during fio testing, with the max active number going to ~50, we didn't see this issue -- but it is not necessary. I will get Oleg's opinion later offline.
Thanks,
Davina