Message-ID: <4984b4f5-7bc5-6109-2523-77265141b3d2@google.com>
Date: Wed, 14 Dec 2022 18:20:11 -0500
From: Barret Rhoden <brho@...gle.com>
To: Tejun Heo <tj@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
Josh Don <joshdon@...gle.com>, torvalds@...ux-foundation.org,
mingo@...hat.com, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
bristot@...hat.com, vschneid@...hat.com, ast@...nel.org,
daniel@...earbox.net, andrii@...nel.org, martin.lau@...nel.org,
pjt@...gle.com, derkling@...gle.com, haoluo@...gle.com,
dvernet@...a.com, dschatzberg@...a.com, dskarlat@...cmu.edu,
riel@...riel.com, linux-kernel@...r.kernel.org,
bpf@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCHSET RFC] sched: Implement BPF extensible scheduler class
On 12/14/22 17:23, Tejun Heo wrote:
> Google guys probably have a lot to say here too and there may be many
> commonalities, but here's how things are on our end.
your email pretty much captures my experiences from the google side. in
fact, i think i'll save it for the next time someone asks me to
summarize the challenges with both kernel rollouts and testing changes
on workloads. =)
>> I was given to believe this was a fairly rapid process.
>
> Going back to the first phase where we're experimenting in a more controlled
> environment. Yes, that is a faster process but only in comparison to the
> second phase. Some controlled experiments, the faster ones, usually take
> several hours to obtain a meaningful result. It just takes a while for
> production workloads to start, jit-compile all the hot code paths, warm up
> caches and so on. Others, unfortunately, take a lot longer to ramp up to the
> degree where they can be compared against production numbers. Some of the
> benchmarks stretch multiple days.
>
> With SCX, we can just keep hotswapping and tuning the scheduler
> behavior getting results in tens of minutes instead of multiple hours and
> without worrying about crashing the test machines.
for testing sched policies on one of our bigger apps, the O(hours)
kernel reboot vs O(minutes) reload of a BPF scheduler is a pain. but
that's only for a single machine; it can be much worse on a full cluster.
full-cluster tests are a different beast. we are one of many groups
that want to do testing, and we have to reserve a time on their cluster.
but to change the kernel, it actually took us weeks to coordinate a
kernel change on the app's large testing cluster - essentially since we
were using an unqualified kernel, we 'blocked' all of the other testing.
> it's way easier and faster to have a running test environment setup and
> iterate through scheduling behavior changes without worrying about crashing
> the machine than having to cycle and re-setup test setup for each iteration.
i'm a newcomer to BPF, but for me the "interaction with live machine" is
a major BPF feature, both in SCX and also more broadly with the various
tracing tools and other BPF uses. (not to mention the per-workload or
per-machine customization that BPF enables, but that's a separate
discussion).
thanks,
barret