Message-ID: <CAJD7tkZrp=4zWvjE9_010TAG1T_crCbf9P64UzJABspgcrGPKg@mail.gmail.com>
Date: Mon, 27 Mar 2023 16:23:13 -0700
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Shakeel Butt <shakeelb@...gle.com>
Cc: Tejun Heo <tj@...nel.org>, Josef Bacik <josef@...icpanda.com>,
Jens Axboe <axboe@...nel.dk>,
Zefan Li <lizefan.x@...edance.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Muchun Song <muchun.song@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>,
Vasily Averin <vasily.averin@...ux.dev>,
cgroups@...r.kernel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
bpf@...r.kernel.org
Subject: Re: [RFC PATCH 1/7] cgroup: rstat: only disable interrupts for the
percpu lock
On Fri, Mar 24, 2023 at 9:46 PM Shakeel Butt <shakeelb@...gle.com> wrote:
>
> On Fri, Mar 24, 2023 at 9:37 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
> >
> > On Fri, Mar 24, 2023 at 9:31 PM Shakeel Butt <shakeelb@...gle.com> wrote:
> > >
> > > On Fri, Mar 24, 2023 at 7:18 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
> > > >
> > > [...]
> > > > Any ideas here are welcome!
> > > >
> > >
> > > Let's move forward. It seems like we are not going to reach an
> > > agreement on making cgroup_rstat_lock a non-irq lock. However there is
> > > agreement on the memcg code of not flushing in irq context and the
> > > cleanup Johannes has requested. Let's proceed with those for now. We
> > > can come back to cgroup_rstat_lock later if we still see issues in
> > > production.
> >
> > Even if we do not flush from irq context, we still flush from atomic
> > contexts that will currently hold the lock with irqs disabled
> > throughout the entire flush sequence. A primary purpose of this patch
> > is to avoid that.
> >
> > We can either:
> > (a) Proceed with this patch's approach of making cgroup_rstat_lock a
> > non-irq lock.
> > (b) Proceed with Tejun's suggestion of always releasing and
> > reacquiring the lock at CPU boundaries, even for atomic flushes (if
> > the spinlock needs a break ofc).
> > (c) Something else.
>
> (d) keep the status quo regarding cgroup_rstat_lock
> (e) decouple the discussion of cgroup_rstat_lock from the agreed
> improvements. Send the patches for the agreed ones and continue
> discussing cgroup_rstat_lock.

Ah, I lost sight of the fact that the rest of the patch series does
not strictly depend on this patch. I will respin the rest of the patch
series separately. Thanks, Shakeel.

Meanwhile, it would be useful to reach an agreement here on how to stop
holding cgroup_rstat_lock for long periods with irqs disabled in atomic
contexts.

Tejun, if having the lock be non-irq is a non-starter for you, I can
send a patch that instead gives up the lock and reacquires it at every
CPU boundary unconditionally -- or perhaps every N CPU boundaries to
avoid excessively releasing and reacquiring the lock.

Something like:

static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep)
{
	...
	for_each_possible_cpu(cpu) {
		...
		/* Always yield at CPU boundaries to enable irqs */
		spin_unlock_irq(&cgroup_rstat_lock);
		/* if @may_sleep, play nice and yield if necessary */
		if (may_sleep)
			cond_resched();
		spin_lock_irq(&cgroup_rstat_lock);
	}
}
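
The "every N CPU boundaries" variant would only differ in batching the
yield. A rough sketch, where CGROUP_RSTAT_FLUSH_BATCH is a made-up
constant purely for illustration (the actual value would need
experimentation):

/* Hypothetical batch size, for illustration only */
#define CGROUP_RSTAT_FLUSH_BATCH	8

static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep)
{
	int cpu, nr_flushed = 0;
	...
	for_each_possible_cpu(cpu) {
		...
		/* Only yield every CGROUP_RSTAT_FLUSH_BATCH CPUs */
		if (++nr_flushed % CGROUP_RSTAT_FLUSH_BATCH)
			continue;

		spin_unlock_irq(&cgroup_rstat_lock);
		/* if @may_sleep, play nice and yield if necessary */
		if (may_sleep)
			cond_resched();
		spin_lock_irq(&cgroup_rstat_lock);
	}
}

That keeps the irqs-disabled window bounded by the batch size instead of
the full flush, at the cost of holding the lock slightly longer between
yields.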

If you have other ideas to avoid disabling irqs for the entire flush
sequence, I am also open to that.

Thanks!