linux-kernel - Re: Re: [PATCH v2 5/7] sched: Implement shared runqueue in CFS

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20230713040519.GB260054@maniforge>
Date:   Wed, 12 Jul 2023 23:05:19 -0500
From:   David Vernet <void@...ifault.com>
To:     Abel Wu <wuyun.abel@...edance.com>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com, gautham.shenoy@....com,
        kprateek.nayak@....com, aaron.lu@...el.com, clm@...a.com,
        tj@...nel.org, roman.gushchin@...ux.dev, kernel-team@...a.com,
        linux-kernel@...r.kernel.org
Subject: Re: Re: [PATCH v2 5/7] sched: Implement shared runqueue in CFS

On Thu, Jul 13, 2023 at 11:43:50AM +0800, Abel Wu wrote:
> On 7/13/23 6:16 AM, David Vernet wrote:
> > On Wed, Jul 12, 2023 at 06:47:26PM +0800, Abel Wu wrote:
> > > > + *
> > > > + * HOW
> > > > + * ===
> > > > + *
> > > > + * An shared_runq is comprised of a list, and a spinlock for synchronization.
> > > > + * Given that the critical section for a shared_runq is typically a fast list
> > > > + * operation, and that the shared_runq is localized to a single LLC, the
> > > > + * spinlock will typically only be contended on workloads that do little else
> > > > + * other than hammer the runqueue.
> > > 
> > > Would there be scalability issues on large LLCs?
> > 
> > See the next patch in the series [0] where we shard the per-LLC shared
> > runqueues to avoid contention.
> > 
> > [0]: https://lore.kernel.org/lkml/20230710200342.358255-7-void@manifault.com/
> 
> Sorry, I should have read the cover letter more carefully. By sharding,
> the LLC is partitioned into several zones, hence contention is relieved.
> But sharding itself might be tricky. Making the SMT siblings not cross
> shards, as suggested by Peter, is generally a good thing. But I wonder
> if there is any workload might benefit from other sharding form.

IMO for now the best thing to do is keep things simple until we observe
it being a problem in practice. Or to at least plan to address it in a
follow-on patch set when we add support for a dynamic shard sizing.
Proper shard sizing is required to do optimal SMT placement anyways, and
in general will likely have more of an impact on performance.

> > > 
> > > > +
> > > > +	task_rq_unlock(src_rq, p, &src_rf);
> > > > +
> > > > +	raw_spin_rq_lock(rq);
> > > > +	rq_repin_lock(rq, rf);
> > > 
> > > By making it looks more ugly, we can save some cycles..
> > > 
> > > 	if (src_rq != rq) {
> > > 		task_rq_unlock(src_rq, p, &src_rf);
> > > 	} else {
> > > 		rq_unpin_lock(src_rq, src_rf);
> > > 		raw_spin_unlock_irqrestore(&p->pi_lock, src_rf.flags);
> > > 		rq_repin_lock(rq, rf);
> > > 	}
> 
> I forgot the repin part when src_rq != rq, but I'm sure you already got
> my point :)

Yep, I got your main point which was to avoid the extra dropping and
acquiring of the same rq spinlock. FYI for posterity / anyone else
reading, the missing repin isn't sufficient on its own for the src_rq !=
rq path. We also need to reacquire the rq lock.

Thanks again for the helpful reviews.

- David