linux-kernel - Re: [PATCH 14/31] sched_ext: Implement BPF extensible scheduler class

Message-ID: <Y4o9gV2v8eyI1arK@slm.duckdns.org>
Date:   Fri, 2 Dec 2022 08:01:37 -1000
From:   Tejun Heo <tj@...nel.org>
To:     Barret Rhoden <brho@...gle.com>
Cc:     torvalds@...ux-foundation.org, mingo@...hat.com,
        peterz@...radead.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com, ast@...nel.org,
        daniel@...earbox.net, andrii@...nel.org, martin.lau@...nel.org,
        joshdon@...gle.com, pjt@...gle.com, derkling@...gle.com,
        haoluo@...gle.com, dvernet@...a.com, dschatzberg@...a.com,
        dskarlat@...cmu.edu, riel@...riel.com,
        linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
        kernel-team@...a.com
Subject: Re: [PATCH 14/31] sched_ext: Implement BPF extensible scheduler class

Hello,

On Fri, Dec 02, 2022 at 12:08:27PM -0500, Barret Rhoden wrote:
> you might be able to avoid the double_lock_balance() by using
> move_queued_task(), which internally hands off the old rq lock and returns
> with the new rq lock.
> 
> the pattern for consume_dispatch_q() would be something like:
> 
> - kfunc from bpf, with this_rq lock held
> - notice p isn't on this_rq, goto remote_rq:
> - do sched_ext accounting, like the this_rq->dsq->nr--
> - unlock this_rq
> - p_rq = task_rq_lock(p)
> - double_check p->rq didn't change to this_rq during that unlock
> - new_rq = move_queued_task(p_rq, rf, p, new_cpu)
> - do sched_ext accounting like new_rq->dsq->nr++
> - unlock new_rq
> - relock the original this_rq
> - return to bpf
> 
> you still end up grabbing both locks, but just not at the same time.

Yeah, this probably would look better than the current double lock dancing,
especially in the finish_dispatch() path.
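
For reference, the hand-off pattern quoted above can be modeled in userspace
with pthread mutexes. This is only a toy sketch under stated assumptions, not
kernel code: every toy_* name is hypothetical, nr_queued stands in for the
rq->dsq->nr style accounting, the migrating flag is a rough stand-in for
TASK_ON_RQ_MIGRATING, and the accounting is folded into the move helper for
brevity. The point it illustrates is that the two rq locks are taken one
after the other and never held together.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-ins; these are NOT the kernel's struct rq / task_struct. */
struct toy_rq {
	pthread_mutex_t lock;
	int cpu;
	int nr_queued;			/* assumed analogue of rq->dsq->nr */
};

struct toy_task {
	_Atomic(struct toy_rq *) rq;	/* rq the task currently sits on */
	atomic_bool migrating;		/* assumed analogue of TASK_ON_RQ_MIGRATING */
};

/*
 * Rough analogue of task_rq_lock(): lock whatever rq p is on, and retry if
 * p moved (or is mid-migration) between reading p->rq and taking the lock.
 */
static struct toy_rq *toy_task_rq_lock(struct toy_task *p)
{
	for (;;) {
		struct toy_rq *rq = atomic_load(&p->rq);

		pthread_mutex_lock(&rq->lock);
		if (rq == atomic_load(&p->rq) && !atomic_load(&p->migrating))
			return rq;
		pthread_mutex_unlock(&rq->lock);
		sched_yield();
	}
}

/*
 * Rough analogue of move_queued_task(): entered with src locked, returns
 * with dst locked. The src lock is dropped before the dst lock is taken.
 */
static struct toy_rq *toy_move_queued_task(struct toy_rq *src,
					   struct toy_task *p,
					   struct toy_rq *dst)
{
	atomic_store(&p->migrating, true);	/* lock out other lockers */
	src->nr_queued--;
	atomic_store(&p->rq, dst);
	pthread_mutex_unlock(&src->lock);

	pthread_mutex_lock(&dst->lock);
	dst->nr_queued++;
	atomic_store(&p->migrating, false);
	return dst;
}

/*
 * Entered and exited with this_rq locked, like a kfunc called from BPF.
 * Returns true if p was moved, false if p was already on dst.
 */
static bool toy_move_to(struct toy_rq *this_rq, struct toy_task *p,
			struct toy_rq *dst)
{
	struct toy_rq *p_rq;
	bool moved = false;

	pthread_mutex_unlock(&this_rq->lock);	/* unlock this_rq */
	p_rq = toy_task_rq_lock(p);		/* p_rq = task_rq_lock(p) */
	if (p_rq != dst) {			/* double check after relock */
		p_rq = toy_move_queued_task(p_rq, p, dst);
		moved = true;
	}
	pthread_mutex_unlock(&p_rq->lock);	/* unlock new_rq */
	pthread_mutex_lock(&this_rq->lock);	/* relock the original this_rq */
	return moved;
}
```

A caller would take this_rq's lock, call toy_move_to(this_rq, p, dst), and
continue with this_rq still locked. Whether the real code can rely on a
migrating-style flag alone, without ->holding_cpu, is exactly the open
question in this thread; the sketch does not settle it.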

> plus, task_rq_lock() takes the guesswork out of whether you're getting p's
> rq lock or not.  it looks like you're using the holding_cpu to handle the
> race where p moves cpus after you read task_rq(p) but before you lock that
> task_rq.  maybe you can drop the whole concept of the holding_cpu?

->holding_cpu is there basically to detect intervening dequeues, so if we
lock them out with TASK_ON_RQ_MIGRATING, we might be able to drop it. I need
to look into it more though. Things get pretty subtle around there, so I
could easily be missing something. I'll try this and let you know how it goes.

Thanks.

-- 
tejun