Date:   Wed, 19 Jan 2022 09:52:30 -0800
From:   Peter Oskolkov <posk@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Peter Oskolkov <posk@...k.io>, mingo@...hat.com,
        tglx@...utronix.de, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, linux-api@...r.kernel.org, x86@...nel.org,
        pjt@...gle.com, avagin@...gle.com, jannh@...gle.com,
        tdelisle@...terloo.ca
Subject: Re: [RFC][PATCH 3/3] sched: User Mode Concurrency Groups

On Wed, Jan 19, 2022 at 1:00 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Jan 18, 2022 at 10:19:21AM -0800, Peter Oskolkov wrote:
>
> > =========== signals and the general approach
> >
> > My version of the patchset has all of these things working. What it
> > does not have, compared to the new approach we are discussing here,
> > is runqueues per server and proper signal handling (and potential
> > integration with proxy execution).
> >
> > Runqueues per server, in the LAZY mode, are easy to emulate in my patchset:
> > nothing prevents userspace from partitioning workers among servers, and
> > having the servers that "own" their workers be pointed at by idle_server_tid_ptr.
> >
> > The only thing that is missing is proper handling of signals. But my patchset
> > does ensure a single running worker per server, has pagefaults and preemptions
> > sorted out, etc. Basically, everything works except signals. This patchset
> > has issues with pagefaults,
>
> Already fixed pagefaults per:
>
>   YeGvovSckivQnKX8@...ez.programming.kicks-ass.net

Could you please post an updated RFC when you have a chance? Thanks!

>
> > worker timeouts
>
> I still have no clear answer as to what you actually want there.
>
> > , worker-to-worker context
> > switches (do workers move runqueues when they context switch?), etc.
>
> Not in kernel, if they need to be migrated, userspace needs to do that.
>
> > And my patchset now actually looks smaller and simpler, on the kernel side,
> > than what this patchset is shaping up to be.
> >
> > What if I fix signals in my patchset? I think the way you deal with signals
> > will work in my approach equally well; I'll also use umcg_kick() to preempt
> > workers instead of sending them a signal.
> >
> > What do you think?
>
> I still absolutely hate how long you do page pinning, it *will* wreck
> things like CMA which are somewhat latency critical for silly things
> like Android camera apps and who knows what else.
>
> You've also forgotten about this:
>
>   YcWutpu7BDeG+dQ2@...ez.programming.kicks-ass.net
>
> That's not optional given how you're using page-pinning. Also, I think
> we need at least one direct access to the page after getting the pin in
> order to make it work.
>
> That also very much limits it to Anon pages.

I can use the same mm/page pinning strategy as you do. But then our
patchsets will be quite similar, I guess, the main difference being
server wakeups with RUNNING workers vs. the "lazy" idle_server_tid_ptr.
So OK, let's continue with your approach. If you could post a new RFC
with the memory/paging fixes in it, I'll then add worker timeouts, as
I outlined in a separate email about 30 minutes ago, and continue with
my integration/testing.
