Message-Id: <39998df7-8882-43ae-8c7e-936c24eb4041@app.fastmail.com>
Date:   Mon, 18 Sep 2023 20:21:11 -0700
From:   "Andy Lutomirski" <luto@...nel.org>
To:     "Ankur Arora" <ankur.a.arora@...cle.com>,
        "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
        linux-mm@...ck.org, "the arch/x86 maintainers" <x86@...nel.org>
Cc:     "Andrew Morton" <akpm@...ux-foundation.org>,
        "Borislav Petkov" <bp@...en8.de>,
        "Dave Hansen" <dave.hansen@...ux.intel.com>,
        "H. Peter Anvin" <hpa@...or.com>, "Ingo Molnar" <mingo@...hat.com>,
        juri.lelli@...hat.com, vincent.guittot@...aro.org,
        "Matthew Wilcox (Oracle)" <willy@...radead.org>, mgorman@...e.de,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        "Steven Rostedt" <rostedt@...dmis.org>,
        "Thomas Gleixner" <tglx@...utronix.de>,
        "Jon Grimm" <jon.grimm@....com>, "Bharata B Rao" <bharata@....com>,
        raghavendra.kt@....com, boris.ostrovsky@...cle.com,
        konrad.wilk@...cle.com,
        "Linus Torvalds" <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED

On Wed, Aug 30, 2023, at 11:49 AM, Ankur Arora wrote:
> On preempt_model_none() or preempt_model_voluntary() configurations
> rescheduling of kernel threads happens only when they allow it, and
> only at explicit preemption points, via calls to cond_resched() or
> similar.
>
> That leaves out contexts where it is not convenient to periodically
> call cond_resched() -- for instance when executing a potentially long
> running primitive (such as REP; STOSB.)
>

So I said this not too long ago in the context of Xen PV, but maybe it's time to ask it in general:

Why do we support anything other than full preempt?  I can think of two reasons, neither of which I think is very good:

1. Once upon a time, tracking preempt state was expensive.  But we fixed that.

2. Folklore suggests that there's a latency vs throughput tradeoff, and serious workloads, for some definition of serious, want throughput, so they should run without full preemption.

I think #2 is a bit silly.  If you want throughput, and you're busy waiting for a CPU that wants to run you but isn't doing so because it's running some low-priority non-preemptible thing (because preempt is set to none or voluntary), you're not getting throughput.  If you want to keep some I/O resource busy to get throughput, but you have excessive latency getting scheduled, you don't get throughput.

If the actual problem is that there's a workload that performs better when scheduling is delayed (which preempt=none and preempt=voluntary do, essentially at random), then maybe someone should identify that workload and fix the scheduler.

So maybe we should just very strongly encourage everyone to run with full preempt and simplify the kernel?
