Message-ID: <ADA482EF-F2FF-473A-9585-CD5925FA8BC1@oracle.com>
Date: Mon, 9 Dec 2024 20:36:54 +0000
From: Prakash Sangappa <prakash.sangappa@...cle.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"rostedt@...dmis.org" <rostedt@...dmis.org>,
"tglx@...utronix.de"
<tglx@...utronix.de>,
Daniel Jordan <daniel.m.jordan@...cle.com>
Subject: Re: [RFC PATCH 0/4] Scheduler time slice extension
> On Nov 14, 2024, at 11:41 AM, Prakash Sangappa <prakash.sangappa@...cle.com> wrote:
>
>
>
>> On Nov 14, 2024, at 2:28 AM, Peter Zijlstra <peterz@...radead.org> wrote:
>>
>> On Wed, Nov 13, 2024 at 08:10:52PM +0000, Prakash Sangappa wrote:
>>>
>>>
>>>> On Nov 13, 2024, at 11:36 AM, Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
>>>>
>>>> On 2024-11-13 13:50, Peter Zijlstra wrote:
>>>>> On Wed, Nov 13, 2024 at 12:01:22AM +0000, Prakash Sangappa wrote:
>>>>>> This patch set implements the above-mentioned 50us extension time as posted
>>>>>> by Peter. But instead of using restartable sequences as the API to set the
>>>>>> flag requesting the extension, this patch proposes a new API using a
>>>>>> per-thread shared structure, described below. This shared structure is
>>>>>> accessible from both user space and the kernel. The user thread sets the
>>>>>> flag in this shared structure to request an execution time extension.
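
[ For reference, the user-side usage the series intends is roughly the
  sketch below. The structure layout, field names, and how the per-thread
  page gets mapped are illustrative placeholders, not the RFC's actual ABI:

    #include <sched.h>
    #include <stdint.h>

    /* Placeholder layout: a per-thread page shared with the kernel. */
    struct thread_shared {
            volatile uint32_t sched_delay;  /* set by user, cleared by kernel */
    };

    /* Assume each thread mapped its shared structure at startup. */
    static __thread struct thread_shared *tshared;

    static void critical_begin(void)
    {
            tshared->sched_delay = 1;       /* request up to ~50us extension */
    }

    static void critical_end(void)
    {
            tshared->sched_delay = 0;
            sched_yield();                  /* give back a granted extension */
    }
]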
>>>>> But why -- we already have rseq, glibc uses it by default. Why add yet
>>>>> another thing?
>>>>
>>>> Indeed, what I'm not seeing in this RFC patch series cover letter is an
>>>> explanation that justifies adding yet another per-thread memory area
>>>> shared between kernel and userspace when we have extensible rseq
>>>> already.
>>>
>>> It mainly provides pinned memory, which can be useful for future use
>>> cases where updating user memory from kernel context needs to be fast
>>> or must avoid page faults.
>>
>> 'might be useful' is not a good enough justification. Also, I don't
>> think you actually need this.
>
> Will get back with database benchmark results using the rseq API for scheduler time extension.

Sorry about the delay in responding.

Here are the database Swingbench numbers, now including results with the
rseq API.

Test results:
=============

Test system:   2-socket AMD Genoa
Benchmark:     Swingbench, a standard database benchmark
Configuration: cached run (database files on tmpfs), 1000 clients

Baseline (without sched time extension):  99K SQL exec/sec

With sched time extension:
  Shared structure API:        153K SQL exec/sec (previously reported)
                               55% improvement in throughput
  Restartable sequences API:   147K SQL exec/sec
                               48% improvement in throughput

Both APIs show a good performance benefit from scheduler time extension.
The shared structure API is faster: about 4% higher throughput than rseq
(a 7 percentage point difference in improvement over baseline).
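
For the rseq runs, the extension request goes through the per-thread rseq
area that glibc registers, roughly along the lines of the sketch below.
The flag name is a placeholder here; see Peter's patch linked below for
the actual bit and ABI:

    #include <sched.h>
    #include <stddef.h>
    #include <sys/rseq.h>           /* glibc: struct rseq, __rseq_offset */

    /* Placeholder bit; the real flag is defined by the rseq patch. */
    #define RSEQ_DELAY_RESCHED      (1U << 0)

    static struct rseq *rseq_area(void)
    {
            /* glibc registers a per-thread struct rseq at this offset
             * from the thread pointer. */
            return (struct rseq *)((char *)__builtin_thread_pointer() +
                                   __rseq_offset);
    }

    static void critical_begin(void)
    {
            rseq_area()->flags |= RSEQ_DELAY_RESCHED;
    }

    static void critical_end(void)
    {
            rseq_area()->flags &= ~RSEQ_DELAY_RESCHED;
            sched_yield();          /* give back a granted extension */
    }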
>
>>
>> See:
>>
>> https://lkml.kernel.org/r/20220113233940.3608440-4-posk@google.com
>>
>> for a more elaborate scheme.
>>
>>>> Peter, was there anything fundamentally wrong with your approach based
>>>> on rseq? https://lore.kernel.org/lkml/20231030132949.GA38123@noisy.programming.kicks-ass.net
>>>>
>>>> The main thing I wonder is whether loading the rseq delay-resched flag
>>>> on return to userspace is too late in your patch. Also, I'm not sure it
>>>> is realistic to require that no system calls be done within the time
>>>> extension slice. If we have this scenario:
>>>
>>> I am also not sure we need to prevent system calls in this scenario.
>>> Was that restriction mainly because the restartable sequences API
>>> implements it that way?
>>
>> No, the whole premise of delaying resched was because people think that
>> syscalls are too slow. If you do not think this, then you shouldn't be
>> using this.
>
> Agree.
>
> Thanks,
> -Prakash