Message-ID: <ZBwvZkDXfCBuWMe8@tpad>
Date:   Thu, 23 Mar 2023 07:52:22 -0300
From:   Marcelo Tosatti <mtosatti@...hat.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Christoph Lameter <cl@...ux.com>,
        Aaron Tomlin <atomlin@...mlin.com>,
        Frederic Weisbecker <frederic@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Russell King <linux@...linux.org.uk>,
        Huacai Chen <chenhuacai@...nel.org>,
        Heiko Carstens <hca@...ux.ibm.com>, x86@...nel.org,
        Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH v7 00/13] fold per-CPU vmstats remotely

On Thu, Mar 23, 2023 at 08:51:14AM +0100, Michal Hocko wrote:
> On Wed 22-03-23 11:20:55, Marcelo Tosatti wrote:
> > On Wed, Mar 22, 2023 at 02:35:20PM +0100, Michal Hocko wrote:
> [...]
> > > > "Performance details for the kworker interruption:
> > > > 
> > > > oslat   1094.456862: sys_mlock(start: 7f7ed0000b60, len: 1000)
> > > > oslat   1094.456971: workqueue_queue_work: ... function=vmstat_update ...
> > > > oslat   1094.456974: sched_switch: prev_comm=oslat ... ==> next_comm=kworker/5:1 ...
> > > > kworker 1094.456978: sched_switch: prev_comm=kworker/5:1 ==> next_comm=oslat ...
> > > > 
> > > > The example above shows an additional 7us for the
> > > > 
> > > >         oslat -> kworker -> oslat
> > > > 
> > > > switches. In the case of a virtualized CPU, and the vmstat_update
> > > > interruption in the host (of a qemu-kvm vcpu), the latency penalty
> > > > observed in the guest is higher than 50us, violating the acceptable
> > > > latency threshold for certain applications."
> > > 
> > > Yes, I have seen that but it doesn't really give a wider context to
> > > understand why those numbers matter.
> > 
> > OK.
> > 
> > "In the case of RAN, a MAC scheduler with TTI=1ms, this causes >100us
> > interruption observed in a guest (which is above the safety
> > threshold for this application)."
> > 
> > Is that OK?
> 
> This might be a sufficient information for somebody familiar with the
> matter (not me). So no, not enough. We need to hear a more complete
> story. 

Michal,

Please refer to 
https://www.diva-portal.org/smash/get/diva2:541460/FULLTEXT01.pdf

2.3 Channel Dependent Scheduling
The purpose of scheduling is to decide which terminal will transmit data on which set
of resource blocks with what transport format to use. The objective is to assign
resources to the terminal such that the quality of service (QoS) requirement is fulfilled.
Scheduling decision is taken every 1 ms by base station (termed as eNodeB) as the
same length of Transmission Time Interval (TTI) in LTE system.

In general:

https://en.wikipedia.org/wiki/Real-time_computing

Real-time computing (RTC) is the computer science term for hardware and
software systems subject to a "real-time constraint", for example from
event to system response.[1] Real-time programs must guarantee response
within specified time constraints, often referred to as "deadlines".[2]

Real-time responses are often understood to be in the order of
milliseconds, and sometimes microseconds. A system not specified as
operating in real time cannot usually guarantee a response within any
timeframe, although typical or expected response times may be given.
Real-time processing fails if not completed within a specified deadline
relative to an event; deadlines must always be met, regardless of system
load.

For example, the MAC scheduler must run every 1ms, and a certain amount
of computation takes place in each period (and must finish before the
next 1ms timeframe begins). A latency spike above 50us, as observed by
cyclictest, is considered a "failure".
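
To make the arithmetic concrete, here is a rough sketch (not taken from
the patchset or from any measurement tool) that recomputes the 7us
oslat -> kworker -> oslat overhead from the trace timestamps quoted
earlier and checks latencies against the 1ms TTI and the 50us threshold.
The function names and the per-TTI compute times are hypothetical,
chosen only for illustration.

```python
# Values taken from the discussion above.
TTI_US = 1000               # MAC scheduler period: 1 ms
LATENCY_THRESHOLD_US = 50   # a spike above this counts as a "failure"


def switch_overhead_us(start_s: float, end_s: float) -> int:
    """Elapsed time in microseconds between two trace timestamps (seconds)."""
    return round((end_s - start_s) * 1_000_000)


def is_failure(latency_us: int, threshold_us: int = LATENCY_THRESHOLD_US) -> bool:
    """A latency spike strictly above the threshold is a failure."""
    return latency_us > threshold_us


def meets_deadline(compute_us: int, latency_us: int, period_us: int = TTI_US) -> bool:
    """True if the per-period work plus the interruption fits in one period."""
    return compute_us + latency_us <= period_us


# Timestamps from the quoted trace: vmstat_update queued at 1094.456971,
# oslat running again at 1094.456978.
host_overhead = switch_overhead_us(1094.456971, 1094.456978)
print(host_overhead, is_failure(host_overhead))   # bare-metal case
print(is_failure(100))                            # >100us seen in a guest

# Hypothetical compute times per TTI, for illustration only.
print(meets_deadline(900, 100), meets_deadline(950, 100))
```

Note that the 50us threshold is much tighter than the 1ms period: an
interruption can count as a failure even when the work for that TTI
would still have finished on time.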

