Message-ID: <Y0/+ieCymugrjheC@localhost.localdomain>
Date:   Wed, 19 Oct 2022 15:41:29 +0200
From:   Juri Lelli <juri.lelli@...hat.com>
To:     Joel Fernandes <joel@...lfernandes.org>
Cc:     Qais Yousef <qyousef@...alina.io>,
        Connor O'Brien <connoro@...gle.com>,
        linux-kernel@...r.kernel.org, kernel-team@...roid.com,
        John Stultz <jstultz@...gle.com>,
        Qais Yousef <qais.yousef@....com>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Valentin Schneider <vschneid@...hat.com>,
        Will Deacon <will@...nel.org>,
        Waiman Long <longman@...hat.com>,
        Boqun Feng <boqun.feng@...il.com>,
        "Paul E . McKenney" <paulmck@...nel.org>
Subject: Re: [RFC PATCH 00/11] Reviving the Proxy Execution Series

On 19/10/22 08:23, Joel Fernandes wrote:
> 
> 
> > On Oct 19, 2022, at 7:43 AM, Qais Yousef <qyousef@...alina.io> wrote:
> > 
> > On 10/17/22 02:23, Joel Fernandes wrote:
> > 
> >> I ran a test to check CFS time sharing. The accounting in top is confusing,
> >> but ftrace confirms that the proxying is happening.
> >> 
> >> Task A - pid 122
> >> Task B - pid 123
> >> Task C - pid 121
> >> Task D - pid 124
> >> 
> >> Here D and B just spin all the time. C is lock owner (in-kernel mutex) and
> >> spins all the time, while A blocks on the same in-kernel mutex and remains
> >> blocked.
> >> 
> >> Then I ran "top -H" while the test was running, which gives the output below.
> >> The first column is the PID, and the fourth-last column (before the task
> >> annotation) is the CPU percentage.
> >> 
> >> Without PE:
> >>  121 root      20   0   99496   4   0 R  33.6   0.0   0:02.76 t  (task C)
> >>  123 root      20   0   99496   4   0 R  33.2   0.0   0:02.75 t  (task B)
> >>  124 root      20   0   99496   4   0 R  33.2   0.0   0:02.75 t  (task D)
> >> 
> >> With PE:
> >>  PID
> >>  122 root      20   0   99496   4   0 D  25.3   0.0   0:22.21 t  (task A)
> >>  121 root      20   0   99496   4   0 R  25.0   0.0   0:22.20 t  (task C)
> >>  123 root      20   0   99496   4   0 R  25.0   0.0   0:22.20 t  (task B)
> >>  124 root      20   0   99496   4   0 R  25.0   0.0   0:22.20 t  (task D)
> >> 
> >> With PE, I was expecting 2 threads with 25% and 1 thread with 50%. Instead I
> >> get 4 threads with 25% in top. Ftrace confirms that the D-state task is
> >> in fact not running and is proxying to the owner task, so everything seems
> >> to be working correctly, but the accounting is confusing: the D-state task
> >> appears to take 25% CPU when it is obviously "sleeping".
> >> 
> >> Yeah, yeah, I know the D-state task (A) is proxying for C (while being in
> >> the uninterruptible sleep state), so maybe it is OK then, but I did want to
> >> bring this up :-)
> > 
> > I seem to remember Valentin raised a similar issue about how the userspace
> > view can get confusing/misleading:
> > 
> >    https://www.youtube.com/watch?v=UQNOT20aCEg&t=3h21m41s
> 
> Thanks for the pointer! Glad to see the consensus was that this is not
> acceptable.
> 
> I think we ought to write a patch to fix the accounting, for this
> series. I propose adding 2 new entries to /proc/pid/stat, which I think
> Juri was also sort of alluding to:
> 
> 1. Donated time.
> 2. Proxied time.

Sounds like a useful addition, at least from a debugging point of view.

> User space can then add or subtract this, to calculate things
> correctly. Or just display them in new columns. I think it will also
> actually show how much the proxying is happening for a use case.

Guess we'll still need to stay backward compatible with old userspace,
though? Probably by reporting the owner as running while proxied (as in
the comparison with rtmutexes that Valentin showed).
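
To make the add/subtract idea concrete, here is a rough userspace sketch
of how a tool could combine the two proposed values. Nothing here exists
today: the "donated" and "proxied" fields, their names and their units are
assumptions for illustration only, and the numbers are made up to resemble
the "With PE" top output quoted above.

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    name: str
    accounted: float  # %CPU as top reports it today
    donated: float    # HYPOTHETICAL: share handed to a lock owner while blocked
    proxied: float    # HYPOTHETICAL: share received from blocked waiters


def executed_share(s: TaskSample) -> float:
    """CPU share the task itself spent executing (assumed formula)."""
    return s.accounted - s.donated + s.proxied


# Illustrative values only; the donated/proxied split is invented.
samples = [
    TaskSample("A (pid 122, D state)", 25.0, 25.0, 0.0),
    TaskSample("C (pid 121, owner)", 25.0, 0.0, 25.0),
    TaskSample("B (pid 123)", 25.0, 0.0, 0.0),
    TaskSample("D (pid 124)", 25.0, 0.0, 0.0),
]

for s in samples:
    print(f"{s.name:22s} accounted={s.accounted:5.1f} "
          f"executed={executed_share(s):5.1f}")
```

Under these assumptions the blocked waiter drops to 0 and the owner rises
to 50, which is the split Joel expected, while a tool that ignores the new
fields keeps seeing the existing accounted values unchanged.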
