Message-ID: <20190430115551.GT2623@hirez.programming.kicks-ass.net>
Date:   Tue, 30 Apr 2019 13:55:51 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     "Paul E. McKenney" <paulmck@...ux.ibm.com>
Cc:     linux-kernel@...r.kernel.org, andrea.parri@...rulasolutions.com
Subject: Re: Question about sched_setaffinity()

On Tue, Apr 30, 2019 at 03:51:30AM -0700, Paul E. McKenney wrote:
> > Then I'm not entirely sure how we can return 0 and not run on the
> > expected CPU. If we look at __set_cpus_allowed_ptr(), the only paths out
> > to 0 are:
> > 
> >  - if the mask didn't change
> >  - if we already run inside the new mask
> >  - if we migrated ourself with the stop-task
> >  - if we're not in fact running
> > 
> > That last case should never trigger in your circumstances, since @p ==
> > current and current is obviously running. But for completeness, the
> > wakeup of @p would do the task placement in that case.
> 
> Are there some diagnostics I could add that would help track this down,
> be it my bug or yours?

Maybe a limited function trace combined with the scheduling tracepoints
would give a clue.

Trouble is, I forever forget how to set that up properly :/ Maybe
something along these lines:

$ trace-cmd record -p function_graph -g sched_setaffinity -g migration_cpu_stop \
    -e sched_migrate_task -e sched_switch -e sched_wakeup

Also useful would be:

echo 1 > /proc/sys/kernel/traceoff_on_warning

which ensures tracing stops the moment the failure triggers a warning.
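
For a quick userspace sanity check of the property under discussion (a
sched_setaffinity() call on the current task that returns 0 should leave
us running inside the new mask), something like the sketch below might
help. It is not from this thread: pin_and_check() is a hypothetical
helper name, and it assumes glibc's sched_getcpu() is available.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the thread): restrict the calling
 * thread to target_cpu, then ask which CPU we are actually on.
 *
 * Returns  0 if we are running on target_cpu,
 *          1 if sched_setaffinity() returned 0 but we are elsewhere,
 *         -1 if either call failed.
 */
static int pin_and_check(int target_cpu)
{
	cpu_set_t mask;
	int cpu;

	CPU_ZERO(&mask);
	CPU_SET(target_cpu, &mask);

	/* pid 0 means the calling thread, i.e. @p == current. */
	if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
		perror("sched_setaffinity");
		return -1;
	}

	cpu = sched_getcpu();
	if (cpu < 0) {
		perror("sched_getcpu");
		return -1;
	}

	if (cpu != target_cpu) {
		fprintf(stderr, "expected CPU %d, running on CPU %d\n",
			target_cpu, cpu);
		return 1;
	}

	return 0;
}
```

Calling pin_and_check() from main() with a CPU that is in the current
affinity mask, while the trace-cmd recording above is active, would show
whether a return value of 1 lines up with anything unusual in the
sched_migrate_task and sched_switch events.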