Date: Wed, 26 Jun 2024 17:17:48 +0200
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Julia Lawall <julia.lawall@...ia.fr>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Dietmar Eggemann <dietmar.eggemann@....com>, Mel Gorman <mgorman@...e.de>, 
	K Prateek Nayak <kprateek.nayak@....com>, linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: softirq

On Wed, 26 Jun 2024 at 07:37, Julia Lawall <julia.lawall@...ia.fr> wrote:
>
> Hello,
>
> I'm not sure I understand how softirqs work.  I see the code:
>
> open_softirq(SCHED_SOFTIRQ, sched_balance_softirq);
>
> Intuitively, I would expect that sched_balance_softirq would be run by
> ksoftirqd.  That is, I would expect ksoftirqd to be scheduled

By default, SCHED_SOFTIRQ and the other softirqs run in interrupt
context. ksoftirqd is woken up only in some cases, e.g. when we have
spent too much time processing softirqs in interrupt context, or when
a softirq is raised outside interrupt context.

> (sched_switch event), then the various actions of sched_balance_softirq to
> be executed, and then ksoftirqd to be unscheduled (another sched_switch
> event).
>
> But in practice, I see the code of sched_balance_softirq being executed
> by the idle task, before the ksoftirqd is scheduled (see core 40):

What wakes up ksoftirqd? And which softirq finally runs in ksoftirqd?

>
>           <idle>-0     [040]  3611.432554: softirq_entry:        vec=7 [action=SCHED]
>           <idle>-0     [040]  3611.432554: bputs:                sched_balance_softirq: starting nohz
>           <idle>-0     [040]  3611.432554: bputs:                sched_balance_softirq: starting _nohz_idle_balance
>           bt.B.x-12022 [047]  3611.432554: softirq_entry:        vec=1 [action=TIMER]
>           <idle>-0     [040]  3611.432554: bputs:                _nohz_idle_balance.isra.0: searching for a cpu
>           bt.B.x-12033 [003]  3611.432554: softirq_entry:        vec=7 [action=SCHED]
>           <idle>-0     [040]  3611.432554: bputs:                sched_balance_softirq: ending _nohz_idle_balance
>           bt.B.x-12052 [011]  3611.432554: softirq_entry:        vec=7 [action=SCHED]
>           <idle>-0     [040]  3611.432554: bputs:                sched_balance_softirq: nohz returns true ending soft irq
>           <idle>-0     [040]  3611.432554: softirq_exit:         vec=7 [action=SCHED]
>
> For example, idle seems to be running the code in _nohz_idle_balance.
>
> I updated the code of _nohz_idle_balance as follows:
>
> trace_printk("searching for a cpu\n");
>         for_each_cpu_wrap(balance_cpu,  nohz.idle_cpus_mask, this_cpu+1) {
>                 if (!idle_cpu(balance_cpu))
>                         continue;
> trace_printk("found an idle cpu\n");
>
> It prints searching for a cpu, but not found an idle cpu, because the
> ksoftirqd on the core's runqueue makes the core not idle.  This makes the
> whole softirq seem fairly useless when the only idle core is the one
> raising the soft irq.

The typical behavior is:

CPUA                                   CPUB
                                       do_idle
                                         while (!need_resched()) {
                                         ...

kick_ilb
  smp_call_function_single_async(CPUB)
    send_call_function_single_ipi
      raise_ipi  --------------------->    cpuidle exit event
                                           irq_handler_entry
                                             ipi_handler
                                               raise sched_softirq
                                           irq_handler_exit
                                           softirq_entry
                                             sched_balance_softirq
                                               _nohz_idle_balance
                                           softirq_exit
                                           cpuidle_enter event

The softirq is handled in interrupt context right after the irq
handler, and CPUB never leaves the while (!need_resched()) loop.

In your case, I suspect that you have a race with the polling mode
and the fact that you leave the while (!need_resched()) loop and call
flush_smp_call_function_queue().

We don't use polling on arm64, so I can't even try to reproduce your case.

>
> This is all for the same scenario that I have discussed previously, where
> there are two sockets with an overload of one thread on one and an underload
> of one thread on the other, and all the threads have been marked by numa
> balancing as preferring to be where they are.  Now I am trying Prateek's
> patch series.
>
> thanks,
> julia
