Message-ID: <CAB8ipk8xXWzc_PurHwVPd9-azN4B5OD=MYQP+Oze1kmbom0avQ@mail.gmail.com>
Date:   Fri, 18 Nov 2022 20:08:54 +0800
From:   Xuewen Yan <xuewen.yan94@...il.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Xuewen Yan <xuewen.yan@...soc.com>, peterz@...radead.org,
        mingo@...hat.com, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
        vschneid@...hat.com, ke.wang@...soc.com,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/rt: Use cpu_active_mask to prevent
 rto_push_irq_work's dead loop

On Fri, Nov 18, 2022 at 6:16 AM Steven Rostedt <rostedt@...dmis.org> wrote:
>
> On Mon, 14 Nov 2022 20:04:53 +0800
> Xuewen Yan <xuewen.yan@...soc.com> wrote:
>
> > +++ b/kernel/sched/rt.c
> > @@ -2219,6 +2219,7 @@ static int rto_next_cpu(struct root_domain *rd)
> >  {
> >       int next;
> >       int cpu;
> > +     struct cpumask tmp_cpumask;
>
> If you have a machine with thousands of CPUs, this will likely kill the
> stack.
Ah, I did not take that into account. Thanks!
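
To put a rough number on the stack concern: a cpumask is a bitmap with one bit per possible CPU, so its size grows with the configured CPU count. The sketch below is a user-space illustration only, not kernel code. The names `BITMAP_LONGS_MODEL`, `cpumask_model_8192` and `cpumask_model_bytes` are made up for this example, and 8192 CPUs is an assumed configuration chosen to match "thousands of CPUs".

```c
#include <limits.h>
#include <stddef.h>

/* Number of unsigned longs needed to hold nbits bits (rounded up). */
#define BITS_PER_LONG_MODEL (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_LONGS_MODEL(nbits) \
	(((nbits) + BITS_PER_LONG_MODEL - 1) / BITS_PER_LONG_MODEL)

/* Hypothetical stand-in for a CPU mask sized for 8192 possible CPUs. */
struct cpumask_model_8192 {
	unsigned long bits[BITMAP_LONGS_MODEL(8192)];
};

/* Bytes one such mask would occupy as a local variable on the stack. */
static size_t cpumask_model_bytes(size_t nr_cpus)
{
	return BITMAP_LONGS_MODEL(nr_cpus) * sizeof(unsigned long);
}
```

Under that assumption, `cpumask_model_bytes(8192)` gives 1024 bytes for a single on-stack mask, which is a large share of a kernel stack frame budget.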

>
> >
> >       /*
> >        * When starting the IPI RT pushing, the rto_cpu is set to -1,
> > @@ -2238,6 +2239,11 @@ static int rto_next_cpu(struct root_domain *rd)
> >               /* When rto_cpu is -1 this acts like cpumask_first() */
> >               cpu = cpumask_next(rd->rto_cpu, rd->rto_mask);
> >
> > +             cpumask_and(&tmp_cpumask, rd->rto_mask, cpu_active_mask);
> > +             if (rd->rto_cpu == -1 && cpumask_weight(&tmp_cpumask) == 1 &&
> > +                 cpumask_test_cpu(smp_processor_id(), &tmp_cpumask))
> > +                     break;
> > +
>
> Kill the above.
>
> >               rd->rto_cpu = cpu;
> >
> >               if (cpu < nr_cpu_ids) {
>
> Why not just add here:
>
>                         if (!cpumask_test_cpu(cpu, cpu_active_mask))
>                                 continue;
>                         return cpu;
>                 }
>
> ?
Let's consider this scenario:
cpu_online_mask is 0x03 (cpu0 and cpu1), cpu_active_mask is 0x01
(cpu0 only), rto_mask is 0x01, the rto cpu is cpu0, and the irq_work is
running on cpu0. On the first pass through the loop, rto_cpu is reset
to -1, but rto_loop is still less than rto_loop_next, so the loop runs
again. Because rto_cpu is -1, the next rto cpu is still cpu0. As a
result, cpu0 would push RT tasks to cpu1 (the inactive cpu) while
running in the irq_work.

So we should check whether the current cpu (the only active cpu) is
the cpu that the next loop would pick.
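
The scenario above can be replayed with a small user-space model. This is only a sketch under stated assumptions, not the kernel function: `struct rd_model`, `mask_next()`, `mask_weight()` and `rto_next_cpu_model()` are hypothetical names, masks are plain `unsigned long` bitmaps, and there is no locking or atomics. The active-CPU filter from your suggestion is always applied; the `extra_check` flag toggles the check from my patch.

```c
#include <stdbool.h>

#define NR_CPUS_MODEL 8

/* Hypothetical, simplified stand-in for the fields of struct root_domain. */
struct rd_model {
	unsigned long rto_mask;
	int rto_cpu;
	int rto_loop;
	int rto_loop_next;
};

/* First set bit strictly above prev; prev == -1 acts like "first bit". */
static int mask_next(int prev, unsigned long mask)
{
	for (int cpu = prev + 1; cpu < NR_CPUS_MODEL; cpu++)
		if (mask & (1UL << cpu))
			return cpu;
	return NR_CPUS_MODEL;
}

/* Number of set bits in mask. */
static int mask_weight(unsigned long mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

static int rto_next_cpu_model(struct rd_model *rd, unsigned long active_mask,
			      int this_cpu, bool extra_check)
{
	for (;;) {
		int cpu = mask_next(rd->rto_cpu, rd->rto_mask);

		if (extra_check) {
			/* Stop if this CPU is the only active overloaded CPU. */
			unsigned long tmp = rd->rto_mask & active_mask;

			if (rd->rto_cpu == -1 && mask_weight(tmp) == 1 &&
			    (tmp & (1UL << this_cpu)))
				break;
		}

		rd->rto_cpu = cpu;

		if (cpu < NR_CPUS_MODEL) {
			/* Skip CPUs that are not active. */
			if (!(active_mask & (1UL << cpu)))
				continue;
			return cpu;
		}

		rd->rto_cpu = -1;

		/* Another pass was requested while this one was running. */
		int next = rd->rto_loop_next;

		if (rd->rto_loop == next)
			break;
		rd->rto_loop = next;
	}
	return -1;
}
```

Starting from rto_mask = 0x01, active mask = 0x01, rto_cpu = 0, rto_loop = 0, rto_loop_next = 1 and this_cpu = 0, the model returns 0 with `extra_check` off (cpu0 is picked again even though it is active and passes the filter) and -1 with `extra_check` on.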

Thanks!

>
> -- Steve
