Message-ID: <CAJuCfpGLTkvH6CzQXz4oD39_xtArBt3upk-F=gf4-LPoswagGg@mail.gmail.com>
Date:   Mon, 17 May 2021 12:33:38 -0700
From:   Suren Baghdasaryan <surenb@...gle.com>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Huangzhaoyang <huangzhaoyang@...il.com>,
        Zhaoyang Huang <zhaoyang.huang@...soc.com>,
        Ziwei Dai <ziwei.dai@...soc.com>, Ke Wang <ke.wang@...soc.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [[RFC]PATCH] psi: fix race between psi_trigger_create and psimon

On Mon, May 17, 2021 at 11:36 AM Johannes Weiner <hannes@...xchg.org> wrote:
>
> CC Suren

Thanks!

>
> On Mon, May 17, 2021 at 05:04:09PM +0800, Huangzhaoyang wrote:
> > From: Zhaoyang Huang <zhaoyang.huang@...soc.com>
> >
> > Race detected between psimon_new and psimon_old as shown below, which
> > cause panic by accessing invalid psi_system->poll_wait->wait_queue_entry
> > and psi_system->poll_timer->entry->next. It is not necessary to reinit
> > resource of psi_system when psi_trigger_create.

The resources of psi_system will not be reinitialized, because
init_waitqueue_head(&group->poll_wait) and friends are called only
during the creation of the first trigger for that group (see this
condition: https://elixir.bootlin.com/linux/latest/source/kernel/sched/psi.c#L1119).

> >
> > psi_trigger_create      psimon_new     psimon_old
> >  init_waitqueue_head                    finish_wait
> >                                           spin_lock(lock_old)
> >       spin_lock_init(lock_new)
> >  wake_up_process(psimon_new)
> >
> >                         finish_wait
> >                           spin_lock(lock_new)
> >                             list_del       list_del

Could you please clarify this race a bit? I'm having trouble
deciphering this diagram. I'm guessing psimon_new/psimon_old refer to
a new trigger being created while an old one is being deleted, so it
seems like a race between psi_trigger_create/psi_trigger_destroy. The
combination of trigger_lock and RCU should be protecting us from that
but maybe I missed something?
I'm excluding the possibility of a race between psi_trigger_create and
an existing trigger on the same group, because the codepath calling
init_waitqueue_head(&group->poll_wait) is taken only when the first
trigger for that group is created. Therefore, if there is an existing
trigger in that group, that codepath will not be taken.
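
To illustrate the point above, here is a minimal userspace sketch of the
"initialize only for the first trigger" guard. It is not kernel code:
struct model_group, model_trigger_create() and the counters are
hypothetical stand-ins, and the sketch assumes the guard is a check of
whether the group already has a poll task.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for struct psi_group; not kernel code. */
struct model_group {
	void *poll_task;      /* non-NULL once the first trigger exists */
	int waitqueue_inits;  /* how many times the wait queue was initialized */
	int nr_triggers;      /* how many triggers were created */
};

/*
 * Models the guarded path in psi_trigger_create(): the poll resources
 * are set up only when the group has no poll task yet, i.e. only for
 * the first trigger created on that group.
 */
static void model_trigger_create(struct model_group *g)
{
	static int dummy_task;  /* placeholder for the psimon kthread */

	if (g->poll_task == NULL) {
		g->waitqueue_inits++;        /* stands in for init_waitqueue_head() */
		g->poll_task = &dummy_task;  /* stands in for rcu_assign_pointer() */
	}
	g->nr_triggers++;
}
```

Calling model_trigger_create() twice on a zero-initialized group leaves
waitqueue_inits at 1, which is why a second trigger on the same group
should not reinitialize poll_wait.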

> >
> > Signed-off-by: ziwei.dai <ziwei.dai@...soc.com>
> > Signed-off-by: ke.wang <ke.wang@...soc.com>
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@...soc.com>
> > ---
> >  kernel/sched/psi.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > index cc25a3c..d00e585 100644
> > --- a/kernel/sched/psi.c
> > +++ b/kernel/sched/psi.c
> > @@ -182,6 +182,8 @@ struct psi_group psi_system = {
> >
> >  static void psi_avgs_work(struct work_struct *work);
> >
> > +static void poll_timer_fn(struct timer_list *t);
> > +
> >  static void group_init(struct psi_group *group)
> >  {
> >       int cpu;
> > @@ -201,6 +203,8 @@ static void group_init(struct psi_group *group)
> >       memset(group->polling_total, 0, sizeof(group->polling_total));
> >       group->polling_next_update = ULLONG_MAX;
> >       group->polling_until = 0;
> > +     init_waitqueue_head(&group->poll_wait);
> > +     timer_setup(&group->poll_timer, poll_timer_fn, 0);
>
> This makes sense.

Well, this means we would initialize trigger resources in every psi
group even if the user never creates any triggers. The current logic
initializes them only when the first trigger in the group gets created.
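
The tradeoff can be sketched in userspace as follows. This is only a
model under stated assumptions: struct demo_group and all function
names are hypothetical, and poll_inited merely stands in for "the wait
queue and timer have been set up".

```c
#include <stddef.h>

/* Hypothetical model of a psi group; not the kernel structure. */
struct demo_group {
	int poll_inited;  /* 1 once the poll resources have been set up */
	int has_trigger;  /* 1 once at least one trigger exists */
};

/* Eager variant (the proposed patch): set up poll resources in group_init(). */
static void eager_group_init(struct demo_group *g)
{
	g->poll_inited = 1;
	g->has_trigger = 0;
}

/* Lazy variant (current logic): group_init() leaves poll resources alone. */
static void lazy_group_init(struct demo_group *g)
{
	g->poll_inited = 0;
	g->has_trigger = 0;
}

/* Lazy variant: the first trigger on a group sets up the poll resources. */
static void lazy_trigger_create(struct demo_group *g)
{
	if (!g->has_trigger)
		g->poll_inited = 1;
	g->has_trigger = 1;
}

/* Count how many groups have had their poll resources initialized. */
static size_t count_poll_inited(const struct demo_group *groups, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		count += groups[i].poll_inited ? 1 : 0;
	return count;
}
```

With four groups and a trigger on only one of them, the eager variant
has initialized all four groups while the lazy variant has initialized
one.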

>
> >       rcu_assign_pointer(group->poll_task, NULL);
> >  }
> >
> > @@ -1157,7 +1161,6 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
> >                       return ERR_CAST(task);
> >               }
> >               atomic_set(&group->poll_wakeup, 0);
> > -             init_waitqueue_head(&group->poll_wait);
> >               wake_up_process(task);
> >               timer_setup(&group->poll_timer, poll_timer_fn, 0);
>
> This now looks unnecessary?
>
> >               rcu_assign_pointer(group->poll_task, task);
> > @@ -1233,7 +1236,6 @@ static void psi_trigger_destroy(struct kref *ref)
> >                * But it might have been already scheduled before
> >                * that - deschedule it cleanly before destroying it.
> >                */
> > -             del_timer_sync(&group->poll_timer);
>
> And this looks wrong. Did you mean to delete the timer_setup() line
> instead?

I would like to get more details about this race before trying to fix
it. Please clarify.
Thanks!
