[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFTs51XU1rKqSscdFK69RUCtaKrLJ1r5pdXSBJ1qc3=ZMNorBw@mail.gmail.com>
Date: Sun, 7 Nov 2021 20:09:04 -0800
From: Peter Oskolkov <posk@...k.io>
To: Thierry Delisle <tdelisle@...terloo.ca>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Andy Lutomirski <luto@...nel.org>, linux-mm@...ck.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-api@...r.kernel.org, Paul Turner <pjt@...gle.com>,
Ben Segall <bsegall@...gle.com>,
Peter Oskolkov <posk@...gle.com>,
Andrei Vagin <avagin@...gle.com>, Jann Horn <jannh@...gle.com>
Subject: Re: [PATCH v0.8 3/6] sched/umcg: implement UMCG syscalls
On Fri, Nov 5, 2021 at 4:48 PM Thierry Delisle <tdelisle@...terloo.ca> wrote:
>
> On 2021-11-04 3:58 p.m., Peter Oskolkov wrote:
> > +/*
> > + * Try to wake up. May be called with preempt_disable set. May be called
> > + * cross-process.
> > + *
> > + * Note: umcg_ttwu succeeds even if ttwu fails: see wait/wake state
> > + * ordering logic.
> > + */
> > +static int umcg_ttwu(u32 next_tid, int wake_flags)
> > +{
> > + struct task_struct *next;
> > +
> > + rcu_read_lock();
> > + next = find_task_by_vpid(next_tid);
> > + if (!next || !umcg_wakeup_allowed(next)) {
> > + rcu_read_unlock();
> > + return -ESRCH;
> > + }
> > +
> > + /* The result of ttwu below is ignored. */
> > + try_to_wake_up(next, TASK_NORMAL, wake_flags);
> > + rcu_read_unlock();
> > +
> > + return 0;
> > +}
>
> Doesn't try_to_wake_up return different values based on whether or not a
> task
> was woken up? I think it could be useful to propagate that result instead of
> always returning zero. Even if it only helps for debugging.
>
The protocol is to mark the wakee as UMCG_TASK_RUNNING and call ttwu;
if you look at umcg_idle_loop(), you will see that the ordering is set
so that the wakee will indeed either wake or not go to sleep at all,
regardless of whether ttwu succeeds. So in terms of correctness we do
not need to check ttwu result; returning anything different here "for
debugging"? Maybe. If/when this is merged, feel free to send a patch.
Or maybe I'll add this in a next patchset version...
>
>
> > +static bool enqueue_idle_worker(struct umcg_task __user *ut_worker)
> > +{
> > + u64 __user *node = &ut_worker->idle_workers_ptr;
> > + u64 __user *head_ptr;
> > + u64 first = (u64)node;
> > + u64 head;
> > +
> > + if (get_user(head, node) || !head)
> > + return false;
> > +
> > + head_ptr = (u64 __user *)head;
> > +
> > + /* Mark the worker as pending. */
> > + if (put_user(UMCG_IDLE_NODE_PENDING, node))
> > + return false;
> > +
> > + /* Make the head point to the worker. */
> > + if (xchg_user_64(head_ptr, &first))
> > + return false;
> > +
> > + /* Make the worker point to the previous head. */
> > + if (put_user(first, node))
> > + return false;
> > +
> > + return true;
> > +}
>
> If the last two operation return false, whichever task tries to consume the
> list could deadlock, depending on whether or not the ensuing
> force_sig(SIGKILL); reaches the consuming task. Does the force_sig kill
> the task or the entire process. Is it possible to consume this list from a
> different process that shares the memory? I'm wondering if the last
> two "return false" should attempt to retract the
> UMCG_IDLE_NODE_PENDING.
The function will return false only on a memory access error, and the
whole process will be killed as a result. At least this is the
intention. Do you think the code does anything different?
Powered by blists - more mailing lists