[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZWX0IMfR3S3rRzen@tiehlicka>
Date: Tue, 28 Nov 2023 15:07:28 +0100
From: Michal Hocko <mhocko@...e.com>
To: Gregory Price <gourry.memverge@...il.com>
Cc: linux-mm@...ck.org, linux-doc@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-api@...r.kernel.org,
linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org, arnd@...db.de, tglx@...utronix.de,
luto@...nel.org, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
tj@...nel.org, ying.huang@...el.com,
Gregory Price <gregory.price@...verge.com>
Subject: Re: [RFC PATCH 04/11] mm/mempolicy: modify get_mempolicy call stack
to take a task argument
On Wed 22-11-23 16:11:53, Gregory Price wrote:
[...]
> @@ -928,7 +929,16 @@ static long do_get_mempolicy(int *policy, nodemask_t *nmask,
> * vma/shared policy at addr is NULL. We
> * want to return MPOL_DEFAULT in this case.
> */
> - mm = current->mm;
> + if (task == current) {
> + /*
> + * original behavior allows a kernel task changing its
> + * own policy to avoid the condition in get_task_mm,
> + * so we'll directly access
> + */
> + mm = task->mm;
> + mmget(mm);
Do we actually have any kernel thread that would call this? Does it
actually make sense to support?
> + } else
> + mm = get_task_mm(task);
> mmap_read_lock(mm);
> vma = vma_lookup(mm, addr);
> if (!vma) {
> @@ -947,8 +957,10 @@ static long do_get_mempolicy(int *policy, nodemask_t *nmask,
> return -EINVAL;
> else {
> /* take a reference of the task policy now */
> - pol = current->mempolicy;
> + task_lock(task);
> + pol = task->mempolicy;
> mpol_get(pol);
> + task_unlock(task);
> }
>
> if (!pol) {
> @@ -962,12 +974,13 @@ static long do_get_mempolicy(int *policy, nodemask_t *nmask,
> vma = NULL;
> mmap_read_unlock(mm);
> err = lookup_node(mm, addr);
> + mmput(mm);
> if (err < 0)
> goto out;
> *policy = err;
> - } else if (pol == current->mempolicy &&
> + } else if (pol == task->mempolicy &&
> pol->mode == MPOL_INTERLEAVE) {
> - *policy = next_node_in(current->il_prev, pol->nodes);
> + *policy = next_node_in(task->il_prev, pol->nodes);
This is racy without task_lock which I do not think is helde but it also
seems this is not a big deal. pol is ref. counted so it won't go away
and if the task->mempolicy changes then the return value could be bogus
but this seems acceptable. It would be good to put a comment here that
this is actually deliberate.
--
Michal Hocko
SUSE Labs
Powered by blists - more mailing lists