Message-Id: <201803092118.CCH34154.HOVLQFOFMJtFOS@I-love.SAKURA.ne.jp>
Date: Fri, 9 Mar 2018 21:18:28 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: gkohli@...eaurora.org, rientjes@...gle.com
Cc: akpm@...ux-foundation.org, mhocko@...e.com,
kirill.shutemov@...ux.intel.com, aarcange@...hat.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-arm-msm@...r.kernel.org
Subject: Re: [PATCH] mm: oom: Fix race condition between oom_badness and do_exit of task
Kohli, Gaurav wrote:
> On 3/9/2018 4:18 PM, Tetsuo Handa wrote:
>
> > Kohli, Gaurav wrote:
> >>> t->alloc_lock is still held when leaving find_lock_task_mm(), which means
> >>> that t->mm != NULL. But nothing prevents t from setting t->mm = NULL at
> >>> exit_mm() from do_exit() and calling exit_creds() from __put_task_struct(t)
> >>> after task_unlock(t) is called. The race window seems difficult to hit. Maybe
> >>> something preempted the caller, because oom_badness() runs outside the RCU
> >>> read-side critical section once find_lock_task_mm() returns when called
> >>> from proc_oom_score().
> >> Hi Tetsuo,
> >>
> >> Yes, it is not easy to reproduce (seen only twice so far), and I agree with
> >> your analysis. But David is already fixing this in a different way,
> >> which also looks better to me:
> >>
> >> https://patchwork.kernel.org/patch/10265641/
> >>
> > Yes, I'm aware of that patch.
> >
> >> But if we need to keep that code, we have to bump up the task
> >> reference; that is the only fix I can think of now.
> > I don't think so, for I think it is safe to call
> > has_capability_noaudit(p) with p->alloc_lock held.
> >
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index f2e7dfb..4efcfb8 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -222,7 +222,6 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
> >  	 */
> >  	points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
> >  		mm_pgtables_bytes(p->mm) / PAGE_SIZE;
> > -	task_unlock(p);
> >
> >  	/*
> >  	 * Root processes get 3% bonus, just like the __vm_enough_memory()
> > @@ -230,6 +229,7 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
> >  	 */
> >  	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
> >  		points -= (points * 3) / 100;
> > +	task_unlock(p);
>
> Earlier I thought of posting the same change, but it may create a
> problem if there are sleeping calls in has_capability_noaudit().
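has_capability_noaudit() does not sleep. See what has_ns_capability_noaudit() is
doing: it performs the check between rcu_read_lock() and rcu_read_unlock(),
and sleeping is not allowed inside an RCU read-side critical section.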
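
As a rough user-space sketch of why holding the lock across the capability
check closes the window: the exit path has to take the lock to clear ->mm and
drops the credentials only afterwards, so a reader that saw ->mm != NULL under
the lock can still dereference the credentials until it unlocks. This is not
kernel code. The model_* names are hypothetical, and a pthread mutex stands in
for p->alloc_lock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct model_mm { unsigned long rss_pages; };
struct model_cred { bool cap_sys_admin; };

struct model_task {
	pthread_mutex_t alloc_lock;	/* stands in for p->alloc_lock */
	struct model_mm *mm;		/* cleared on exit, like exit_mm() */
	const struct model_cred *cred;	/* dropped after mm is cleared */
};

/* Returns 0 when the task has already dropped its mm. */
unsigned long model_badness(struct model_task *p)
{
	unsigned long points;

	pthread_mutex_lock(&p->alloc_lock);
	if (!p->mm) {
		pthread_mutex_unlock(&p->alloc_lock);
		return 0;
	}
	points = p->mm->rss_pages;
	/* The credential check runs before the unlock, as in the diff. */
	if (p->cred->cap_sys_admin)
		points -= (points * 3) / 100;
	pthread_mutex_unlock(&p->alloc_lock);
	return points;
}

/* Exit path: mm is cleared under the lock, credentials go away later. */
void model_exit(struct model_task *p)
{
	pthread_mutex_lock(&p->alloc_lock);
	p->mm = NULL;
	pthread_mutex_unlock(&p->alloc_lock);
	p->cred = NULL;			/* models exit_creds() */
}
```

With the unlock placed before the credential check instead, model_exit()
could clear ->cred between the unlock and the check, which is the race
discussed above.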
>
> >
> >  	/* Normalize to oom_score_adj units */
> >  	adj *= totalpages / 1000;
> >
> --
> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project.
>
>