linux-kernel - Re: [PATCH] mm: oom: Fix race condition between oom_badness and do


Date:   Fri, 9 Mar 2018 17:34:39 +0530
From:   "Kohli, Gaurav" <gkohli@...eaurora.org>
To:     Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        rientjes@...gle.com
Cc:     akpm@...ux-foundation.org, mhocko@...e.com,
        kirill.shutemov@...ux.intel.com, aarcange@...hat.com,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        linux-arm-msm@...r.kernel.org
Subject: Re: [PATCH] mm: oom: Fix race condition between oom_badness and
 do_exit of task

On 3/9/2018 4:18 PM, Tetsuo Handa wrote:

> Kohli, Gaurav wrote:
>>> t->alloc_lock is still held when leaving find_lock_task_mm(), which means
>>> that t->mm != NULL. But nothing prevents t from setting t->mm = NULL at
>>> exit_mm() from do_exit() and calling exit_creds() from __put_task_struct(t)
>>> after task_unlock(t) is called. Seems difficult to trigger race window. Maybe
>>> something has preempted because oom_badness() becomes outside of RCU grace
>>> period upon leaving find_lock_task_mm() when called from proc_oom_score().
>> Hi Tetsuo,
>>
>> Yes, it is not easy to reproduce (seen only twice so far), and I agree
>> with your analysis. But David is already fixing this in a different way,
>> and that approach also looks better to me:
>>
>> https://patchwork.kernel.org/patch/10265641/
>>
> Yes, I'm aware of that patch.
>
>> But if we need to keep that code, then we have to bump up the task
>> reference count; that is the only fix I can think of for now.
> I don't think so, for I think it is safe to call
> has_capability_noaudit(p) with p->alloc_lock held.
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index f2e7dfb..4efcfb8 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -222,7 +222,6 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
>   	 */
>   	points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
>   		mm_pgtables_bytes(p->mm) / PAGE_SIZE;
> -	task_unlock(p);
>   
>   	/*
>   	 * Root processes get 3% bonus, just like the __vm_enough_memory()
> @@ -230,6 +229,7 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
>   	 */
>   	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
>   		points -= (points * 3) / 100;
> +	task_unlock(p);

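Earlier I had also thought of posting this same change, but could it
create a problem if has_capability_noaudit() makes any sleeping calls?
task_lock() takes p->alloc_lock, which is a spinlock, so nothing may
sleep while it is held.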

>   
>   	/* Normalize to oom_score_adj units */
>   	adj *= totalpages / 1000;
>
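
To make the ordering in the diff above concrete, here is a minimal user-space sketch, not kernel code. Everything named fake_* is a hypothetical stand-in that I made up for illustration: fake_task_lock() models task_lock() with a C11 atomic_flag spinlock, and fake_has_admin() models has_capability_noaudit(p, CAP_SYS_ADMIN). The sketch assumes the capability check never sleeps, which is exactly the open question for the real function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the kernel's types. */
struct fake_mm {
	unsigned long rss_pages;	/* models the page total summed in oom_badness() */
};

struct fake_task {
	atomic_flag alloc_lock;		/* models p->alloc_lock (a spinlock) */
	struct fake_mm *mm;		/* NULL once the task has dropped its mm */
	bool is_admin;			/* models the outcome of the capability check */
};

/* Busy-wait until the flag is acquired, like a spinlock: no sleeping. */
static void fake_task_lock(struct fake_task *p)
{
	while (atomic_flag_test_and_set_explicit(&p->alloc_lock,
						 memory_order_acquire))
		;
}

static void fake_task_unlock(struct fake_task *p)
{
	atomic_flag_clear_explicit(&p->alloc_lock, memory_order_release);
}

/* Assumed to be non-blocking, so it may be called with the lock held. */
static bool fake_has_admin(const struct fake_task *p)
{
	return p->is_admin;
}

/*
 * Reads the mm counters and applies the 3% bonus before unlocking,
 * mirroring the ordering proposed in the diff. Returns 0 if the mm
 * is already gone.
 */
static unsigned long fake_badness(struct fake_task *p)
{
	unsigned long points = 0;

	assert(p != NULL);
	fake_task_lock(p);
	if (p->mm) {
		points = p->mm->rss_pages;
		if (fake_has_admin(p))
			points -= (points * 3) / 100;
	}
	fake_task_unlock(p);	/* unlock only after the capability check */
	return points;
}
```

With rss_pages = 1000 and is_admin set, fake_badness() returns 970. If fake_has_admin() could block, the spinning lock holder would stall every other thread waiting on alloc_lock, which is the concern raised above.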
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.