linux-kernel - Re: [PATCH] mm,oom_reaper: avoid run queue_oom

Message-ID: <ZV8SenfRYnkKwqu6@tiehlicka>
Date:   Thu, 23 Nov 2023 09:51:06 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     gaoxu <gaoxu2@...onor.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        yipengxiang <yipengxiang@...onor.com>
Subject: Re: [PATCH] mm,oom_reaper: avoid run queue_oom_reaper if task is not
 oom

On Wed 22-11-23 12:46:44, gaoxu wrote:
> The function queue_oom_reaper tests and sets tsk->signal->oom_mm->flags.
> However, it is necessary to check if 'tsk' is an OOM victim before
> executing 'queue_oom_reaper' because the variable may be NULL.
> 
> We encountered such an issue, and the log is as follows:
> [3701:11_see]Out of memory: Killed process 3154 (system_server)
> total-vm:23662044kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB,
> UID:1000 pgtables:4056kB oom_score_adj:-900

> [3701:11_see][RB/E]rb_sreason_str_set: sreason_str set null_pointer
> [3701:11_see][RB/E]rb_sreason_str_set: sreason_str set unknown_addr

What are these?

> [3701:11_see]Unable to handle kernel NULL pointer dereference at virtual
> address 0000000000000328
> [3701:11_see]user pgtable: 4k pages, 39-bit VAs, pgdp=00000000821de000
> [3701:11_see][0000000000000328] pgd=0000000000000000,
> p4d=0000000000000000,pud=0000000000000000
> [3701:11_see]tracing off
> [3701:11_see]Internal error: Oops: 96000005 [#1] PREEMPT SMP
> [3701:11_see]Call trace:
> [3701:11_see] queue_oom_reaper+0x30/0x170

Could you resolve this offset into the code line please?

> [3701:11_see] __oom_kill_process+0x590/0x860
> [3701:11_see] oom_kill_process+0x140/0x274
> [3701:11_see] out_of_memory+0x2f4/0x54c
> [3701:11_see] __alloc_pages_slowpath+0x5d8/0xaac
> [3701:11_see] __alloc_pages+0x774/0x800
> [3701:11_see] wp_page_copy+0xc4/0x116c
> [3701:11_see] do_wp_page+0x4bc/0x6fc
> [3701:11_see] handle_pte_fault+0x98/0x2a8
> [3701:11_see] __handle_mm_fault+0x368/0x700
> [3701:11_see] do_handle_mm_fault+0x160/0x2cc
> [3701:11_see] do_page_fault+0x3e0/0x818
> [3701:11_see] do_mem_abort+0x68/0x17c
> [3701:11_see] el0_da+0x3c/0xa0
> [3701:11_see] el0t_64_sync_handler+0xc4/0xec
> [3701:11_see] el0t_64_sync+0x1b4/0x1b8
> [3701:11_see]tracing off
> 
> Signed-off-by: Gao Xu <gaoxu2@...onor.com>
> ---
>  mm/oom_kill.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 9e6071fde..3754ab4b6 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -984,7 +984,7 @@ static void __oom_kill_process(struct task_struct *victim, const char *message)
>  	}
>  	rcu_read_unlock();
>  
> -	if (can_oom_reap)
> +	if (can_oom_reap && tsk_is_oom_victim(victim))
>  		queue_oom_reaper(victim);

I do not understand. We always send SIGKILL and call
mark_oom_victim(victim) on the victim task before reaching this point. How can
tsk_is_oom_victim() ever be false?
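
To illustrate the invariant behind this question, here is a minimal
user-space sketch. It is not the kernel source: the struct layouts are
stripped-down stand-ins, and the helper bodies only approximate what
mark_oom_victim() and tsk_is_oom_victim() do (the real mark_oom_victim()
does more, and sets oom_mm atomically). The assumption being modelled is
that once a task has been marked, signal->oom_mm is non-NULL, so the
extra check in the patch would always pass.

```c
/* Simplified user-space model of the OOM-victim invariant.
 * Assumption: these types and bodies are illustrative stand-ins,
 * not copies of the kernel definitions. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mm_struct { unsigned long flags; };
struct signal_struct { struct mm_struct *oom_mm; };
struct task_struct { struct signal_struct *signal; };

/* Models mark_oom_victim(): record the victim's mm in signal->oom_mm,
 * only the first time the task is marked. */
void mark_oom_victim(struct task_struct *tsk, struct mm_struct *mm)
{
	assert(mm != NULL);	/* the kill path holds a valid mm here */
	if (tsk->signal->oom_mm == NULL)
		tsk->signal->oom_mm = mm;
}

/* Models tsk_is_oom_victim(): a task is a victim iff oom_mm is set. */
bool tsk_is_oom_victim(const struct task_struct *tsk)
{
	return tsk->signal->oom_mm != NULL;
}

/* Models the tail of the kill path: mark first, then decide whether
 * the reaper would be queued. Returns true if it would be. */
bool would_queue_oom_reaper(struct task_struct *victim,
			    struct mm_struct *mm, bool can_oom_reap)
{
	mark_oom_victim(victim, mm);
	return can_oom_reap && tsk_is_oom_victim(victim);
}
```

In this model, would_queue_oom_reaper() returns the same value as
can_oom_reap for any task that went through mark_oom_victim(), which is
why the added condition looks redundant unless something clears oom_mm
in between.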

>  
>  	mmdrop(mm);
> -- 
> 2.17.1
> 
> 

-- 
Michal Hocko
SUSE Labs