linux-kernel - Re: [PATCH v2] mm/oom_kill: count global and memory cgroup oom kills

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <b132f9d1-8898-5301-b7e5-1b3d622e4993@yandex-team.ru>
Date:   Wed, 13 Sep 2017 10:35:22 +0300
From:   Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
To:     PINTU KUMAR <pintu_agarwal@...oo.com>,
        Michal Hocko <mhocko@...nel.org>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Roman Guschin <guroan@...il.com>,
        David Rientjes <rientjes@...gle.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH v2] mm/oom_kill: count global and memory cgroup oom kills

On 13.09.2017 07:51, PINTU KUMAR wrote:
> 
> 
> Hi,
> 
> I have submitted a similar patch 2 years ago (Oct/2015).
> But at that time the patch was rejected.
> Here is the history:
> https://lkml.org/lkml/2015/10/1/372
> 
> Now I see the similar patch got accepted. At least the initial idea and the objective were same.
> Even I were not included here.
> On one side I feel happy that my initial idea got accepted now.
> But on the other side it really hurts :(
> 

If this makes you feel better: mine version also fixes uncertainty in memory cgroup statistics.

> 
> Thanks,
> Pintu
> 
> 
> On Monday 5 June 2017, 7:57:57 PM IST, Konstantin Khlebnikov <khlebnikov@...dex-team.ru> wrote:
> 
> 
> On 05.06.2017 11:50, Michal Hocko wrote:
>  > On Thu 25-05-17 13:28:30, Konstantin Khlebnikov wrote:
>  >> Show count of oom killer invocations in /proc/vmstat and count of
>  >> processes killed in memory cgroup in knob "memory.events"
>  >> (in memory.oom_control for v1 cgroup).
>  >>
>  >> Also describe difference between "oom" and "oom_kill" in memory
>  >> cgroup documentation. Currently oom in memory cgroup kills tasks
>  >> iff shortage has happened inside page fault.
>  >>
>  >> These counters helps in monitoring oom kills - for now
>  >> the only way is grepping for magic words in kernel log.
>  >
>  > Yes this is less than optimal and the counter sounds like a good step
>  > forward. I have 2 comments to the patch though.
>  >
>  > [...]
>  >
>  >> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>  >> index 899949bbb2f9..42296f7001da 100644
>  >> --- a/include/linux/memcontrol.h
>  >> +++ b/include/linux/memcontrol.h
>  >> @@ -556,8 +556,11 @@ static inline void mem_cgroup_count_vm_event(struct mm_struct *mm,
>  >>
>  >>      rcu_read_lock();
>  >>      memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
>  >> -    if (likely(memcg))
>  >> +    if (likely(memcg)) {
>  >>          this_cpu_inc(memcg->stat->events[idx]);
>  >> +        if (idx == OOM_KILL)
>  >> +            cgroup_file_notify(&memcg->events_file);
>  >> +    }
>  >>      rcu_read_unlock();
>  >
>  > Well, this is ugly. I see how you want to share the global counter and
>  > the memcg event which needs the notification. But I cannot say this
>  > would be really easy to follow. Can we have at least a comment in
>  > memcg_event_item enum definition?
> 
> Yep, this is a little bit ugly.
> But this funciton is static-inline and idx always constant so resulting code is fine.
> 
>  >
>  >> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>  >> index 04c9143a8625..dd30a045ef5b 100644
>  >> --- a/mm/oom_kill.c
>  >> +++ b/mm/oom_kill.c
>  >> @@ -876,6 +876,11 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
>  >>      /* Get a reference to safely compare mm after task_unlock(victim) */
>  >>      mm = victim->mm;
>  >>      mmgrab(mm);
>  >> +
>  >> +    /* Raise event before sending signal: reaper must see this */
>  >> +    count_vm_event(OOM_KILL);
>  >> +    mem_cgroup_count_vm_event(mm, OOM_KILL);
>  >> +
>  >>      /*
>  >>      * We should send SIGKILL before setting TIF_MEMDIE in order to prevent
>  >>      * the OOM victim from depleting the memory reserves from the user
>  >
>  > Why don't you count tasks which share mm with the oom victim?
> 
> Yes, this makes sense. But these kills are not logged thus counter will differs from logged events.
> Also these tasks might live in different cgroups, so counting to mm owner isn't correct.
> 
> 
>  > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>  > index 0e2c925e7826..9a95947a60ba 100644
>  > --- a/mm/oom_kill.c
>  > +++ b/mm/oom_kill.c
>  > @@ -924,6 +924,8 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
>  >          */
>  >          if (unlikely(p->flags & PF_KTHREAD))
>  >              continue;
>  > +        count_vm_event(OOM_KILL);
>  > +        count_memcg_event_mm(mm, OOM_KILL);
>  >          do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
>  >      }
>  >      rcu_read_unlock();
>  >
>  > Other than that looks good to me.
>  > Acked-by: Michal Hocko <mhocko@...e.com>
>  >
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@...ck.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@...ck.org"> email@...ck.org </a>
>