lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 22 Aug 2019 07:47:40 -0700
From:   Edward Chron <echron@...sta.com>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     David Rientjes <rientjes@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Roman Gushchin <guro@...com>,
        Johannes Weiner <hannes@...xchg.org>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Shakeel Butt <shakeelb@...gle.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Ivan Delalande <colona@...sta.com>
Subject: Re: [PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

On Thu, Aug 22, 2019 at 12:15 AM Michal Hocko <mhocko@...nel.org> wrote:
>
> On Wed 21-08-19 15:22:07, Edward Chron wrote:
> > On Wed, Aug 21, 2019 at 12:19 AM David Rientjes <rientjes@...gle.com> wrote:
> > >
> > > On Wed, 21 Aug 2019, Michal Hocko wrote:
> > >
> > > > > vm.oom_dump_tasks is pretty useful, however, so it's curious why you
> > > > > haven't left it enabled :/
> > > >
> > > > Because it generates a lot of output potentially. Think of a workload
> > > > with too many tasks which is not uncommon.
> > >
> > > Probably better to always print all the info for the victim so we don't
> > > need to duplicate everything between dump_tasks() and dump_oom_summary().
> > >
> > > Edward, how about this?
> > >
> > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > > --- a/mm/oom_kill.c
> > > +++ b/mm/oom_kill.c
> > > @@ -420,11 +420,17 @@ static int dump_task(struct task_struct *p, void *arg)
> > >   * State information includes task's pid, uid, tgid, vm size, rss,
> > >   * pgtables_bytes, swapents, oom_score_adj value, and name.
> > >   */
> > > -static void dump_tasks(struct oom_control *oc)
> > > +static void dump_tasks(struct oom_control *oc, struct task_struct *victim)
> > >  {
> > >         pr_info("Tasks state (memory values in pages):\n");
> > >         pr_info("[  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name\n");
> > >
> > > +       /* If vm.oom_dump_tasks is disabled, only show the victim */
> > > +       if (!sysctl_oom_dump_tasks) {
> > > +               dump_task(victim, oc);
> > > +               return;
> > > +       }
> > > +
> > >         if (is_memcg_oom(oc))
> > >                 mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
> > >         else {
> > > @@ -465,8 +471,8 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
> > >                 if (is_dump_unreclaim_slabs())
> > >                         dump_unreclaimable_slab();
> > >         }
> > > -       if (sysctl_oom_dump_tasks)
> > > -               dump_tasks(oc);
> > > +       if (p || sysctl_oom_dump_tasks)
> > > +               dump_tasks(oc, p);
> > >         if (p)
> > >                 dump_oom_summary(oc, p);
> > >  }
> >
> > I would be willing to accept this, though as Michal mentions in his
> > post, it would be very helpful to have the oom_score_adj on the Killed
> > process message.
> >
> > One reason for that is that the Killed process message is the one
> > message that is printed with error priority (pr_err)
> > and so that message can be filtered out and sent to notify support
> > that an OOM event occurred.
> > Putting any information that can be shared in that message is useful
> > from my experience as it the initial point of triage for an OOM event.
> > Even if the full log with per user process is available it the
> > starting point for triage for an OOM event.
> >
> > So from my perspective I would be happy having both, with David's
> > proposal providing a bit of extra information as shown here:
> >
> > Jul 21 20:07:48 linuxserver kernel: [  pid  ]   uid  tgid total_vm
> >  rss pgtables_bytes swapents oom_score_adj name
> > Jul 21 20:07:48 linuxserver kernel: [    547]     0   547    31664
> > 615             299008              0                       0
> > systemd-journal
> >
> > The OOM Killed process message will print as:
> >
> > Jul 21 20:07:48 linuxserver kernel: Out of memory: Killed process 2826
> > (oomprocs) total-vm:1056800kB, anon-rss:1052784kB, file-rss:4kB,
> > shmem-rss:0kB oom_score_adj:1000
> >
> > But if only one one output change is allowed I'd favor the Killed
> > process message since that can be singled due to it's print priority
> > and forwarded.
> >
> > By the way, right now there is redundancy in that the Killed process
> > message is printing vm, rss even if vm.oom_dump_tasks is enabled.
> > I don't see why that is a big deal.
>
> There will always be redundancy there because dump_tasks part is there
> mostly to check the oom victim decision for potential wrong/unexpected
> selection. While "killed..." message is there to inform who has been
> killed. Most people really do care about that part only.
>
> > It is very useful to have all the information that is there.
> > Wouldn't mind also having pgtables too but we would be able to get
> > that from the output of dump_task if that is enabled.
>
> I am not against adding pgrable information there. That memory is going
> to be released when the task dies.

Oh Thank-you, will include that in updated patch as it useful information.

>
> > If it is acceptable to also add the dump_task for the killed process
> > for !sysctl_oom_dump_tasks I can repost the patch including that as
> > well.
>
> Well, I would rather focus on adding the missing pieces to the killed
> task message instead.
>

Will do.

> --
> Michal Hocko
> SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ