Message-ID: <4660f164-b3e3-28a0-9898-718c5fa6b84d@I-love.SAKURA.ne.jp>
Date:   Sat, 4 Aug 2018 22:45:03 +0900
From:   Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To:     mhocko@...nel.org, David Rientjes <rientjes@...gle.com>
Cc:     syzbot <syzbot+bab151e82a4e973fa325@...kaller.appspotmail.com>,
        cgroups@...r.kernel.org, hannes@...xchg.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        syzkaller-bugs@...glegroups.com, vdavydov.dev@...il.com
Subject: Re: WARNING in try_charge

syzbot is hitting WARN(1) because mem_cgroup_out_of_memory() returned false.
At first I suspected that syzbot is hitting

  static bool oom_kill_memcg_victim(struct oom_control *oc)
  {
          if (oc->chosen_memcg == NULL || oc->chosen_memcg == INFLIGHT_VICTIM)
                  return oc->chosen_memcg;

case because

  /* We have one or more terminating processes at this point. */
  oc->chosen_task = INFLIGHT_VICTIM;

is not executed. But since that patch was dropped from next-20180803, syzbot
seems to be hitting a different race condition
( https://syzkaller.appspot.com/text?tag=CrashLog&x=12071654400000 ).

Therefore, the next culprit I suspect is

    mm, oom: remove oom_lock from oom_reaper

    oom_reaper used to rely on the oom_lock since e2fe14564d33 ("oom_reaper:
    close race with exiting task").  We do not really need the lock anymore
    though.  212925802454 ("mm: oom: let oom_reap_task and exit_mmap run
    concurrently") has removed serialization with the exit path based on the
    mm reference count and so we do not really rely on the oom_lock anymore.

    Tetsuo was arguing that at least MMF_OOM_SKIP should be set under the lock
    to prevent from races when the page allocator didn't manage to get the
    freed (reaped) memory in __alloc_pages_may_oom but it sees the flag later
    on and move on to another victim.  Although this is possible in principle
    let's wait for it to actually happen in real life before we make the
    locking more complex again.

    Therefore remove the oom_lock for oom_reaper paths (both exit_mmap and
    oom_reap_task_mm).  The reaper serializes with exit_mmap by mmap_sem +
    MMF_OOM_SKIP flag.  There is no synchronization with out_of_memory path
    now.

which is in next-20180803, and my "mm, oom: Fix unnecessary killing of additional processes."
( https://marc.info/?i=1533389386-3501-4-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp )
could mitigate it. Michal and David, please respond.
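
The race described in the quoted commit message can be illustrated with a
small deterministic user-space model. This is only a sketch under stated
assumptions, not kernel code: struct model, reaper_run(), try_alloc(),
oom_select() and extra_victims() are hypothetical names, the victim_skip field
merely stands in for MMF_OOM_SKIP, and the interleaving is hard-coded instead
of being produced by real threads.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model {
        int free_pages;        /* pages the allocator could take */
        bool victim_skip;      /* stands in for MMF_OOM_SKIP on the victim */
        int victims_selected;  /* additional victims chosen by the OOM path */
};

/* Reaper: frees the victim's memory, then marks the victim as skippable. */
static void reaper_run(struct model *m)
{
        m->free_pages += 1;
        m->victim_skip = true;
}

/* Allocator's last allocation attempt before it invokes the OOM killer. */
static bool try_alloc(struct model *m)
{
        if (m->free_pages > 0) {
                m->free_pages--;
                return true;
        }
        return false;
}

/* OOM path: a victim that is already marked is skipped for another one. */
static void oom_select(struct model *m)
{
        if (m->victim_skip) {
                m->victims_selected++;
                m->victim_skip = false;
        }
}

/*
 * Returns how many additional victims get selected.
 * reaper_serialized == false: the reaper may run between the failed
 * allocation attempt and victim selection.
 * reaper_serialized == true: the reaper cannot run inside that window;
 * this model lets it run before the allocation attempt.
 */
int extra_victims(bool reaper_serialized)
{
        struct model m = { .free_pages = 0, .victim_skip = false,
                           .victims_selected = 0 };

        if (reaper_serialized) {
                reaper_run(&m);
                if (!try_alloc(&m))
                        oom_select(&m);
        } else {
                bool ok = try_alloc(&m); /* fails: nothing reaped yet */
                reaper_run(&m);          /* slips into the window */
                if (!ok)
                        oom_select(&m);  /* sees the flag, picks another */
        }
        return m.victims_selected;
}
```

In this model, extra_victims(false) selects one more victim even though the
reaper has already freed memory, while extra_victims(true) selects none. It
only shows the ordering problem; whether this is what syzbot is actually
hitting is still the open question above.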
