lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 22 Aug 2023 13:08:52 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     Tong Tiangen <tongtiangen@...wei.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Naoya Horiguchi <naoya.horiguchi@....com>,
        Miaohe Lin <linmiaohe@...wei.com>, wangkefeng.wang@...wei.com,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] mm: memory-failure: use rcu lock instead of
 tasklist_lock when collect_procs()

On Tue, Aug 22, 2023 at 11:41:41AM +0800, Tong Tiangen wrote:
> 在 2023/8/22 2:33, Matthew Wilcox 写道:
> > On Mon, Aug 21, 2023 at 05:13:12PM +0800, Tong Tiangen wrote:
> > > We can see that CPU1 waiting for CPU0 respond IPI,CPU0 waiting for CPU2
> > > unlock tasklist_lock, CPU2 waiting for CPU1 unlock page->ptl. As a result,
> > > softlockup is triggered.
> > > 
> > > For collect_procs_anon(), we will not modify the tasklist, but only perform
> > > read traversal. Therefore, we can use rcu lock instead of spin lock
> > > tasklist_lock, from this, we can break the softlock chain above.
> > 
> > The only thing that's giving me pause is that there's no discussion
> > about why this is safe.  "We're not modifying it" isn't really enough
> > to justify going from read_lock() to rcu_read_lock().  When you take a
> > normal read_lock(), writers are not permitted and so you see an atomic
> > snapshot of the list.  With rcu_read_lock() you can see inconsistencies.
> 
> Hi Matthew:
> 
> When rcu_read_lock() is used, the task list can be modified during the
> iteration, but cannot be seen during iteration. After the iteration is
> complete, the task list can be updated in the RCU mechanism. Therefore, the
> task list used by iteration can also be considered as a snapshot.

No, that's not true!  You are not iterating a snapshot of the list,
you're iterating the live list.  It will change under you.  RCU provides
you with some guarantees about that list.  See Documentation/RCU/listRCU.rst

> > For example, if new tasks can be added to the tasklist, they may not
> > be seen by an iteration.  Is this OK?
> 
> The newly added tasks does not access the HWPoison page, because the
> HWPoison page has been isolated from the
> buddy(memory_failure()->take_page_off_buddy()). Therefore, it is safe to see
> the newly added task during the iteration and not be seen by iteration.
> 
> Tasks may be removed from the
> > tasklist after they have been seen by the iteration.  Is this OK?
> 
> Task be seen during iteration are deleted from the task list after
> iteration, it's task_struct is not released because reference counting is
> added in __add_to_kill(). Therefore, the subsequent processing of
> kill_procs() is not affected (sending signals to the task deleted from task
> list). so i think it's safe too.

I don't know this code, but it seems unsafe to me.  Look:

collect_procs_anon:
        for_each_process(tsk) {
                struct task_struct *t = task_early_kill(tsk, force_early);
                        add_to_kill_anon_file(t, page, vma, to_kill);

add_to_kill_anon_file:
        __add_to_kill(tsk, p, vma, to_kill, 0, FSDAX_INVALID_PGOFF);

__add_to_kill:
        get_task_struct(tsk);

static inline struct task_struct *get_task_struct(struct task_struct *t)
{
        refcount_inc(&t->usage);
        return t;
}

/**
 * refcount_inc - increment a refcount
 * @r: the refcount to increment
 *
 * Similar to atomic_inc(), but will saturate at REFCOUNT_SATURATED and WARN.
 *
 * Provides no memory ordering, it is assumed the caller already has a
 * reference on the object.
 *
 * Will WARN if the refcount is 0, as this represents a possible use-after-free
 * condition.
 */

I don't see anything that prevents that refcount_inc from seeing a zero
refcount.  Usually that would be prevented by tasklist_lock, right?

Andrew, I think this patch is bad and needs to be dropped.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ