Message-ID: <877dsjessq.fsf@yhuang-dev.intel.com>
Date:   Thu, 24 Sep 2020 11:51:17 +0800
From:   "Huang\, Ying" <ying.huang@...el.com>
To:     Rafael Aquini <aquini@...hat.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        akpm@...ux-foundation.org
Subject: Re: [PATCH] mm: swapfile: avoid split_swap_cluster() NULL pointer dereference

Rafael Aquini <aquini@...hat.com> writes:
> The bug here is quite simple: split_swap_cluster() misses checking for
> lock_cluster() returning NULL before committing to change cluster_info->flags.

I don't think so.  We shouldn't run into this situation in the first
place.  So the "fix" hides the real bug instead of fixing it.  In the
same way, we call VM_BUG_ON_PAGE(!PageLocked(head), head) in
split_huge_page_to_list() instead of silently returning if
!PageLocked(head).
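
To illustrate the distinction, here is a minimal userspace sketch, not
kernel code.  All names in it (struct cluster, find_cluster(),
split_silent(), split_checked(), CLUSTER_FLAG_HUGE) are hypothetical
stand-ins, and assert() only plays the role that VM_BUG_ON_PAGE() plays
in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-cluster metadata; not the kernel struct. */
struct cluster {
    unsigned int flags;
};

#define CLUSTER_FLAG_HUGE 0x1u  /* illustrative flag value only */

/* Hypothetical stand-in for lock_cluster(): NULL when no cluster array exists. */
static struct cluster *find_cluster(struct cluster *info, size_t idx)
{
    return info ? &info[idx] : NULL;
}

/*
 * Variant A: tolerate NULL silently.  The caller cannot tell that it
 * reached this function in a state that should have been impossible.
 */
static int split_silent(struct cluster *info, size_t idx)
{
    struct cluster *ci = find_cluster(info, idx);

    if (!ci)
        return 0;
    ci->flags &= ~CLUSTER_FLAG_HUGE;
    return 0;
}

/*
 * Variant B: treat NULL as a broken invariant and stop loudly, which
 * points at the caller that should never have got here.
 */
static int split_checked(struct cluster *info, size_t idx)
{
    struct cluster *ci = find_cluster(info, idx);

    assert(ci != NULL);
    ci->flags &= ~CLUSTER_FLAG_HUGE;
    return 0;
}
```

With variant A, split_silent(NULL, 0) returns 0 and the unexpected
state goes unnoticed.  With variant B, the same call aborts at the
assertion, so the path that led there can be investigated.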

> The fundamental problem has nothing to do with allocating, or not allocating
> a swap cluster, but it has to do with the fact that the THP deferred split scan
> can transiently race with swapcache insertion, and the fact that when you run
> your swap area on rotational storage cluster_info is _always_ NULL.
> split_swap_cluster() needs to check for lock_cluster() returning NULL because
> that's one possible case, and it clearly fails to do so.

If there's a race, we should fix the race.  But the code path for
swapcache insertion is,

add_to_swap()
  get_swap_page() /* Return if fails to allocate */
  add_to_swap_cache()
    SetPageSwapCache()

While the code path to split THP is,

split_huge_page_to_list()
  if PageSwapCache()
    split_swap_cluster()

Both code paths are protected by the page lock.  So there must be some
other reason that triggers the bug.
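
As a rough model of that argument (a userspace sketch under stated
assumptions, not the kernel implementation): a pthread mutex stands in
for the page lock, a bool stands in for PageSwapCache(), and
struct model_page, model_add_to_swap() and model_split() are
hypothetical names.  Because the flag is only set after the allocation
succeeded, and both paths take the same lock, the split path cannot
observe the flag without a valid swap entry.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page: one lock, one flag, one swap entry. */
struct model_page {
    pthread_mutex_t lock;   /* plays the role of the page lock */
    bool in_swapcache;      /* plays the role of PageSwapCache() */
    long swap_entry;        /* -1 means "no swap slot allocated" */
};

/* Models add_to_swap(): the flag is set only after allocation succeeded. */
static bool model_add_to_swap(struct model_page *p, long entry)
{
    bool ok = false;

    pthread_mutex_lock(&p->lock);
    if (entry >= 0) {               /* allocation succeeded */
        p->swap_entry = entry;
        p->in_swapcache = true;
        ok = true;
    }                               /* else: return without setting the flag */
    pthread_mutex_unlock(&p->lock);
    return ok;
}

/* Models the split path: true if it would go on to split the swap cluster. */
static bool model_split(struct model_page *p)
{
    bool would_split_cluster = false;

    pthread_mutex_lock(&p->lock);
    if (p->in_swapcache) {
        /* Invariant under the lock: flag set implies a valid entry. */
        assert(p->swap_entry >= 0);
        would_split_cluster = true;
    }
    pthread_mutex_unlock(&p->lock);
    return would_split_cluster;
}
```

In this model, a failed allocation (entry < 0) leaves the flag clear, so
a later model_split() does nothing; only a successful
model_add_to_swap() makes model_split() take the swap-cluster branch.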

And again, for HDD, a THP shouldn't have PageSwapCache() set in the
first place.  If it does, the bug is that the flag gets set, and we
should fix the code that sets it.

> Run a workload that cause multiple THP COW, and add a memory hogger to create
> memory pressure so you'll force the reclaimers to kick the registered
> shrinkers. The trigger is not heavy swapping, and that's probably why
> most swap test cases don't hit it. The window is tight, but you will get the
> NULL pointer dereference.

Do you have a script to reproduce the bug?

> Regardless you find further bugs, or not, this patch is needed to correct a
> blunt coding mistake.

As above.  I don't agree with that.

Best Regards,
Huang, Ying
