lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <IA0PR11MB71857839BD4488371CAC4E9BF8C02@IA0PR11MB7185.namprd11.prod.outlook.com>
Date: Wed, 12 Jun 2024 05:55:20 +0000
From: "Kasireddy, Vivek" <vivek.kasireddy@...el.com>
To: Oscar Salvador <osalvador@...e.de>, Andrew Morton
	<akpm@...ux-foundation.org>
CC: syzbot <syzbot+569ed13f4054f271087b@...kaller.appspotmail.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>, "muchun.song@...ux.dev"
	<muchun.song@...ux.dev>, "syzkaller-bugs@...glegroups.com"
	<syzkaller-bugs@...glegroups.com>
Subject: RE: [syzbot] [mm?] general protection fault in
 dequeue_hugetlb_folio_nodemask (2)

Hi Oscar,

> 
> On Tue, Jun 11, 2024 at 07:52:06PM +0200, Oscar Salvador wrote:
> > On Tue, Jun 11, 2024 at 07:46:33PM +0200, Oscar Salvador wrote:
> > > On Tue, Jun 11, 2024 at 10:30:05AM -0700, Andrew Morton wrote:
> > > > On Tue, 11 Jun 2024 03:34:25 -0700 syzbot
> <syzbot+569ed13f4054f271087b@...kaller.appspotmail.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > syzbot found the following issue on:
> > > >
> > > > Thanks.
> > > >
> > > > > Call Trace:
> > > > >  <TASK>
> > > > >  alloc_hugetlb_folio_nodemask+0xae/0x3f0 mm/hugetlb.c:2603
> > > > >  memfd_alloc_folio+0x15e/0x390 mm/memfd.c:75
> > > > >  memfd_pin_folios+0x1066/0x1720 mm/gup.c:3864
> > > > >  udmabuf_create+0x658/0x11c0 drivers/dma-buf/udmabuf.c:353
> > > > >  udmabuf_ioctl_create drivers/dma-buf/udmabuf.c:420 [inline]
> > > > >  udmabuf_ioctl+0x304/0x4f0 drivers/dma-buf/udmabuf.c:451
> > > > >  vfs_ioctl fs/ioctl.c:51 [inline]
> > > > >  __do_sys_ioctl fs/ioctl.c:907 [inline]
> > > > >  __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
> > > > >  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > > > >  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> > > > >  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > > >
> > > > I think we can pretty confidently point at the series "mm/gup:
> > > > Introduce memfd_pin_folios() for pinning memfd folios".  I'll drop the
> > > > v14 series.
> > >
> > > jfyi: I am trying to reproduce this locally.
> >
> > Actually, should not memfd_alloc_folio() pass htlb_alloc_mask() instead
> > of GFP_USER to alloc_hugetlb_folio_nodemask? Or at least do
> > GFP_HIGHUSER.
> 
> Ok, I spot the issue.
> memfd_alloc_folio() was calling alloc_hugetlb_folio_nodemask with
> preferred_nid being NUMA_NO_NODE, but that is bad as
> dequeue_hugetlb_folio_nodemask will do:
> 
> zonelist = node_zonelist(nid, gfp_mask)
> 
> which will try to get node_zonelists from nid, but since nid is -1, heh.
> 
> The below patch fixes the issue for me, but I think that the right place
> to fix this up would be alloc_hugetlb_folio_nodemask(), so we can place
> the numa_node_id() if preferred_nid = NUMA_NO_NODE in there as a safety
> net.
> This way we catch this before exploding in case the user was not careful
> enough.
> 
> I will cook up a patch shortly.
Thank you for fixing this issue!

> 
> Another thing is why memfd_alloc_folio uses GFP_USER instead of
> GFP_HIGHUSER, but that maybe because I see that memfd_pin_folios() is
> used by some DMA driver which might not have access to HIGH_MEMORY.
Right, memfd_pin_folios() is used by udmabuf driver for DMA but the reason
why GFP_USER is chosen is because I was following this code in gup.c:
                struct migration_target_control mtc = {
                        .nid = NUMA_NO_NODE,
                        .gfp_mask = GFP_USER | __GFP_NOWARN,
                        .reason = MR_LONGTERM_PIN,
                };

                if (migrate_pages(movable_folio_list, alloc_migration_target,
                                  NULL, (unsigned long)&mtc, MIGRATE_SYNC,
                                  MR_LONGTERM_PIN, NULL)) {

where, alloc_migration_target() does the following to allocate a hugetlb folio:
        if (folio_test_hugetlb(src)) {
                struct hstate *h = folio_hstate(src);

                gfp_mask = htlb_modify_alloc_mask(h, gfp_mask);
                return alloc_hugetlb_folio_nodemask(h, nid,
                                                mtc->nmask, gfp_mask,
                                                htlb_allow_alloc_fallback(mtc->reason));

but I somehow missed the early check in alloc_migration_target() where it does:
        if (nid == NUMA_NO_NODE)
                nid = folio_nid(src);

Thanks,
Vivek

> 
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 8035c6325e3c..2692f0298adc 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -68,12 +68,13 @@ static void memfd_tag_pins(struct xa_state *xas)
>  struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
>  {
>  #ifdef CONFIG_HUGETLB_PAGE
> +	int nid = numa_node_id();
>  	struct folio *folio;
>  	int err;
> 
>  	if (is_file_hugepages(memfd)) {
>  		folio = alloc_hugetlb_folio_nodemask(hstate_file(memfd),
> -						     NUMA_NO_NODE,
> +						     nid,
>  						     NULL,
>  						     GFP_USER,
>  						     false);
> 
> >
> >
> > --
> > Oscar Salvador
> > SUSE Labs
> >
> 
> --
> Oscar Salvador
> SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ