lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2076817.1675434996@warthog.procyon.org.uk>
Date:   Fri, 03 Feb 2023 14:36:36 +0000
From:   David Howells <dhowells@...hat.com>
To:     John Hubbard <jhubbard@...dia.com>
Cc:     dhowells@...hat.com, David Hildenbrand <david@...hat.com>,
        syzbot <syzbot+a440341a59e3b7142895@...kaller.appspotmail.com>,
        hch@....de, linux-kernel@...r.kernel.org,
        syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] general protection fault in skb_dequeue (3)

I think I have managed to isolate the bug to the read side of sendfile() or
the pipe in the middle by the following:

In iter_file_splice_write(), I allocate a permanent page:

+	mutex_lock(&splice_tmp_lock);
+	if (!splice_tmp) {
+		pr_notice("alloc splice_tmp\n");
+		splice_tmp = alloc_page(GFP_USER);
+		if (splice_tmp) {
+			SetPageDebugMark(splice_tmp);
+			page_ref_add(splice_tmp, 100);
+		}
+	}
+	mutex_unlock(&splice_tmp_lock);
+	if (!splice_tmp)
+		return -ENOMEM;
+

and then stick it into the BVEC iter to be handed over to vfs_iter_write()
instead of buf->page:

-			array[n].bv_page = buf->page;
+			array[n].bv_page = splice_tmp;
+			trace_page_ref_set(splice_tmp, 887);
 			array[n].bv_len = this_len;
 			array[n].bv_offset = buf->offset;

that prevents vfs_iter_write() from ever seeing the pages from the pipe - but
the crash still happens even with this change.

One thing that does make me wonder is that I've made bio_release_pages()
always call __bio_release_pages() and made the latter dump all the pages in
the bio to the trace with val=999 - but some of the time, I don't see that
happening, so I'm wondering if there are some bio structs with pointers to
released pages floating around and not getting cleaned up.

I've added some event-specific tracepoints using the page_ref_set tracepoint,
using val=N as a key indicate the event:

val=888 -> Page seen in iter_file_splice_write() with PG_debug_mark set.
val=887 -> splice_tmp subbed for page
val=777 -> Page obtained from append_pipe() in iov_iter_extract_pipe_pages()
val=623 -> Page listed in bio_endio()
val=666 -> Page added by __bio_add_page()
val=98x -> xth page listed by __bio_release_pages()
val=-1  -> put_page()

Here's an excerpt from the trace when it looks right (note I removed
mapcount=0 from all lines):

page_ref_set: pfn=0x10b70b flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=777
page_ref_set: pfn=0x10b70b flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=666
page_ref_set: pfn=0x10b70b flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=623
page_ref_set: pfn=0x10b70b flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=980
page_ref_set: pfn=0x10b70b flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=888
page_ref_set: pfn=0x10642f flags=debug_mark count=101 mapping=0000000000000000 mt=0 val=887
page_ref_mod_and_test: pfn=0x116209 flags=uptodate|dirty|debug_mark count=2 mapping=00000000d367da24 mt=1 val=-1 ret=0
page_ref_mod_and_test: pfn=0x10b70b flags=debug_mark count=0 mapping=0000000000000000 mt=0 val=-1 ret=1
page_ref_set: pfn=0x116209 flags=locked|uptodate|debug_mark count=3 mapping=00000000d367da24 mt=1 val=666
page_ref_mod_and_test: pfn=0x116209 flags=uptodate|lru|writeback|debug_mark count=2 mapping=00000000d367da24 mt=1 val=-1 ret=0
page_ref_mod_and_test: pfn=0x116209 flags=uptodate|lru|writeback|debug_mark count=1 mapping=00000000d367da24 mt=1 val=-1 ret=0
page_ref_set: pfn=0x116209 flags=waiters|uptodate|lru|writeback|debug_mark count=2 mapping=00000000d367da24 mt=1 val=623
page_ref_mod_and_test: pfn=0x116209 flags=uptodate|lru|debug_mark count=2 mapping=00000000d367da24 mt=1 val=-1 ret=0
page_ref_mod_and_test: pfn=0x116209 flags=uptodate|lru|debug_mark count=1 mapping=00000000d367da24 mt=1 val=-1 ret=0

pfn=0x10b70b is a page pulled out of the pipe, pfn=0x10642f is splice_tmp (has
count=101) and pfn=0x116209 is presumably the page the data is to be written
to.

And when it looks off:

page_ref_set: pfn=0x106fca flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=777
page_ref_set: pfn=0x106fca flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=666
page_ref_set: pfn=0x106fca flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=888
page_ref_set: pfn=0x11e6bc flags=debug_mark count=101 mapping=0000000000000000 mt=0 val=887
page_ref_mod_and_test: pfn=0x11399a flags=uptodate|dirty|debug_mark count=2 mapping=000000003bd7df47 mt=1 val=-1 ret=0
page_ref_mod_and_test: pfn=0x106fca flags=debug_mark count=0 mapping=0000000000000000 mt=0 val=-1 ret=1
page_ref_set: pfn=0x11399a flags=locked|uptodate|debug_mark count=3 mapping=000000003bd7df47 mt=1 val=666
page_ref_mod_and_test: pfn=0x11399a flags=uptodate|lru|writeback|debug_mark count=2 mapping=000000003bd7df47 mt=1 val=-1 ret=0
page_ref_mod_and_test: pfn=0x11399a flags=uptodate|lru|writeback|debug_mark count=1 mapping=000000003bd7df47 mt=1 val=-1 ret=0
page_ref_mod_and_test: pfn=0x11399a flags=uptodate|lru|debug_mark count=1 mapping=000000003bd7df47 mt=1 val=-1 ret=0

pfn=0x106fca is the page from the pipe, pfn=0x11e6bc is splice_tmp and
pfn=0x11399a is the destination.

In the first trace, there's a val=980 line (page released from bio), but not
in the second trace.

David

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ