Date: Thu, 17 Feb 2022 17:06:45 -0800
From: John Hubbard <jhubbard@...dia.com>
To: Lee Jones <lee.jones@...aro.org>, linux-ext4@...r.kernel.org
Cc: Christoph Hellwig <hch@....de>, Dave Chinner <dchinner@...hat.com>,
    Goldwyn Rodrigues <rgoldwyn@...e.com>,
    "Darrick J . Wong" <darrick.wong@...cle.com>,
    Bob Peterson <rpeterso@...hat.com>,
    Damien Le Moal <damien.lemoal@....com>, Theodore Ts'o <tytso@....edu>,
    Andreas Gruenbacher <agruenba@...hat.com>,
    Ritesh Harjani <riteshh@...ux.ibm.com>,
    Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
    Johannes Thumshirn <jth@...nel.org>, linux-xfs@...r.kernel.org,
    linux-fsdevel@...r.kernel.org, cluster-devel@...hat.com,
    linux-kernel@...r.kernel.org
Subject: Re: [REPORT] kernel BUG at fs/ext4/inode.c:2620 - page_buffers()

On 2/16/22 08:31, Lee Jones wrote:
> Good afternoon,
...
> I managed to seemingly bisect the issue down to commit:
>
>   60263d5889e6d ("iomap: fall back to buffered writes for invalidation failures")
>
> Although it appears to be the belief of the Filesystem community that
> this is likely not the cause of the issue and should therefore not be
> reverted.

They are correct in this matter, imho. :)

...
> Darrick seems to suggest that:
>
> "The BUG report came from page_buffers failing to find any buffer heads
> attached to the page."

Yes. And looking at the pair of backtraces below, this looks very much
like another aspect of the "get_user_pages problem" [1], originally
described in Jan Kara's 2018 email [2].

I'm getting close to posting an RFC for the direct IO conversion to
FOLL_PIN, but even after that, various parts of the kernel (reclaim,
filesystems/block layer) still need to be changed so as to use
page_maybe_dma_pinned() to help avoid this problem. There's a bit
more than that, actually.
[1] https://lwn.net/Articles/753027/
[2] https://www.spinics.net/lists/linux-mm/msg142700.html

thanks,
-- 
John Hubbard
NVIDIA

>
> If the reproducer, also massively stripped down from the original
> report, would be of any use to you, it can be found further down at
> [2].
>
> I don't know how true this is, but it is my current belief that user-space
> should not be able to force the kernel to BUG. This seems to be a
> temporary DoS issue. So although not a critically serious security
> problem involving memory leakage or data corruption, it could
> potentially cause a nuisance if not rectified.
>
> Any well meaning help with this would be gratefully received.
>
> Kind regards,
> Lee
>
> [0] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsyzkaller.appspot.com%2Fbug%3Fextid%3D41c966bf0729a530bd8d&data=04%7C01%7Cjhubbard%40nvidia.com%7C107e857de4b940fbe7e708d9f169d4a8%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637806259852035011%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=5oD2%2BB9iOOSqSh3xwLLzR1vFxqyMtYNivJVQmepj2ww%3D&reserved=0
>
> [1]
> [ 15.200920] Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O!
> [ 15.215877] File: /syzkaller.IsS3Yc/0/bus PID: 1497 Comm: repro
> [ 16.718970] Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O!
> [ 16.734250] File: /syzkaller.IsS3Yc/5/bus PID: 1512 Comm: repro
> [ 17.013871] Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O!
> [ 17.028193] File: /syzkaller.IsS3Yc/6/bus PID: 1515 Comm: repro
> [ 17.320498] Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O!
> [ 17.336115] File: /syzkaller.IsS3Yc/7/bus PID: 1518 Comm: repro
> [ 17.617921] Page cache invalidation failure on direct I/O.
Possible data corruption due to collision with buffered I/O! > [ 17.633063] File: /syzkaller.IsS3Yc/8/bus PID: 1521 Comm: repro > [ 18.527260] Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O! > [ 18.544236] File: /syzkaller.IsS3Yc/11/bus PID: 1530 Comm: repro > [ 18.810347] Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O! > [ 18.824721] File: /syzkaller.IsS3Yc/12/bus PID: 1533 Comm: repro > [ 19.099315] Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O! > [ 19.114151] File: /syzkaller.IsS3Yc/13/bus PID: 1536 Comm: repro > [ 19.403882] Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O! > [ 19.418467] File: /syzkaller.IsS3Yc/14/bus PID: 1539 Comm: repro > [ 19.703934] Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O! 
> [ 19.718400] File: /syzkaller.IsS3Yc/15/bus PID: 1542 Comm: repro > [ 26.533129] ------------[ cut here ]------------ > [ 26.540473] WARNING: CPU: 1 PID: 1612 at fs/ext4/inode.c:3576 ext4_set_page_dirty+0xaf/0xc0 > [ 26.553171] Modules linked in: > [ 26.557354] CPU: 1 PID: 1612 Comm: repro Not tainted 5.16.0+ #169 > [ 26.565238] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 > [ 26.576182] RIP: 0010:ext4_set_page_dirty+0xaf/0xc0 > [ 26.583077] Code: 4c 89 ff e8 e3 86 e7 ff 49 f7 07 00 20 00 00 74 19 4c 89 ff 5b 41 5e 41 5f e9 8d 05 f0 ff 48 83 c0 ff 48 89 c3 e9 76 ff ff ff <0f> 0b eb e3 48 83 c0 ff 48 89 c3 eb 9e 0f 0b eb b8 55 48 89 e5 41 > [ 26.607402] RSP: 0018:ffff88810f4ffa10 EFLAGS: 00010246 > [ 26.614646] RAX: ffffea00043bc687 RBX: ffffea00043bc680 RCX: ffffffff9913f86d > [ 26.625115] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffea00043bc680 > [ 26.635137] RBP: 0000000000000400 R08: dffffc0000000000 R09: fffff940008778d1 > [ 26.644923] R10: fffff940008778d1 R11: 0000000000000000 R12: ffff88810e14c000 > [ 26.654807] R13: ffffea00043bc680 R14: ffffea00043bc688 R15: ffffea00043bc680 > [ 26.664812] FS: 00007f27c16d6640(0000) GS:ffff8883ef440000(0000) knlGS:0000000000000000 > [ 26.676238] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 26.684212] CR2: 000000000049b3a8 CR3: 000000010f7a6005 CR4: 0000000000370ee0 > [ 26.693896] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 26.703778] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 26.714238] Call Trace: > [ 26.717987] <TASK> > [ 26.721105] folio_mark_dirty+0x72/0xa0 > [ 26.726455] set_page_dirty_lock+0x4a/0x70 > [ 26.732426] unpin_user_pages_dirty_lock+0x101/0x1d0 > [ 26.739369] process_vm_rw_single_vec+0x2f4/0x3c0 > [ 26.745707] ? process_vm_rw+0x4d0/0x4d0 > [ 26.751454] ? mm_access+0xe1/0x120 > [ 26.756495] process_vm_rw+0x2fd/0x4d0 > [ 26.762431] ? __ia32_sys_process_vm_writev+0x80/0x80 > [ 26.769780] ? 
preempt_count_sub+0xf/0xc0 > [ 26.775021] ? folio_add_lru+0xea/0x110 > [ 26.780260] ? preempt_count_sub+0xf/0xc0 > [ 26.786062] ? _raw_spin_unlock+0x2e/0x50 > [ 26.791676] ? __handle_mm_fault+0x14a7/0x1970 > [ 26.797550] ? handle_mm_fault+0x1d0/0x1d0 > [ 26.802981] ? up_read+0x6f/0x180 > [ 26.807430] ? down_read_trylock+0x13f/0x190 > [ 26.813252] ? down_write_trylock+0x130/0x130 > [ 26.818935] ? handle_mm_fault+0x160/0x1d0 > [ 26.824454] ? do_kern_addr_fault+0x130/0x130 > [ 26.830695] __x64_sys_process_vm_writev+0x71/0x80 > [ 26.837270] do_syscall_64+0x43/0x90 > [ 26.842134] entry_SYSCALL_64_after_hwframe+0x44/0xae > [ 26.848994] RIP: 0033:0x44c849 > [ 26.853311] Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 > [ 26.877911] RSP: 002b:00007f27c16d6168 EFLAGS: 00000216 ORIG_RAX: 0000000000000137 > [ 26.887953] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000044c849 > [ 26.897571] RDX: 0000000000000001 RSI: 0000000020c22000 RDI: 0000000000000076 > [ 26.907068] RBP: 00007f27c16d61a0 R08: 0000000000000001 R09: 0000000000000000 > [ 26.916433] R10: 0000000020c22fa0 R11: 0000000000000216 R12: 00007ffd3062431e > [ 26.927113] R13: 00007ffd3062431f R14: 0000000000000000 R15: 00007f27c16d6640 > [ 26.936755] </TASK> > [ 26.939785] ---[ end trace 42b5bb79157828eb ]--- > [ 27.160243] ------------[ cut here ]------------ > [ 27.166572] kernel BUG at fs/ext4/inode.c:2620! 
> [ 27.173362] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI > [ 27.180459] CPU: 1 PID: 1616 Comm: repro Tainted: G W 5.16.0+ #169 > [ 27.190304] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 > [ 27.201112] RIP: 0010:mpage_prepare_extent_to_map+0x573/0x580 > [ 27.208692] Code: 08 14 00 00 00 00 65 48 8b 04 25 28 00 00 00 48 3b 84 24 40 01 00 00 75 15 89 d8 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 0f 0b e8 04 39 15 01 0f 1f 40 00 55 48 89 e5 41 57 41 56 41 > [ 27.232612] RSP: 0018:ffff88810f7e6b60 EFLAGS: 00010246 > [ 27.239348] RAX: ffffea00043b4fc7 RBX: 0000000000000067 RCX: ffffffff9913ea61 > [ 27.248720] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffea00043b4fc0 > [ 27.257899] RBP: ffff88810f7e6cf0 R08: dffffc0000000000 R09: fffff940008769f9 > [ 27.266846] R10: fffff940008769f9 R11: 0000000000000000 R12: 0000000000000000 > [ 27.276050] R13: ffff88810f7e6be0 R14: ffffea00043b4fc0 R15: ffff88810f7e6f58 > [ 27.285119] FS: 00007f27c16d6640(0000) GS:ffff8883ef440000(0000) knlGS:0000000000000000 > [ 27.295466] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 27.302935] CR2: 0000000020002002 CR3: 000000010cb70006 CR4: 0000000000370ee0 > [ 27.312837] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 27.322697] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 27.332451] Call Trace: > [ 27.336060] <TASK> > [ 27.339164] ? ext4_iomap_swap_activate+0x10/0x10 > [ 27.346171] ? preempt_count_sub+0xf/0xc0 > [ 27.352456] ? page_writeback_cpu_online+0x1f0/0x1f0 > [ 27.359598] ? ext4_init_io_end+0x18/0x90 > [ 27.365427] ? kmem_cache_alloc+0xf2/0x200 > [ 27.371638] ext4_writepages+0x823/0x1c50 > [ 27.377047] ? kernel_text_address+0xa8/0xc0 > [ 27.382587] ? unwind_get_return_address+0x25/0x40 > [ 27.388649] ? __rcu_read_unlock+0x8d/0x320 > [ 27.394134] ? __rcu_read_lock+0x20/0x20 > [ 27.399094] ? preempt_count_sub+0xf/0xc0 > [ 27.404220] ? ext4_readpage+0x110/0x110 > [ 27.409494] ? 
stack_trace_save+0x120/0x120 > [ 27.414838] ? __is_insn_slot_addr+0x58/0x60 > [ 27.420186] ? kernel_text_address+0xa8/0xc0 > [ 27.425531] ? __kernel_text_address+0x9/0x40 > [ 27.431355] ? unwind_get_return_address+0x25/0x40 > [ 27.437415] ? stack_trace_save+0xdb/0x120 > [ 27.442722] ? stack_trace_snprint+0xc0/0xc0 > [ 27.448310] do_writepages+0x20b/0x3a0 > [ 27.453245] ? __kasan_slab_alloc+0x43/0xb0 > [ 27.458532] ? filter_irq_stacks+0x3d/0x80 > [ 27.463792] ? __writepage+0xb0/0xb0 > [ 27.468366] ? __iomap_dio_rw+0x1c2/0xec0 > [ 27.473579] ? iomap_dio_rw+0x5/0x30 > [ 27.478152] ? ext4_file_write_iter+0x8a8/0xde0 > [ 27.484091] ? do_iter_readv_writev+0x2ce/0x360 > [ 27.490002] ? do_iter_write+0x109/0x370 > [ 27.495400] ? iter_file_splice_write+0x4b6/0x770 > [ 27.501658] ? direct_splice_actor+0x7b/0x90 > [ 27.507225] ? splice_direct_to_actor+0x309/0x570 > [ 27.513487] ? do_splice_direct+0x172/0x230 > [ 27.519352] ? do_sendfile+0x567/0x960 > [ 27.524605] ? __x64_sys_sendfile64+0x104/0x150 > [ 27.531035] ? do_syscall_64+0x43/0x90 > [ 27.536259] ? entry_SYSCALL_64_after_hwframe+0x44/0xae > [ 27.543606] ? do_syscall_64+0x43/0x90 > [ 27.548942] ? entry_SYSCALL_64_after_hwframe+0x44/0xae > [ 27.556281] ? _raw_spin_lock+0x120/0x120 > [ 27.561984] ? __ext4_handle_dirty_metadata+0x22d/0x510 > [ 27.569186] filemap_write_and_wait_range+0x200/0x230 > [ 27.576185] ? filemap_range_needs_writeback+0x400/0x400 > [ 27.583632] ? ext4_mark_iloc_dirty+0x66c/0x6b0 > [ 27.589986] ? kmem_cache_alloc_trace+0xe7/0x230 > [ 27.596373] ? __iomap_dio_rw+0x1c2/0xec0 > [ 27.601967] __iomap_dio_rw+0x525/0xec0 > [ 27.609616] ? jbd2_journal_stop+0x481/0x5b0 > [ 27.615606] ? iomap_dio_complete+0x2a0/0x2a0 > [ 27.622054] ? generic_update_time+0xde/0x130 > [ 27.628225] ? __mnt_drop_write_file+0xd/0x60 > [ 27.634306] ? file_update_time+0x1cd/0x210 > [ 27.640161] ? kernel_text_address+0xa8/0xc0 > [ 27.646114] ? 
file_remove_privs+0x2b0/0x2b0 > [ 27.652278] iomap_dio_rw+0x5/0x30 > [ 27.657149] ext4_file_write_iter+0x8a8/0xde0 > [ 27.663323] ? ext4_file_read_iter+0x1e0/0x1e0 > [ 27.669800] ? ____kasan_kmalloc+0xd1/0xf0 > [ 27.675676] ? direct_splice_actor+0x7b/0x90 > [ 27.681778] ? splice_direct_to_actor+0x309/0x570 > [ 27.688357] ? do_splice_direct+0x172/0x230 > [ 27.694256] ? do_sendfile+0x567/0x960 > [ 27.699605] ? __x64_sys_sendfile64+0x104/0x150 > [ 27.706129] ? entry_SYSCALL_64_after_hwframe+0x44/0xae > [ 27.713462] do_iter_readv_writev+0x2ce/0x360 > [ 27.719894] ? generic_file_rw_checks+0xd0/0xd0 > [ 27.726446] ? memcpy+0x3c/0x60 > [ 27.731234] ? security_file_permission+0x47/0x270 > [ 27.738026] do_iter_write+0x109/0x370 > [ 27.743417] iter_file_splice_write+0x4b6/0x770 > [ 27.749824] ? splice_from_pipe+0x170/0x170 > [ 27.755630] ? generic_file_splice_read+0x2d0/0x380 > [ 27.765041] ? splice_shrink_spd+0x40/0x40 > [ 27.770668] ? is_mmconf_reserved+0x240/0x240 > [ 27.776487] ? kcalloc+0x1b/0x20 > [ 27.780737] ? splice_from_pipe+0x170/0x170 > [ 27.786278] direct_splice_actor+0x7b/0x90 > [ 27.791885] splice_direct_to_actor+0x309/0x570 > [ 27.797680] ? do_splice_direct+0x230/0x230 > [ 27.803067] ? pipe_to_sendpage+0x1b0/0x1b0 > [ 27.808395] ? security_file_permission+0x47/0x270 > [ 27.814761] do_splice_direct+0x172/0x230 > [ 27.819842] ? splice_direct_to_actor+0x570/0x570 > [ 27.825936] ? security_file_permission+0x47/0x270 > [ 27.832354] do_sendfile+0x567/0x960 > [ 27.837426] ? do_pwritev+0x3d0/0x3d0 > [ 27.842258] ? __se_sys_futex+0x1b1/0x2c0 > [ 27.847619] ? restore_fpregs_from_fpstate+0xc4/0x190 > [ 27.854277] __x64_sys_sendfile64+0x104/0x150 > [ 27.859919] ? __ia32_sys_sendfile+0x170/0x170 > [ 27.865622] ? 
switch_fpu_return+0x97/0x120 > [ 27.871204] do_syscall_64+0x43/0x90 > [ 27.875661] entry_SYSCALL_64_after_hwframe+0x44/0xae > [ 27.882673] RIP: 0033:0x44c849 > [ 27.887077] Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 c1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 > [ 27.911882] RSP: 002b:00007f27c16d6178 EFLAGS: 00000203 ORIG_RAX: 0000000000000028 > [ 27.921306] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000044c849 > [ 27.930587] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000003 > [ 27.939877] RBP: 00007f27c16d61a0 R08: 0000000000000000 R09: 0000000000000000 > [ 27.949975] R10: 0000000080000005 R11: 0000000000000203 R12: 00007ffd3062431e > [ 27.959298] R13: 00007ffd3062431f R14: 0000000000000000 R15: 00007f27c16d6640 > [ 27.968616] </TASK> > [ 27.971668] Modules linked in: > [ 27.975922] ---[ end trace 42b5bb79157828ec ]--- > [ 27.982710] RIP: 0010:mpage_prepare_extent_to_map+0x573/0x580 > [ 27.990558] Code: 08 14 00 00 00 00 65 48 8b 04 25 28 00 00 00 48 3b 84 24 40 01 00 00 75 15 89 d8 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 0f 0b e8 04 39 15 01 0f 1f 40 00 55 48 89 e5 41 57 41 56 41 > [ 28.015009] RSP: 0018:ffff88810f7e6b60 EFLAGS: 00010246 > [ 28.021763] RAX: ffffea00043b4fc7 RBX: 0000000000000067 RCX: ffffffff9913ea61 > [ 28.031325] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffea00043b4fc0 > [ 28.040622] RBP: ffff88810f7e6cf0 R08: dffffc0000000000 R09: fffff940008769f9 > [ 28.050140] R10: fffff940008769f9 R11: 0000000000000000 R12: 0000000000000000 > [ 28.059313] R13: ffff88810f7e6be0 R14: ffffea00043b4fc0 R15: ffff88810f7e6f58 > [ 28.068502] FS: 00007f27c16d6640(0000) GS:ffff8883ef440000(0000) knlGS:0000000000000000 > [ 28.079454] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 28.087279] CR2: 0000000020002002 CR3: 000000010cb70006 CR4: 0000000000370ee0 > [ 28.096763] DR0: 0000000000000000 DR1: 
0000000000000000 DR2: 0000000000000000 > [ 28.105974] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > [2] > // https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsyzkaller.appspot.com%2Fbug%3Fid%3D906354c4596539d9561ee6cb6d8c54cda38fc3c2&data=04%7C01%7Cjhubbard%40nvidia.com%7C107e857de4b940fbe7e708d9f169d4a8%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637806259852035011%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C2000&sdata=WhaMEtC2yWJva1OO%2Fp0BIYRwADAlClWQPee7Trf9DpQ%3D&reserved=0 > // autogenerated by syzkaller (https://github.com/google/syzkaller) > > #define _GNU_SOURCE > > #include <arpa/inet.h> > #include <dirent.h> > #include <endian.h> > #include <errno.h> > #include <fcntl.h> > #include <net/if.h> > #include <net/if_arp.h> > #include <netinet/in.h> > #include <pthread.h> > #include <sched.h> > #include <signal.h> > #include <stdarg.h> > #include <stdbool.h> > #include <stdint.h> > #include <stdio.h> > #include <stdlib.h> > #include <string.h> > #include <sys/ioctl.h> > #include <sys/mount.h> > #include <sys/prctl.h> > #include <sys/resource.h> > #include <sys/socket.h> > #include <sys/stat.h> > #include <sys/syscall.h> > #include <sys/time.h> > #include <sys/types.h> > #include <sys/uio.h> > #include <sys/wait.h> > #include <time.h> > #include <unistd.h> > > #include <linux/capability.h> > #include <linux/futex.h> > #include <linux/genetlink.h> > #include <linux/if_addr.h> > #include <linux/if_ether.h> > #include <linux/if_link.h> > #include <linux/if_tun.h> > #include <linux/in6.h> > #include <linux/ip.h> > #include <linux/neighbour.h> > #include <linux/net.h> > #include <linux/netlink.h> > #include <linux/rtnetlink.h> > #include <linux/tcp.h> > #include <linux/veth.h> > > static void sleep_ms(uint64_t ms) > { > usleep(ms * 1000); > } > > static uint64_t current_time_ms(void) > { > struct timespec ts; > if (clock_gettime(CLOCK_MONOTONIC, &ts)) > exit(1); > return 
(uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000; > } > > static void use_temporary_dir(void) > { > char tmpdir_template[] = "./syzkaller.XXXXXX"; > char* tmpdir = mkdtemp(tmpdir_template); > if (!tmpdir) > exit(1); > if (chmod(tmpdir, 0777)) > exit(1); > if (chdir(tmpdir)) > exit(1); > } > > static void thread_start(void* (*fn)(void*), void* arg) > { > pthread_t th; > pthread_attr_t attr; > pthread_attr_init(&attr); > pthread_attr_setstacksize(&attr, 128 << 10); > int i = 0; > for (; i < 100; i++) { > if (pthread_create(&th, &attr, fn, arg) == 0) { > pthread_attr_destroy(&attr); > return; > } > if (errno == EAGAIN) { > usleep(50); > continue; > } > break; > } > exit(1); > } > > typedef struct { > int state; > } event_t; > > static void event_init(event_t* ev) > { > ev->state = 0; > } > > static void event_reset(event_t* ev) > { > ev->state = 0; > } > > static void event_set(event_t* ev) > { > if (ev->state) > exit(1); > __atomic_store_n(&ev->state, 1, __ATOMIC_RELEASE); > syscall(SYS_futex, &ev->state, FUTEX_WAKE | FUTEX_PRIVATE_FLAG, 1000000); > } > > static void event_wait(event_t* ev) > { > while (!__atomic_load_n(&ev->state, __ATOMIC_ACQUIRE)) > syscall(SYS_futex, &ev->state, FUTEX_WAIT | FUTEX_PRIVATE_FLAG, 0, 0); > } > > static int event_isset(event_t* ev) > { > return __atomic_load_n(&ev->state, __ATOMIC_ACQUIRE); > } > > static int event_timedwait(event_t* ev, uint64_t timeout) > { > uint64_t start = current_time_ms(); > uint64_t now = start; > for (;;) { > uint64_t remain = timeout - (now - start); > struct timespec ts; > ts.tv_sec = remain / 1000; > ts.tv_nsec = (remain % 1000) * 1000 * 1000; > syscall(SYS_futex, &ev->state, FUTEX_WAIT | FUTEX_PRIVATE_FLAG, 0, &ts); > if (__atomic_load_n(&ev->state, __ATOMIC_ACQUIRE)) > return 1; > now = current_time_ms(); > if (now - start > timeout) > return 0; > } > } > > #define IFLA_IPVLAN_FLAGS 2 > #define IPVLAN_MODE_L3S 2 > #undef IPVLAN_F_VEPA > #define IPVLAN_F_VEPA 2 > > #define TUN_IFACE "syz_tun" 
> #define LOCAL_MAC 0xaaaaaaaaaaaa > #define REMOTE_MAC 0xaaaaaaaaaabb > #define LOCAL_IPV4 "172.20.20.170" > #define REMOTE_IPV4 "172.20.20.187" > #define LOCAL_IPV6 "fe80::aa" > #define REMOTE_IPV6 "fe80::bb" > > #define IFF_NAPI 0x0010 > > #define DEVLINK_FAMILY_NAME "devlink" > > #define DEVLINK_CMD_PORT_GET 5 > #define DEVLINK_ATTR_BUS_NAME 1 > #define DEVLINK_ATTR_DEV_NAME 2 > #define DEVLINK_ATTR_NETDEV_NAME 7 > > #define DEV_IPV4 "172.20.20.%d" > #define DEV_IPV6 "fe80::%02x" > #define DEV_MAC 0x00aaaaaaaaaa > > #define WG_GENL_NAME "wireguard" > enum wg_cmd { > WG_CMD_GET_DEVICE, > WG_CMD_SET_DEVICE, > }; > enum wgdevice_attribute { > WGDEVICE_A_UNSPEC, > WGDEVICE_A_IFINDEX, > WGDEVICE_A_IFNAME, > WGDEVICE_A_PRIVATE_KEY, > WGDEVICE_A_PUBLIC_KEY, > WGDEVICE_A_FLAGS, > WGDEVICE_A_LISTEN_PORT, > WGDEVICE_A_FWMARK, > WGDEVICE_A_PEERS, > }; > enum wgpeer_attribute { > WGPEER_A_UNSPEC, > WGPEER_A_PUBLIC_KEY, > WGPEER_A_PRESHARED_KEY, > WGPEER_A_FLAGS, > WGPEER_A_ENDPOINT, > WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL, > WGPEER_A_LAST_HANDSHAKE_TIME, > WGPEER_A_RX_BYTES, > WGPEER_A_TX_BYTES, > WGPEER_A_ALLOWEDIPS, > WGPEER_A_PROTOCOL_VERSION, > }; > enum wgallowedip_attribute { > WGALLOWEDIP_A_UNSPEC, > WGALLOWEDIP_A_FAMILY, > WGALLOWEDIP_A_IPADDR, > WGALLOWEDIP_A_CIDR_MASK, > }; > > #define MAX_FDS 30 > > #define XT_TABLE_SIZE 1536 > #define XT_MAX_ENTRIES 10 > > struct xt_counters { > uint64_t pcnt, bcnt; > }; > > struct ipt_getinfo { > char name[32]; > unsigned int valid_hooks; > unsigned int hook_entry[5]; > unsigned int underflow[5]; > unsigned int num_entries; > unsigned int size; > }; > > struct ipt_get_entries { > char name[32]; > unsigned int size; > uint64_t entrytable[XT_TABLE_SIZE / sizeof(uint64_t)]; > }; > > struct ipt_replace { > char name[32]; > unsigned int valid_hooks; > unsigned int num_entries; > unsigned int size; > unsigned int hook_entry[5]; > unsigned int underflow[5]; > unsigned int num_counters; > struct xt_counters* counters; > uint64_t 
entrytable[XT_TABLE_SIZE / sizeof(uint64_t)]; > }; > > struct ipt_table_desc { > const char* name; > struct ipt_getinfo info; > struct ipt_replace replace; > }; > > #define IPT_BASE_CTL 64 > #define IPT_SO_SET_REPLACE (IPT_BASE_CTL) > #define IPT_SO_GET_INFO (IPT_BASE_CTL) > #define IPT_SO_GET_ENTRIES (IPT_BASE_CTL + 1) > > struct arpt_getinfo { > char name[32]; > unsigned int valid_hooks; > unsigned int hook_entry[3]; > unsigned int underflow[3]; > unsigned int num_entries; > unsigned int size; > }; > > struct arpt_get_entries { > char name[32]; > unsigned int size; > uint64_t entrytable[XT_TABLE_SIZE / sizeof(uint64_t)]; > }; > > struct arpt_replace { > char name[32]; > unsigned int valid_hooks; > unsigned int num_entries; > unsigned int size; > unsigned int hook_entry[3]; > unsigned int underflow[3]; > unsigned int num_counters; > struct xt_counters* counters; > uint64_t entrytable[XT_TABLE_SIZE / sizeof(uint64_t)]; > }; > > #define ARPT_BASE_CTL 96 > #define ARPT_SO_SET_REPLACE (ARPT_BASE_CTL) > #define ARPT_SO_GET_INFO (ARPT_BASE_CTL) > #define ARPT_SO_GET_ENTRIES (ARPT_BASE_CTL + 1) > > static void loop(); > > static int wait_for_loop(int pid) > { > if (pid < 0) > exit(1); > int status = 0; > while (waitpid(-1, &status, __WALL) != pid) { > } > return WEXITSTATUS(status); > } > > static int do_sandbox_none(void) > { > if (unshare(CLONE_NEWPID)) { > } > int pid = fork(); > if (pid != 0) > return wait_for_loop(pid); > > if (unshare(CLONE_NEWNET)) { > } > loop(); > exit(1); > } > > #define FS_IOC_SETFLAGS _IOW('f', 2, long) > static void remove_dir(const char* dir) > { > int iter = 0; > DIR* dp = 0; > retry: > while (umount2(dir, MNT_DETACH | UMOUNT_NOFOLLOW) == 0) { > } > dp = opendir(dir); > if (dp == NULL) { > if (errno == EMFILE) { > exit(1); > } > exit(1); > } > struct dirent* ep = 0; > while ((ep = readdir(dp))) { > if (strcmp(ep->d_name, ".") == 0 || strcmp(ep->d_name, "..") == 0) > continue; > char filename[FILENAME_MAX]; > snprintf(filename, 
sizeof(filename), "%s/%s", dir, ep->d_name); > while (umount2(filename, MNT_DETACH | UMOUNT_NOFOLLOW) == 0) { > } > struct stat st; > if (lstat(filename, &st)) > exit(1); > if (S_ISDIR(st.st_mode)) { > remove_dir(filename); > continue; > } > int i; > for (i = 0;; i++) { > if (unlink(filename) == 0) > break; > if (errno == EPERM) { > int fd = open(filename, O_RDONLY); > if (fd != -1) { > long flags = 0; > if (ioctl(fd, FS_IOC_SETFLAGS, &flags) == 0) { > } > close(fd); > continue; > } > } > if (errno == EROFS) { > break; > } > if (errno != EBUSY || i > 100) > exit(1); > if (umount2(filename, MNT_DETACH | UMOUNT_NOFOLLOW)) > exit(1); > } > } > closedir(dp); > for (int i = 0;; i++) { > if (rmdir(dir) == 0) > break; > if (i < 100) { > if (errno == EPERM) { > int fd = open(dir, O_RDONLY); > if (fd != -1) { > long flags = 0; > if (ioctl(fd, FS_IOC_SETFLAGS, &flags) == 0) { > } > close(fd); > continue; > } > } > if (errno == EROFS) { > break; > } > if (errno == EBUSY) { > if (umount2(dir, MNT_DETACH | UMOUNT_NOFOLLOW)) > exit(1); > continue; > } > if (errno == ENOTEMPTY) { > if (iter < 100) { > iter++; > goto retry; > } > } > } > exit(1); > } > } > > static void kill_and_wait(int pid, int* status) > { > kill(-pid, SIGKILL); > kill(pid, SIGKILL); > for (int i = 0; i < 100; i++) { > if (waitpid(-1, status, WNOHANG | __WALL) == pid) > return; > usleep(1000); > } > DIR* dir = opendir("/sys/fs/fuse/connections"); > if (dir) { > for (;;) { > struct dirent* ent = readdir(dir); > if (!ent) > break; > if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0) > continue; > char abort[300]; > snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort", > ent->d_name); > int fd = open(abort, O_WRONLY); > if (fd == -1) { > continue; > } > if (write(fd, abort, 1) < 0) { > } > close(fd); > } > closedir(dir); > } else { > } > while (waitpid(-1, status, __WALL) != pid) { > } > } > > static void close_fds() > { > for (int fd = 3; fd < MAX_FDS; fd++) > close(fd); > } > > 
struct thread_t { > int created, call; > event_t ready, done; > }; > > static struct thread_t threads[16]; > static void execute_call(int call); > static int running; > > static void* thr(void* arg) > { > struct thread_t* th = (struct thread_t*)arg; > for (;;) { > event_wait(&th->ready); > event_reset(&th->ready); > execute_call(th->call); > __atomic_fetch_sub(&running, 1, __ATOMIC_RELAXED); > event_set(&th->done); > } > return 0; > } > > static void execute_one(void) > { > int i, call, thread; > for (call = 0; call < 10; call++) { > for (thread = 0; thread < (int)(sizeof(threads) / sizeof(threads[0])); > thread++) { > struct thread_t* th = &threads[thread]; > if (!th->created) { > th->created = 1; > event_init(&th->ready); > event_init(&th->done); > event_set(&th->done); > thread_start(thr, th); > } > if (!event_isset(&th->done)) > continue; > event_reset(&th->done); > th->call = call; > __atomic_fetch_add(&running, 1, __ATOMIC_RELAXED); > event_set(&th->ready); > event_timedwait(&th->done, 50); > break; > } > } > for (i = 0; i < 100 && __atomic_load_n(&running, __ATOMIC_RELAXED); i++) > sleep_ms(1); > close_fds(); > } > > static void execute_one(void); > > #define WAIT_FLAGS __WALL > > static void loop(void) > { > int iter = 0; > for (;; iter++) { > char cwdbuf[32]; > sprintf(cwdbuf, "./%d", iter); > if (mkdir(cwdbuf, 0777)) > exit(1); > int pid = fork(); > if (pid < 0) > exit(1); > if (pid == 0) { > if (chdir(cwdbuf)) > exit(1); > execute_one(); > exit(0); > } > int status = 0; > uint64_t start = current_time_ms(); > for (;;) { > if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid) { > break; > } > sleep_ms(1); > if (current_time_ms() - start < 5000) > continue; > kill_and_wait(pid, &status); > break; > } > remove_dir(cwdbuf); > } > } > > uint64_t r[5] = {0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff, > 0xffffffffffffffff, 0x0}; > > void execute_call(int call) > { > intptr_t res = 0; > switch (call) { > case 0: > memcpy((void*)0x20000080, 
"./bus\000", 6); > res = syscall(__NR_open, 0x20000080ul, 0x14d842ul, 0ul); > if (res != -1) > r[0] = res; > break; > case 1: > memcpy((void*)0x20000000, "/proc/self/exe\000", 15); > res = syscall(__NR_openat, 0xffffff9c, 0x20000000ul, 0ul, 0ul); > if (res != -1) > r[1] = res; > break; > case 2: > memcpy((void*)0x20002000, "./bus\000", 6); > res = syscall(__NR_open, 0x20002000ul, 0x143042ul, 0ul); > if (res != -1) > r[2] = res; > break; > case 3: > syscall(__NR_ftruncate, r[2], 0x2008002ul); > break; > case 4: > memcpy((void*)0x20000400, "./bus\000", 6); > res = syscall(__NR_open, 0x20000400ul, 0x14103eul, 0ul); > if (res != -1) > r[3] = res; > break; > case 5: > syscall(__NR_mmap, 0x20000000ul, 0x600000ul, 0x7ffffeul, 0x11ul, r[3], 0ul); > break; > case 6: > syscall(__NR_sendfile, r[0], r[1], 0ul, 0x80000005ul); > break; > case 7: > res = syscall(__NR_gettid); > if (res != -1) > r[4] = res; > break; > case 8: > *(uint64_t*)0x20c22000 = 0x2034afa4; > *(uint64_t*)0x20c22008 = 0x1f80; > *(uint64_t*)0x20c22fa0 = 0x20000080; > *(uint64_t*)0x20c22fa8 = 0x2034afa5; > syscall(__NR_process_vm_writev, r[4], 0x20c22000ul, 1ul, 0x20c22fa0ul, 1ul, > 0ul); > break; > case 9: > syscall(__NR_sendfile, r[0], r[1], 0ul, 0x80000005ul); > break; > } > } > int main(void) > { > syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul); > syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul); > syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul); > > use_temporary_dir(); > do_sandbox_none(); > > return 0; > } >