lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <pndttci1xgd.fsf@axis.com>
Date: Thu, 7 Nov 2024 23:38:26 +0100
From: Waqar Hameed <waqar.hameed@...s.com>
To: Zhihao Cheng <chengzhihao1@...wei.com>
CC: Richard Weinberger <richard@....at>, Sascha Hauer
	<s.hauer@...gutronix.de>, <kernel@...s.com>, <linux-mtd@...ts.infradead.org>,
	<linux-kernel@...r.kernel.org>, Ryder Wang <rydercoding@...mail.com>
Subject: Re: [PATCH RFC] ubifs: Fix use-after-free in ubifs_tnc_end_commit

On Thu, Nov 07, 2024 at 16:39 +0800 Zhihao Cheng <chengzhihao1@...wei.com> wrote:

> 在 2024/10/9 22:46, Waqar Hameed 写道:
>> Running
>>    rm -f /etc/test-file.bin
>>    dd if=/dev/urandom of=/etc/test-file.bin bs=1M count=60 conv=fsync
>> in a loop, with `CONFIG_UBIFS_FS_AUTHENTICATION`, KASAN reports:
>>    BUG: KASAN: use-after-free in ubifs_tnc_end_commit+0xa5c/0x1950
>>    Write of size 32 at addr ffffff800a3af86c by task ubifs_bgt0_20/153
>>    Call trace:
>>     dump_backtrace+0x0/0x340
>>     show_stack+0x18/0x24
>>     dump_stack_lvl+0x9c/0xbc
>>     print_address_description.constprop.0+0x74/0x2b0
>>     kasan_report+0x1d8/0x1f0
>>     kasan_check_range+0xf8/0x1a0
>>     memcpy+0x84/0xf4
>>     ubifs_tnc_end_commit+0xa5c/0x1950
>>     do_commit+0x4e0/0x1340
>>     ubifs_bg_thread+0x234/0x2e0
>>     kthread+0x36c/0x410
>>     ret_from_fork+0x10/0x20
>>    Allocated by task 401:
>>     kasan_save_stack+0x38/0x70
>>     __kasan_kmalloc+0x8c/0xd0
>>     __kmalloc+0x34c/0x5bc
>>     tnc_insert+0x140/0x16a4
>>     ubifs_tnc_add+0x370/0x52c
>>     ubifs_jnl_write_data+0x5d8/0x870
>>     do_writepage+0x36c/0x510
>>     ubifs_writepage+0x190/0x4dc
>>     __writepage+0x58/0x154
>>     write_cache_pages+0x394/0x830
>>     do_writepages+0x1f0/0x5b0
>>     filemap_fdatawrite_wbc+0x170/0x25c
>>     file_write_and_wait_range+0x140/0x190
>>     ubifs_fsync+0xe8/0x290
>>     vfs_fsync_range+0xc0/0x1e4
>>     do_fsync+0x40/0x90
>>     __arm64_sys_fsync+0x34/0x50
>>     invoke_syscall.constprop.0+0xa8/0x260
>>     do_el0_svc+0xc8/0x1f0
>>     el0_svc+0x34/0x70
>>     el0t_64_sync_handler+0x108/0x114
>>     el0t_64_sync+0x1a4/0x1a8
>>    Freed by task 403:
>>     kasan_save_stack+0x38/0x70
>>     kasan_set_track+0x28/0x40
>>     kasan_set_free_info+0x28/0x4c
>>     __kasan_slab_free+0xd4/0x13c
>
> Hi Waqar, is that line 2639 ?
> 2540 sstatic int tnc_delete()
> 2541 {
> 2608         if (!znode->parent) {
> 2609                 while (znode->child_cnt == 1 && znode->level != 0) {
> 2634                         if (zp->cnext) {
> ...
> 2638                         } else
> 2639                                 kfree(zp);
> 2644 }
>

`faddr2line` doesn't work for that func+offset. It just complains with

  bad symbol size: sym_addr: 0xc0830f81 cur_sym_addr: 0xc0831464
  
The offset and size hints that it should be somewhere at the end of
`tnc_delete()`. I tried with `addr2line` and the addresses around that
offsets points to the `kfree()` on line 2639. So yes, it would say
that's the offending one.

>>     kfree+0xc4/0x3a0
>>     tnc_delete+0x3f4/0xe40
>>     ubifs_tnc_remove_range+0x368/0x73c
>>     ubifs_tnc_remove_ino+0x29c/0x2e0
>>     ubifs_jnl_delete_inode+0x150/0x260
>>     ubifs_evict_inode+0x1d4/0x2e4
>>     evict+0x1c8/0x450
>>     iput+0x2a0/0x3c4
>>     do_unlinkat+0x2cc/0x490
>>     __arm64_sys_unlinkat+0x90/0x100
>>     invoke_syscall.constprop.0+0xa8/0x260
>>     do_el0_svc+0xc8/0x1f0
>>     el0_svc+0x34/0x70
>>     el0t_64_sync_handler+0x108/0x114
>>     el0t_64_sync+0x1a4/0x1a8
>> 
>
> Looks like there is one possibility:
> 1. tnc_insert() triggers a TNC tree split:
>    zroot
>    /
>   z_p1
>   /
> zn
>   z_p1 is full, after inserting zn_new(key order is smaller that zn) under z_p1,
>  zn->parent is switched to z_p2, but zn->cparent is still z_p1:
>    zroot
>    /    \
>   z_p1  z_p2
>   /      \
> zn_new   zn
> 2. tnc_delete() removes all znodes except the 'zn':
>    zroot
>       \
>       z_p2
>         \
>         zn
>     TNC tree is collapsed, zroot and z_p2 are freed:
>     zroot'(zn)
> 3. get_znodes_to_commit() finds only one znode(zn, which is also zroot),
> zn->cparent is not updated and still points to z_p1(which was freed).
> 4. write_index() accesses the zn->cparent->zbranch, which triggers an UAF!
>
> Try following modification to verify whether the problem is fixed:
> diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c
> index a55e04822d16..7c43e0ccf6d4 100644
> --- a/fs/ubifs/tnc_commit.c
> +++ b/fs/ubifs/tnc_commit.c
> @@ -657,6 +657,8 @@ static int get_znodes_to_commit(struct ubifs_info *c)
>                 znode->alt = 0;
>                 cnext = find_next_dirty(znode);
>                 if (!cnext) {
> +                       ubifs_assert(c, !znode->parent);
> +                       znode->cparent = NULL;
>                         znode->cnext = c->cnext;
>                         break;
>                 }

Yup, I think this is it! Nice work!

I've started a test run with the following patch:

diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c
index a55e04822d16..4c8150d5ed65 100644
--- a/fs/ubifs/tnc_commit.c
+++ b/fs/ubifs/tnc_commit.c
@@ -657,6 +657,11 @@ static int get_znodes_to_commit(struct ubifs_info *c)
 		znode->alt = 0;
 		cnext = find_next_dirty(znode);
 		if (!cnext) {
+			ubifs_assert(c, !znode->parent);
+			if (znode->cparent) {
+				printk("%s:%d\n", __func__, __LINE__);
+			}
+			znode->cparent = NULL;
 			znode->cnext = c->cnext;
 			break;
 		}

and can already see that the print hit a couple of times in a few hours
(250 iterations). I'll let it spin for a little longer (recall that the
other patch "masked" the problem for almost 800 iterations).

This was quite intricate and I really enjoyed your little breakdown.
Thank you very much for the discussions/collaboration! I'll let you now
as soon as I have an updated test result (and thus send a new version).

P.S
I wonder how many systems that might have experienced this
use-after-free and got random memory corruptions (or other security
issues). This bug has been there since v4.20.
D.S

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ