linux-ext4 - Re: [PATCH] fscrypt: Copy the memcg information to the ciphertext page

Message-ID: <Y9a2m8uvmXmCVYvE@sol.localdomain>
Date:   Sun, 29 Jan 2023 10:10:35 -0800
From:   Eric Biggers <ebiggers@...nel.org>
To:     "Matthew Wilcox (Oracle)" <willy@...radead.org>
Cc:     "Theodore Y . Ts'o" <tytso@....edu>,
        Jaegeuk Kim <jaegeuk@...nel.org>,
        linux-fscrypt@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-ext4@...r.kernel.org, linux-f2fs-devel@...ts.sourceforge.net,
        stable@...r.kernel.org
Subject: Re: [PATCH] fscrypt: Copy the memcg information to the ciphertext
 page

On Sun, Jan 29, 2023 at 12:18:51PM +0000, Matthew Wilcox (Oracle) wrote:
> Both f2fs and ext4 end up passing the ciphertext page to
> wbc_account_cgroup_owner().  At the moment, the ciphertext page appears
> to belong to no cgroup, so it is accounted to the root_mem_cgroup instead
> of whatever cgroup the original page was in.
> 
> It's hard to say how far back this is a bug.  The crypto code shared
> between ext4 & f2fs was created in May 2015 with commit 0b81d0779072,
> but neither filesystem did anything with memcg_data before then.  memcg
> writeback accounting was added to ext4 in July 2015 in commit 001e4a8775f6
> and it wasn't added to f2fs until January 2018 (commit 578c647879f7).
> 
> I'm going with the ext4 commit since this is the first commit where
> there was a difference in behaviour between encrypted and unencrypted
> filesystems.
> 
> Fixes: 001e4a8775f6 ("ext4: implement cgroup writeback support")
> Cc: stable@...r.kernel.org
> Signed-off-by: Matthew Wilcox (Oracle) <willy@...radead.org>
> ---
>  fs/crypto/crypto.c | 3 +++
>  1 file changed, 3 insertions(+)

What is the actual effect of this bug?

The bounce pages are short-lived, so surely it doesn't really matter what memory
cgroup they get charged to?

I guess it's really more about the effect on cgroup writeback?  And that's also
the reason why this is a problem here but not e.g. in dm-crypt?

> diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> index e78be66bbf01..a4e76f96f291 100644
> --- a/fs/crypto/crypto.c
> +++ b/fs/crypto/crypto.c
> @@ -205,6 +205,9 @@ struct page *fscrypt_encrypt_pagecache_blocks(struct page *page,
>  	}
>  	SetPagePrivate(ciphertext_page);
>  	set_page_private(ciphertext_page, (unsigned long)page);
> +#ifdef CONFIG_MEMCG
> +	ciphertext_page->memcg_data = page->memcg_data;
> +#endif
>  	return ciphertext_page;
>  }

Nothing outside mm/ and include/linux/memcontrol.h does anything with memcg_data
directly.  Are you sure this is the right thing to do here?

Also, this patch causes the following:

[   16.192276] BUG: Bad page state in process kworker/u4:2  pfn:10798a
[   16.192919] page:00000000332f5565 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10798a
[   16.193848] memcg:ffff88810766c000
[   16.194186] flags: 0x200000000000000(node=0|zone=2)
[   16.194642] raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000
[   16.195356] raw: 0000000000000000 0000000000000000 00000000ffffffff ffff88810766c000
[   16.196061] page dumped because: page still charged to cgroup
[   16.196599] CPU: 0 PID: 33 Comm: kworker/u4:2 Tainted: G                T  6.2.0-rc5-00001-gf84eecbf5db1 #3
[   16.197494] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.1-1-1 04/01/2014
[   16.198343] Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
[   16.198899] Call Trace:
[   16.199143]  <TASK>
[   16.199350]  show_stack+0x47/0x56
[   16.199670]  dump_stack_lvl+0x55/0x72
[   16.200019]  dump_stack+0x14/0x18
[   16.200345]  bad_page.cold+0x5e/0x8a
[   16.200685]  free_page_is_bad_report+0x61/0x70
[   16.201111]  free_pcp_prepare+0x13f/0x290
[   16.201486]  free_unref_page+0x27/0x1f0
[   16.201848]  __free_pages+0xa0/0xc0
[   16.202186]  mempool_free_pages+0xd/0x20
[   16.202556]  mempool_free+0x28/0x90
[   16.202889]  fscrypt_free_bounce_page+0x26/0x40
[   16.203322]  ext4_finish_bio+0x1ed/0x240
[   16.203690]  ext4_release_io_end+0x4a/0x100
[   16.204088]  ext4_end_io_rsv_work+0xa8/0x1b0
[   16.204492]  process_one_work+0x27f/0x580
[   16.204874]  worker_thread+0x5a/0x3d0
[   16.205229]  ? process_one_work+0x580/0x580
[   16.205621]  kthread+0x102/0x130
[   16.205929]  ? kthread_exit+0x30/0x30
[   16.206280]  ret_from_fork+0x1f/0x30
[   16.206620]  </TASK>