Message-ID: <Y9a2m8uvmXmCVYvE@sol.localdomain>
Date: Sun, 29 Jan 2023 10:10:35 -0800
From: Eric Biggers <ebiggers@...nel.org>
To: "Matthew Wilcox (Oracle)" <willy@...radead.org>
Cc: "Theodore Y . Ts'o" <tytso@....edu>,
Jaegeuk Kim <jaegeuk@...nel.org>,
linux-fscrypt@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-ext4@...r.kernel.org, linux-f2fs-devel@...ts.sourceforge.net,
stable@...r.kernel.org
Subject: Re: [PATCH] fscrypt: Copy the memcg information to the ciphertext page
On Sun, Jan 29, 2023 at 12:18:51PM +0000, Matthew Wilcox (Oracle) wrote:
> Both f2fs and ext4 end up passing the ciphertext page to
> wbc_account_cgroup_owner(). At the moment, the ciphertext page appears
> to belong to no cgroup, so it is accounted to the root_mem_cgroup instead
> of whatever cgroup the original page was in.
>
> It's hard to say how far back this is a bug. The crypto code shared
> between ext4 & f2fs was created in May 2015 with commit 0b81d0779072,
> but neither filesystem did anything with memcg_data before then. memcg
> writeback accounting was added to ext4 in July 2015 in commit 001e4a8775f6
> and it wasn't added to f2fs until January 2018 (commit 578c647879f7).
>
> I'm going with the ext4 commit since this is the first commit where
> there was a difference in behaviour between encrypted and unencrypted
> filesystems.
>
> Fixes: 001e4a8775f6 ("ext4: implement cgroup writeback support")
> Cc: stable@...r.kernel.org
> Signed-off-by: Matthew Wilcox (Oracle) <willy@...radead.org>
> ---
> fs/crypto/crypto.c | 3 +++
> 1 file changed, 3 insertions(+)
What is the actual effect of this bug?

The bounce pages are short-lived, so surely it doesn't really matter what memory
cgroup they get charged to?

I guess it's really more about the effect on cgroup writeback? And that's also
the reason why this is a problem here but not e.g. in dm-crypt?

> diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> index e78be66bbf01..a4e76f96f291 100644
> --- a/fs/crypto/crypto.c
> +++ b/fs/crypto/crypto.c
> @@ -205,6 +205,9 @@ struct page *fscrypt_encrypt_pagecache_blocks(struct page *page,
>  	}
>  	SetPagePrivate(ciphertext_page);
>  	set_page_private(ciphertext_page, (unsigned long)page);
> +#ifdef CONFIG_MEMCG
> +	ciphertext_page->memcg_data = page->memcg_data;
> +#endif
>  	return ciphertext_page;
>  }

Nothing outside mm/ and include/linux/memcontrol.h does anything with memcg_data
directly. Are you sure this is the right thing to do here?

Also, this patch causes the following:

[ 16.192276] BUG: Bad page state in process kworker/u4:2 pfn:10798a
[ 16.192919] page:00000000332f5565 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10798a
[ 16.193848] memcg:ffff88810766c000
[ 16.194186] flags: 0x200000000000000(node=0|zone=2)
[ 16.194642] raw: 0200000000000000 0000000000000000 dead000000000122 0000000000000000
[ 16.195356] raw: 0000000000000000 0000000000000000 00000000ffffffff ffff88810766c000
[ 16.196061] page dumped because: page still charged to cgroup
[ 16.196599] CPU: 0 PID: 33 Comm: kworker/u4:2 Tainted: G T 6.2.0-rc5-00001-gf84eecbf5db1 #3
[ 16.197494] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.1-1-1 04/01/2014
[ 16.198343] Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
[ 16.198899] Call Trace:
[ 16.199143] <TASK>
[ 16.199350] show_stack+0x47/0x56
[ 16.199670] dump_stack_lvl+0x55/0x72
[ 16.200019] dump_stack+0x14/0x18
[ 16.200345] bad_page.cold+0x5e/0x8a
[ 16.200685] free_page_is_bad_report+0x61/0x70
[ 16.201111] free_pcp_prepare+0x13f/0x290
[ 16.201486] free_unref_page+0x27/0x1f0
[ 16.201848] __free_pages+0xa0/0xc0
[ 16.202186] mempool_free_pages+0xd/0x20
[ 16.202556] mempool_free+0x28/0x90
[ 16.202889] fscrypt_free_bounce_page+0x26/0x40
[ 16.203322] ext4_finish_bio+0x1ed/0x240
[ 16.203690] ext4_release_io_end+0x4a/0x100
[ 16.204088] ext4_end_io_rsv_work+0xa8/0x1b0
[ 16.204492] process_one_work+0x27f/0x580
[ 16.204874] worker_thread+0x5a/0x3d0
[ 16.205229] ? process_one_work+0x580/0x580
[ 16.205621] kthread+0x102/0x130
[ 16.205929] ? kthread_exit+0x30/0x30
[ 16.206280] ret_from_fork+0x1f/0x30
[ 16.206620] </TASK>
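
For readers unfamiliar with the check that fired above, here is a minimal user-space sketch of it. This is not kernel code: `model_page`, `model_free_page_is_bad`, `model_copy_memcg`, and `model_bounce_lifecycle` are hypothetical names invented for illustration. The model assumes only what the dump shows, namely that the free path rejects a page whose memcg field is still nonzero ("page still charged to cgroup"), and that in the path shown in the trace nothing clears the value the patch copied into the bounce page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; only the field under discussion. */
struct model_page {
	uint64_t memcg_data;	/* nonzero is treated as "charged to a cgroup" */
};

/* Models the free-time sanity check that produced the report above. */
static bool model_free_page_is_bad(const struct model_page *page)
{
	return page->memcg_data != 0;
}

/* Models the three added lines of the patch: a raw copy, no charge taken. */
static void model_copy_memcg(struct model_page *bounce,
			     const struct model_page *src)
{
	bounce->memcg_data = src->memcg_data;
}

/*
 * Walks one bounce page through "encrypt, write back, free" and returns
 * true if the free-time check would complain.  clear_before_free is an
 * assumption of this model, not something the patch does.
 */
static bool model_bounce_lifecycle(uint64_t owner_memcg, bool clear_before_free)
{
	assert(owner_memcg != 0);	/* the pagecache page is charged */

	struct model_page pagecache_page = { .memcg_data = owner_memcg };
	struct model_page bounce_page = { .memcg_data = 0 };

	model_copy_memcg(&bounce_page, &pagecache_page);

	if (clear_before_free)
		bounce_page.memcg_data = 0;

	return model_free_page_is_bad(&bounce_page);
}
```

In this model, `model_bounce_lifecycle(0xffff88810766c000ULL, false)` returns true, which corresponds to the report above, while passing `true` for `clear_before_free` returns false. Whether clearing the field by hand would be acceptable in the real kernel is a separate question that the model does not answer.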