[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6da7b976-2f95-4fa3-be1c-0a2b7ac29277@linux.alibaba.com>
Date: Wed, 7 Jan 2026 09:12:11 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Brian Foster <bfoster@...hat.com>
Cc: Barry Song <21cnbao@...il.com>, Matthew Wilcox <willy@...radead.org>,
akpm@...ux-foundation.org, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Hugh Dickins <hughd@...gle.com>,
syzbot+178fff6149127421c2cc@...kaller.appspotmail.com
Subject: Re: [PATCH] mm/shmem: fix uninitialized folio in shmem_symlink
On 1/6/26 11:56 PM, Brian Foster wrote:
> On Tue, Jan 06, 2026 at 11:47:44AM +0800, Baolin Wang wrote:
>>
>>
>> On 1/5/26 9:58 PM, Brian Foster wrote:
>>> On Thu, Dec 25, 2025 at 06:08:08PM +0800, Baolin Wang wrote:
>>>>
>>>>
>>>> On 2025/12/25 12:04, Barry Song wrote:
>>>>> On Thu, Dec 25, 2025 at 6:01 AM Matthew Wilcox <willy@...radead.org> wrote:
>>>>>>
>>>>>> On Wed, Dec 24, 2025 at 10:40:27PM +1300, Barry Song wrote:
>>>>>>> From: Barry Song <baohua@...nel.org>
>>>>>>>
>>>>>>> Uninitialized folio allocated in shmem_symlink() may be accessed
>>>>>>> during swap-out, causing KMSAN BUG:
>>>>>>
>>>>>> This would be an unfortunate way to fix it. The vast majority of
>>>>>> symlinks are short, and we'll never access past the \0 in normal
>>>>>> operation, so we'll be dirtying a lot of cachelines essentially to (1)
>>>>>> shut up an automated tool and (2) optimise a corner case.
>>>>>>
>>>>>> How about this instead which delays zeroing to swapout?
>>>>>
>>>>> Matthew, thank you very much for your review, even during Christmas.
>>>>> I would like to wish you a happy holiday!
>>>>>
>>>>> I am not quite sure, as shm symlinks do not seem very common. Since
>>>>> allocating a folio requires a symname longer than 128 bytes (where
>>>>> 128 == SHORT_SYMLINK_LEN), such cases appear even rarer.
>>>>>
>>>>> BTW, do we need to migrate the owner_2 flag in folio_migrate_flags()?
>>>>> If so, I am not quite sure it is worth changing the hotpath to
>>>>> accommodate this.
>>>>
>>>> +1. At least for me, using the 'PG_owner_2' flag alone to mark this uncommon
>>>> case doesn't seem quite worthwhile.
>>>>
>>>
>>> Also JFYI the post-eof swapout zeroing work (still pending) looks to me
>>> like it would cover the swapout time case [1]. That's just if you wanted
>>> to go that route here; creation time zeroing for the large symlink case
>>> seems reasonable enough to me as well.
>>
>> IMHO, your post-eof zeroing work is intended to zero !uptodate or beyond EOF
>> folios before swap-out, however, the symlink folio is uptodate and not
>> beyond the EOF. I am not sure if it will make your code more complicated
>> when you want to cover this symlink case.
>>
>
> It already handles this particular case. The primary intent is to zero
> all post-eof ranges that can be written out before that write out
> happens. I.e., if you take a look at shmem_writeout(), if the folio
> happens to straddle i_size we zero from i_size to the end of the folio.
Ah, yes. I’ve rechecked your patch set, and you’re right.
>> Why I prefer Barry's fix: First, the symlink folio is marked Uptodate after
>> copying the symlink name, but the whole folio hasn’t been initialized, which
>> seems unreasonable to me. Second, as I said before, using the 'PG_owner_2'
>> flag to mark this uncommon case doesn’t seem worthwhile. Currently, IIUC the
>> 'PG_owner_2' is only used by btrfs; if we ever want to remove the
>> 'PG_owner_2', this uncommon symlink case shouldn’t block its removal.
>>
>
> I think we're talking past eachother a bit here. ;) I'm pointing out
> that outstanding work already covers writeout time zeroing so as to
> avoid duplicate effort if this was trending that way. IOW, I think
> there's no reason to consider the page flag thing, it's more of a
> question of do you want the zeroing at creation time or not..
Yes. It is very strange that a folio is marked uptodate but is not fully
initialized.
> I again don't have a strong opinion on this, but I think I can see
> justification either way. If the focus is a KMSAN splat at writeout
> time, then it makes some sense to address it in that slower/less likely
> path.
>
> That said, if we step back and consider that if this were a buffered
> write we'd tail zero the folio at write time before marking it uptodate,
> this seems like more broadly consistent behavior with other fs' and
> operations. This would be analogous to say XFS allocating a zeroed
> metadata buffer on symlink creation to ensure the underlying block is
> tail zeroed before it is eventually written to disk.
>
> So to your point, maybe the most appropriate thing to do is cover
> zeroing in both places..
Agree.
>> BTW, as I said before, I've reviewed and tested your post-eof work[1]. Maybe
>> you could resend the patch set so that Hugh can take a look when he has
>> time.
>>
>
> Yeah.. I know Hugh is busy and that isn't the highest priority series.
> I'm just catching back up from time off myself so I can give it more
> time before I repost it or anything.
Sure.
Powered by blists - more mailing lists