linux-kernel - Re: [syzbot] [iomap?] kernel BUG in folio_end

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <871pma0xkm.fsf@jogness.linutronix.de>
Date: Fri, 07 Nov 2025 14:47:29 +0106
From: John Ogness <john.ogness@...utronix.de>
To: Petr Mladek <pmladek@...e.com>
Cc: Joanne Koong <joannelkoong@...il.com>, syzbot
 <syzbot+3686758660f980b402dc@...kaller.appspotmail.com>,
 "amurray@...goodpenguin.co.uk" <amurray@...goodpenguin.co.uk>,
 brauner@...nel.org, chao@...nel.org, djwong@...nel.org,
 jaegeuk@...nel.org, linux-f2fs-devel@...ts.sourceforge.net,
 linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-xfs@...r.kernel.org, syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [iomap?] kernel BUG in folio_end_read (2)

On 2025-11-07, Petr Mladek <pmladek@...e.com> wrote:
> What about?
>
> /*
>  * Return true when @lpos1 is lower than @lpos2 and both values
>  * look sane.
>  *
>  * They are considered insane when the difference is bigger than
>  * the data buffer size. It happens when the values are read
>  * without locking and another CPU already moved the ring buffer
>  * head and/or tail.
>  *
>  * The caller must behave carefully. The changes based on this
>  * check must be done using cmpxchg() to confirm that the check
>  * worked with valid values.
>  */
> static bool lpos1_before_lpos2_sane(struct prb_data_ring *data_ring,
> 				    unsined long lpos1, unsigned long lpos2)
> {
> 	return lpos2 - lpos1 - 1 < DATA_SIZE(data_ring);
> }
>
> Feel free to come up with any other function name or description.

I prefer "_bounded" to "_sane". And I really don't care if it is
"before" or "lt". I was only stating why I chose "before" instead of
something else. But I really don't care. Really.

My preferences would be:

lpos1_before_lpos2_bounded()

lpos1_lt_lpos2_bounded()

But I can live with lpos1_before_lpos2_sane() if you think "_sane" is
better.

>> You have overlooked that I inverted the check. It is no longer checking:
>> 
>>     next_pos <= head_pos
>> 
>> but is instead checking:
>> 
>>     !(head_pos < next_pos)
>> 
>> IOW, if "next has not overtaken head".
>
> I see. I missed this. Hmm, this would be correct when the comparsion was
> mathemathical (lt, le). But is this correct in our case when take
> into account the ring buffer wrapping?
>
> The original check returned "false" when the difference between head_lpos
> and next_lpos was bigger than the data ring size.
>
> The new check would return "true", aka "!false", in this case.

Sure, but that is not possible. Even if we assume there has been
corrupted data, the new get_data() will catch that.

> Hmm, it seems that the buffer wrapping is not possible because
> this code is called when desc_reopen_last() succeeded. And nobody
> is allowed to free reopened block.

Correct.

> Anyway, I consider using (!lpos1_before_lpos2()) as highly confusing
> in this case.

I think if you look at what the new check is checking instead of trying
to mentally map the old check to the new check, it is not confusing.

> I would either keep the code as is.

:-/ That defeats the whole purpose of the new helper, which is simply
comparing the relative position of two lpos values. That is exactly what
is being done here.

I would prefer adding an additional lpos1_le_lpos2_bounded() variant
before leaving the old code. A new variant is unnecessary, but at least
we would have all logical position comparison code together.

> Maybe we could add a comment explaining that
>
> 	if (head_lpos - next_lpos < DATA_SIZE(data_ring)) {
>
> might fail only when the substraction is negative. It should never be
> positive because head_lpos advanced more than the data buffer size
> over next_lpos because the data block is reopened and nobody could
> free it.
>
> Maybe, we could even add a check for this.

If data is being illegally manipulated underneath us, we are screwed
anyway. I see no point in sprinkling checks around in case someone is
modifying our data even though we have exclusive access to it.

> I think about fixing this in a separate patch and pushing this
> into linux-next ASAP to fix the regression.
>
> We could improve the other comparisons later...
>
> How does that sound?

Sure. Are you planning on letting 6.19 pull 2 patches or will you fold
them for the 6.19 pull?

> Should I prepare the patch for get_data() are you going to do so?

I would prefer you do it so that we do not need any more discussing for
the quick fix. ;-)

John