Message-ID: <VI1P193MB0752F2E6DA346CDB2BCDBED899D6A@VI1P193MB0752.EURP193.PROD.OUTLOOK.COM>
Date: Tue, 17 Oct 2023 19:49:15 +0800
From: Juntong Deng <juntong.deng@...look.com>
To: Paolo Abeni <pabeni@...hat.com>, borisp@...dia.com,
john.fastabend@...il.com, kuba@...nel.org, davem@...emloft.net,
edumazet@...gle.com
Cc: netdev@...r.kernel.org, linux-kernel-mentees@...ts.linuxfoundation.org,
linux-kernel@...r.kernel.org,
syzbot+29c22ea2d6b2c5fd2eae@...kaller.appspotmail.com
Subject: Re: [PATCH] net/tls: Fix slab-use-after-free in tls_encrypt_done

On 2023/10/17 18:31, Paolo Abeni wrote:
> On Thu, 2023-10-12 at 19:02 +0800, Juntong Deng wrote:
>> In the current implementation, ctx->async_wait.completion is completed
>> after spin_lock_bh, which causes tls_sw_release_resources_tx to
>> continue executing and return to tls_sk_proto_cleanup, then return
>> to tls_sk_proto_close, and after that enter tls_sw_free_ctx_tx to kfree
>> the entire struct tls_context (including ctx->encrypt_compl_lock).
>>
>> Since ctx->encrypt_compl_lock has been freed, subsequent spin_unlock_bh
>> will result in slab-use-after-free error. Due to SMP, even using
>> spin_lock_bh does not prevent tls_sw_release_resources_tx from continuing
>> on other CPUs. After tls_sw_release_resources_tx is woken up, there is no
>> attempt to hold ctx->encrypt_compl_lock again, therefore everything
>> described above is possible.
>>
>> The fix is to put complete(&ctx->async_wait.completion) after
>> spin_unlock_bh, making the release after the unlock. Since complete is
>> only executed if pending is 0, which means this is the last record, there
>> is no need to worry about race condition causing duplicate completes.
>>
>> Reported-by: syzbot+29c22ea2d6b2c5fd2eae@...kaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=29c22ea2d6b2c5fd2eae
>> Signed-off-by: Juntong Deng <juntong.deng@...look.com>
>
> Have you tested this patch vs the syzbot reproducer?
>
> I think the following race is still present:
>
> CPU0                            CPU1
> tls_sw_release_resources_tx     tls_encrypt_done
> spin_lock_bh
> spin_unlock_bh
>                                 spin_lock_bh
>                                 spin_unlock_bh
>                                 complete
>
> wait
> // ...
> tls_sk_proto_close
>
>                                 test_and_set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask
>                                 // UaF
>
> regardless of 'complete()' being invoked before or after the
> 'spin_unlock_bh()'.
>
> Paolo
>
Yes, I think you are right.

My earlier reasoning was that test_and_set_bit() is only called when
'ready' is true, that 'ready' is only true for the first record, and
that complete() is only called when processing the last record. I
assumed the first record would never also be the last one, so
test_and_set_bit() would not run after complete() had been called.

Your reply made me look at this again more carefully: with only one
record, the first record is also the last, so the race you describe
is possible.
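
To make the single-record case concrete, here is a rough userspace
sketch of the reasoning above. It is not kernel code: struct model_ctx
and replay() are hypothetical names, the two CPUs are replayed
sequentially in the order of the diagram instead of running
concurrently, and it bakes in my assumption that 'ready' is only true
for the first record.

```c
#include <stdbool.h>

/* Hypothetical model of the TX context; not the kernel structure. */
struct model_ctx {
    int pending;     /* records still in flight (encrypt_pending) */
    bool completed;  /* stands in for async_wait.completion */
    bool freed;      /* stands in for kfree() of the context */
};

/*
 * Replays the encrypt callback for each of total_records records, with
 * the close path running to the end as soon as the completion fires.
 * Returns true if the tail of a callback touches the freed context.
 */
bool replay(int total_records)
{
    struct model_ctx ctx = { .pending = total_records };
    bool use_after_free = false;

    for (int i = 0; i < total_records; i++) {
        /* Assumption from this mail: only the first record is 'ready'. */
        bool ready = (i == 0);

        /* CPU1: section protected by encrypt_compl_lock. */
        ctx.pending--;
        if (ctx.pending == 0)
            ctx.completed = true;  /* complete(), before or after unlock */

        /* CPU0: woken by the completion, close path frees the context. */
        if (ctx.completed)
            ctx.freed = true;

        /* CPU1: tail of the callback, where test_and_set_bit() runs. */
        if (ready && ctx.freed)
            use_after_free = true;
    }
    return use_after_free;
}
```

In this model replay(1) reports the use-after-free, while replay(3)
does not, which is the case I had been reasoning about.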

I will send a v2 patch to address this.

Thanks.