linux-kernel - Re: [PATCH] eventfd: avoid overflow to ULLONG

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <tencent_8B551E757BD0C0E46A7A23B3AA5FD4336206@qq.com>
Date:   Wed, 12 Jul 2023 02:14:58 +0800
From:   Wen Yang <wenyang.linux@...mail.com>
To:     Christian Brauner <brauner@...nel.org>
Cc:     Alexander Viro <viro@...iv.linux.org.uk>,
        Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
        Dylan Yudaken <dylany@...com>,
        David Woodhouse <dwmw@...zon.co.uk>,
        Matthew Wilcox <willy@...radead.org>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] eventfd: avoid overflow to ULLONG_MAX when ctx->count is
 0


On 2023/7/11 17:39, Christian Brauner wrote:
> On Mon, Jul 10, 2023 at 11:02:33PM +0800, Wen Yang wrote:
>> On 2023/7/10 22:12, Christian Brauner wrote:
>>> On Sun, Jul 09, 2023 at 02:54:51PM +0800, wenyang.linux@...mail.com wrote:
>>>> From: Wen Yang <wenyang.linux@...mail.com>
>>>>
>>>> For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling
>>>> eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
>>>>
>>>> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
>>>> Signed-off-by: Wen Yang <wenyang.linux@...mail.com>
>>>> Cc: Alexander Viro <viro@...iv.linux.org.uk>
>>>> Cc: Jens Axboe <axboe@...nel.dk>
>>>> Cc: Christian Brauner <brauner@...nel.org>
>>>> Cc: Christoph Hellwig <hch@....de>
>>>> Cc: Dylan Yudaken <dylany@...com>
>>>> Cc: David Woodhouse <dwmw@...zon.co.uk>
>>>> Cc: Matthew Wilcox <willy@...radead.org>
>>>> Cc: linux-fsdevel@...r.kernel.org
>>>> Cc: linux-kernel@...r.kernel.org
>>>> ---
>>> So this looks ok but I would like to see an analysis how the overflow
>>> can happen. I'm looking at the callers and it seems that once ctx->count
>>> hits 0 eventfd_read() won't call eventfd_ctx_do_read() anymore. So is
>>> there a caller that can call directly or indirectly
>>> eventfd_ctx_do_read() on a ctx->count == 0?
>> eventfd_read() ensures that ctx->count is not 0 before calling
>> eventfd_ctx_do_read() and it is correct.
>>
>> But it is not appropriate for eventfd_ctx_remove_wait_queue() to call
>> eventfd_ctx_do_read() unconditionally,
>>
>> as it may not only causes ctx->count to overflow, but also unnecessarily
>> calls wake_up_locked_poll().
>>
>>
>> I am sorry for just adding the following string in the patch:
>> Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
>>
>>
>> Looking forward to your suggestions.
>>
>> --
>>
>> Best wishes,
>>
>> Wen
>>
>>
>>> I'm just slightly skeptical about patches that fix issues without an
>>> analysis how this can happen.
>>>
>>>>    fs/eventfd.c | 4 +++-
>>>>    1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/eventfd.c b/fs/eventfd.c
>>>> index 8aa36cd37351..10a101df19cd 100644
>>>> --- a/fs/eventfd.c
>>>> +++ b/fs/eventfd.c
>>>> @@ -189,7 +189,7 @@ void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
>>>>    {
>>>>    	lockdep_assert_held(&ctx->wqh.lock);
>>>> -	*cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
>>>> +	*cnt = ((ctx->flags & EFD_SEMAPHORE) && ctx->count) ? 1 : ctx->count;
>>>>    	ctx->count -= *cnt;
>>>>    }
>>>>    EXPORT_SYMBOL_GPL(eventfd_ctx_do_read);
>>>> @@ -269,6 +269,8 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
>>>>    		return -EFAULT;
>>>>    	if (ucnt == ULLONG_MAX)
>>>>    		return -EINVAL;
>>>> +	if ((ctx->flags & EFD_SEMAPHORE) && !ucnt)
>>>> +		return -EINVAL;
> Hm, why is bit necessary though? What's wrong with specifying ucnt == 0
> with EFD_SEMAPHORE? This also looks like a (very low potential) uapi
> break.


Thank you for your careful review.

As you pointed out, this may break the uapi.
Although manaul has the following description (man 2 eventfd):
*  The file descriptor is readable (the select(2) readfds argument; the 
poll(2) POLLIN flag) if the counter has a value greater than 0.
*  The file descriptor is writable (the select(2) writefds argument; the 
poll(2) POLLOUT flag) if it is possible to write a value of at least "1" 
without blocking.

But it does not specify that the ucnt cannot be zero, so we can only 
delete the two lines of code above

Could we propose another patch specifically to address the issue you 
have identified?

Since there are indeed some corner scenes when ucnt is 0 and ctx->count 
is also 0:


static ssize_t eventfd_write(struct file *file, const char __user *buf, 
size_t count,
                              loff_t *ppos)
{
...
         if (ULLONG_MAX - ctx->count > ucnt)
                 res = sizeof(ucnt);                 ---> always > 0
...
         if (likely(res > 0)) {
                 ctx->count += ucnt;                 ---> unnecessary 
addition of 0
                 current->in_eventfd = 1;            ---> May affect 
eventfd_signal()
                 if (waitqueue_active(&ctx->wqh))
                         wake_up_locked_poll(&ctx->wqh, EPOLLIN);  ---> 
heavyweight wakeup
                 current->in_eventfd = 0;
         }
...
}


static __poll_t eventfd_poll(struct file *file, poll_table *wait)
{
...
         count = READ_ONCE(ctx->count);

         if (count > 0)
                 events |= EPOLLIN;    ---> If count is 0, all previous 
operations are wasted

         if (count == ULLONG_MAX)
                 events |= EPOLLERR;
         if (ULLONG_MAX - 1 > count)
                 events |= EPOLLOUT;

         return events;
}

Could we optimize it like this?

static ssize_t eventfd_write(struct file *file, const char __user *buf, 
size_t count,
                              loff_t *ppos)
{
...
         if (likely(res > 0)) {
                 ctx->count += ucnt;
                 if (ctx->count) {                       ---> avoiding 
unnecessary heavyweight operations
                         current->in_eventfd = 1;
                         if (waitqueue_active(&ctx->wqh))
wake_up_locked_poll(&ctx->wqh, EPOLLIN);
                         current->in_eventfd = 0;
                 }
         }
...
}


--

Best wishes,

Wen