Message-ID: <0171777a-2f4b-2b0c-4887-86f6d8563bea@oracle.com>
Date: Mon, 9 Aug 2021 13:18:17 -0700
From: Shoaib Rao <rao.shoaib@...cle.com>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Dmitry Vyukov <dvyukov@...gle.com>,
syzbot <syzbot+8760ca6c1ee783ac4abd@...kaller.appspotmail.com>,
andrii@...nel.org, ast@...nel.org, bpf@...r.kernel.org,
christian.brauner@...ntu.com, cong.wang@...edance.com,
daniel@...earbox.net, davem@...emloft.net, edumazet@...gle.com,
jamorris@...ux.microsoft.com, john.fastabend@...il.com,
kafai@...com, kpsingh@...nel.org, kuba@...nel.org,
linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
netdev@...r.kernel.org, shuah@...nel.org, songliubraving@...com,
syzkaller-bugs@...glegroups.com, yhs@...com
Subject: Re: [syzbot] BUG: sleeping function called from invalid context in
_copy_to_iter
On 8/9/21 12:57 PM, Al Viro wrote:
> On Mon, Aug 09, 2021 at 12:16:27PM -0700, Shoaib Rao wrote:
>> On 8/9/21 11:06 AM, Dmitry Vyukov wrote:
>>> On Mon, 9 Aug 2021 at 19:33, Shoaib Rao <rao.shoaib@...cle.com> wrote:
>>>> This seems like a false positive. 1) The function will not sleep because
>>>> it only calls the copy routine if the byte is present. 2) There is no
>>>> difference between this new call and the older calls in
>>>> unix_stream_read_generic().
>>> Hi Shoaib,
>>>
>>> Thanks for looking into this.
>>> Do you have any ideas on how to fix this tool's false positive? Tools
>>> with false positives are an order of magnitude less useful than tools w/o
>>> false positives. E.g. do we turn it off on syzbot? But I don't
>>> remember any other false positives from "sleeping function called from
>>> invalid context" checker...
>> Before we take any action I would like to understand why the tool does not
>> single out other calls to recv_actor in unix_stream_read_generic(). The
>> context in all cases is the same. I also do not understand why the code
>> would sleep. Let's assume the user-provided address is bad: the code will
>> return EFAULT, it will never sleep. If the kernel-provided address is bad,
>> the system will panic. The only difference I see is that the new code holds
>> 2 locks while the previous code held one lock, but the locks are acquired
>> before the call to copy.
>>
>> So please help me understand how the tool works. Even though I have
>> evaluated the code carefully, there is always a possibility that the tool is
>> correct.
> Huh???
>
> What do you mean "address is bad"? "Address is inside an area mmapped from
> NFS file". And it bloody well will sleep on attempt to read the page.
That is exactly what I said :-). There are times when the copying
thread/task may sleep because the page is not present, and it does not
have to be an NFS file: Linux supports mmap without backing memory, and
page faults occur with files all the time. By "bad address" I meant that
the user passes in an incorrect address.
>
> You should never, ever do copy_{to,from}_user() or equivalents while holding
> a spinlock, period.
Yes, a spinlock should not be held if the process can sleep. In this case
it won't, but there is no way to indicate that. Thanks for pointing that
out, as the second lock I am holding is indeed a spinlock (it is
accessed via unix_state_unlock, so I missed that it is a spinlock). I
will modify the code and resubmit. I am glad we found the root cause.
Shoaib
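
The rule discussed in this thread (no user-space copy while a spinlock is
held) is commonly satisfied by taking what is needed under the lock,
dropping the lock, and only then doing the copy that may fault and sleep.
The following is a minimal userspace sketch of that pattern, not the
actual af_unix code or the resubmitted patch. All names in it (sock_sim,
sim_lock, sim_copy_to_user, recv_oob) are hypothetical; a C11 atomic_flag
stands in for the kernel spinlock and memcpy() stands in for
copy_to_user().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a socket holding at most one pending OOB byte. */
struct sock_sim {
    atomic_flag lock;   /* stands in for a kernel spinlock */
    bool has_oob;
    char oob_byte;
};

static void sim_init(struct sock_sim *sk)
{
    atomic_flag_clear(&sk->lock);
    sk->has_oob = false;
    sk->oob_byte = 0;
}

static void sim_lock(struct sock_sim *sk)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&sk->lock, memory_order_acquire))
        ;
}

static void sim_unlock(struct sock_sim *sk)
{
    atomic_flag_clear_explicit(&sk->lock, memory_order_release);
}

static void sim_queue_oob(struct sock_sim *sk, char byte)
{
    sim_lock(sk);
    sk->oob_byte = byte;
    sk->has_oob = true;
    sim_unlock(sk);
}

/* Stand-in for copy_to_user(): in the kernel this may fault and sleep. */
static int sim_copy_to_user(char *dst, const char *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

/* Returns 1 if a byte was delivered, 0 if none was pending, -1 on copy error. */
static int recv_oob(struct sock_sim *sk, char *user_buf)
{
    char tmp = 0;
    bool found;

    assert(sk != NULL && user_buf != NULL);

    /* Under the lock: only touch socket state, never the user buffer. */
    sim_lock(sk);
    found = sk->has_oob;
    if (found) {
        tmp = sk->oob_byte;
        sk->has_oob = false;
    }
    sim_unlock(sk);

    if (!found)
        return 0;

    /* Lock dropped: the potentially sleeping copy is now safe. */
    if (sim_copy_to_user(user_buf, &tmp, 1) != 0)
        return -1;
    return 1;
}
```

One simplification to note: the sketch consumes the byte before the copy,
so a failed copy would lose it. Real code would have to decide how to
handle that case, for example by holding a reference to the data instead
of consuming it.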