[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <77a66454-8d18-6a92-803b-76273ec998eb@kernel.dk>
Date: Thu, 29 Sep 2022 10:54:07 -0600
From: Jens Axboe <axboe@...nel.dk>
To: Jan Kara <jack@...e.cz>, Vlastimil Babka <vbabka@...e.cz>
Cc: syzbot <syzbot+dfcc5f4da15868df7d4d@...kaller.appspotmail.com>,
akpm@...ux-foundation.org, keescook@...omium.org,
linux-kernel@...r.kernel.org, mark.rutland@....com,
mhiramat@...nel.org, rostedt@...dmis.org,
syzkaller-bugs@...glegroups.com,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Michal Hocko <mhocko@...e.com>,
Amir Goldstein <amir73il@...il.com>,
Matthew Bobrowski <repnop@...gle.com>,
Linux-FSDevel <linux-fsdevel@...r.kernel.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Matthew Wilcox <willy@...radead.org>, io-uring@...r.kernel.org
Subject: Re: [syzbot] inconsistent lock state in kmem_cache_alloc
On 9/29/22 8:07 AM, Jens Axboe wrote:
> On 9/29/22 7:56 AM, Jan Kara wrote:
>> On Thu 29-09-22 15:24:22, Vlastimil Babka wrote:
>>> On 9/26/22 18:33, syzbot wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit: 105a36f3694e Merge tag 'kbuild-fixes-v6.0-3' of git://git...
>>>> git tree: upstream
>>>> console+strace: https://syzkaller.appspot.com/x/log.txt?x=152bf540880000
>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=7db7ad17eb14cb7
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
>>>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1020566c880000
>>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=104819e4880000
>>>>
>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>> Reported-by: syzbot+dfcc5f4da15868df7d4d@...kaller.appspotmail.com
>>>
>>> +CC more folks
>>>
>>> I'm not fully sure what this report means but I assume it's because there's
>>> a GFP_KERNEL kmalloc() allocation from softirq context? Should it perhaps
>>> use memalloc_nofs_save() at some well defined point?
>>
>> Thanks for the CC. The problem really is that io_uring is calling into
>> fsnotify_access() from softirq context. That isn't going to work. The
>> allocation is just a tip of the iceberg. Fsnotify simply does not expect to
>> be called from softirq context. All the dcache locks are not IRQ safe, it
>> can even obtain some sleeping locks and call to userspace if there are
>> suitable watches set up.
>>
>> So either io_uring needs to postpone fsnotify calls to a workqueue or we
>> need a way for io_uring code to tell iomap dio code that the completion
>> needs to always happen from a workqueue (as it currently does for writes).
>> Jens?
>
> Something like this should probably work - I'll write a test case and
> vet it.
Ran that with the attached test case, triggers it before but not with
the patch. Side note - I do wish that the syzbot reproducers were not
x86 specific, I always have to go and edit them for arm64. For this
particular one, I just gave up and wrote one myself.
Thanks for the heads-up Jan, I'll queue up this fix and mark for stable
with the right attributions.
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/fanotify.h>
#include <sys/wait.h>
#include <liburing.h>
int main(int argc, char *argv[])
{
struct io_uring_sqe *sqe;
struct io_uring_cqe *cqe;
struct io_uring ring;
int fan, ret, fd;
void *buf;
fan = fanotify_init(FAN_CLASS_NOTIF|FAN_CLASS_CONTENT, 0);
if (fan < 0) {
if (errno == ENOSYS)
return 0;
perror("fanotify_init");
return 1;
}
if (argc > 1) {
fd = open(argv[1], O_RDONLY | O_DIRECT);
if (fd < 0) {
perror("open");
return 1;
}
} else {
fd = open("file0", O_RDONLY | O_DIRECT);
if (fd < 0) {
perror("open");
return 1;
}
}
ret = fanotify_mark(fan, FAN_MARK_ADD, FAN_ACCESS|FAN_MODIFY, fd, NULL);
if (ret < 0) {
perror("fanotify_mark");
return 1;
}
ret = 0;
if (fork()) {
int wstat;
io_uring_queue_init(4, &ring, 0);
if (posix_memalign(&buf, 4096, 4096))
return 0;
sqe = io_uring_get_sqe(&ring);
io_uring_prep_read(sqe, fd, buf, 4096, 0);
io_uring_submit(&ring);
ret = io_uring_wait_cqe(&ring, &cqe);
if (ret) {
fprintf(stderr, "wait_ret=%d\n", ret);
return 1;
}
wait(&wstat);
ret = WEXITSTATUS(wstat);
} else {
struct fanotify_event_metadata m;
int fret;
fret = read(fan, &m, sizeof(m));
if (fret < 0)
perror("fanotify read");
/* fail if mask isn't right or pid indicates non-task context */
else if (!(m.mask & 1) || !m.pid)
exit(1);
exit(0);
}
return ret;
}
--
Jens Axboe
Powered by blists - more mailing lists