[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sat, 1 Jan 2022 13:02:14 -0500
From: Waiman Long <longman@...hat.com>
To: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
Boqun Feng <boqun.feng@...il.com>
Cc: LKML <linux-kernel@...r.kernel.org>, Christoph Hellwig <hch@....de>
Subject: Re: [lockdep] UAF read in print_name().
On 12/30/21 10:09, Tetsuo Handa wrote:
> On 2021/12/29 12:25, Waiman Long wrote:
>> On 12/28/21 05:49, Tetsuo Handa wrote:
>>> Hello.
>>>
>>> I found using linux-next-20211210 that reading /proc/lockdep after lockdep splat
>>> triggers UAF read access. I think this is a side effect of zapping dependency
>>> information when loop driver's WQ is destroyed. You might want to xchg() the pointer
>>> with a dummy struct containing a static string.
>>>
>>> difference before lockdep splat and after lockdep splat
>>> ----------------------------------------
>>> 8635c8636
>>> < ffff88811561cd28 OPS: 26 FD: 122 BD: 1 +.+.: (wq_completion)loop0
>>> ---
>>>> ffff88811561cd28 OPS: 31 FD: 439 BD: 1 +.+.: M>^MM-^AM-^HM-^?M-^?
>> Thanks for reporting.
>>
>> Yes, listing locking classes by /proc/lockdep is racy as all_lock_classes is accessed
>> without lock protection. OTOH, we probably can't fix this race as lock hold time will be
>> too long for this case. Atomically xchg the class name is a possible workaround, but we
>> also need to add additional checks as the iteration may also be redirected to
>> free_lock_classes leading to an endless iteration loop.
> Thanks for responding. But is this bug really unfixable?
I am not saying that it is unfixable. I am just saying that we cannot
guarantee a consistent output of /proc/lockdep as internal data may
change in the middle of dumping the output.
>
> Please see the following result.
>
> ----------------------------------------
> [root@...alhost ~]# uname -r
> 5.16.0-rc4-next-20211210
> [root@...alhost ~]# grep loop /proc/lockdep
> [root@...alhost ~]# truncate -s 100m testfile
> [root@...alhost ~]# losetup -f testfile
> [root@...alhost ~]# grep loop /proc/lockdep
> ffffffffa02b73c8 OPS: 17 FD: 34 BD: 1 +.+.: loop_ctl_mutex
> ffff888106fb0528 OPS: 114 FD: 183 BD: 1 +.+.: (wq_completion)loop0
> [root@...alhost ~]# losetup -D
> [root@...alhost ~]# grep loop /proc/lockdep
> ffffffffa02b73c8 OPS: 17 FD: 34 BD: 1 +.+.: loop_ctl_mutex
> ffffffffa02b7328 OPS: 1 FD: 1 BD: 1 +.+.: loop_validate_mutex
> [root@...alhost ~]# losetup -f testfile
> [root@...alhost ~]# grep loop /proc/lockdep
> ffffffffa02b73c8 OPS: 18 FD: 34 BD: 1 +.+.: loop_ctl_mutex
> ffffffffa02b7328 OPS: 1 FD: 1 BD: 1 +.+.: loop_validate_mutex
> ffff888106fb1128 OPS: 118 FD: 183 BD: 1 +.+.: (wq_completion)loop0
> [root@...alhost ~]# losetup -D
> [root@...alhost ~]# grep loop /proc/lockdep
> ffffffffa02b73c8 OPS: 18 FD: 34 BD: 1 +.+.: loop_ctl_mutex
> ffffffffa02b7328 OPS: 2 FD: 1 BD: 1 +.+.: loop_validate_mutex
> [root@...alhost ~]# grep debug_locks /proc/lockdep_stats
> debug_locks: 1
> [root@...alhost ~]#
> ----------------------------------------
>
> We can confirm that the "(wq_completion)loop0" entry disappears when WQ for /dev/loop0 is destroyed.
>
> Then, please see the following reproducer for this lockdep problem. As you can see, there is 10
> seconds between lockdep complained and /proc/lockdep is read. 10 seconds should be enough time
> for erasing the "(wq_completion)loop0" entry.
Thanks for the reproducer.
> ----------------------------------------
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <sys/ioctl.h>
> #include <linux/loop.h>
> #include <sys/sendfile.h>
>
> int main(int argc, char *argv[])
> {
> const int file_fd = open("testfile", O_RDWR | O_CREAT, 0600);
> ftruncate(file_fd, 1048576);
> char filename[128] = { };
> const int loop_num = ioctl(open("/dev/loop-control", 3), LOOP_CTL_GET_FREE, 0);
> snprintf(filename, sizeof(filename) - 1, "/dev/loop%d", loop_num);
> const int loop_fd_1 = open(filename, O_RDWR);
> ioctl(loop_fd_1, LOOP_SET_FD, file_fd);
> const int loop_fd_2 = open(filename, O_RDWR);
> ioctl(loop_fd_1, LOOP_CLR_FD, 0);
> const int sysfs_fd = open("/sys/power/resume", O_RDWR);
> sendfile(file_fd, sysfs_fd, 0, 1048576);
> sendfile(loop_fd_2, file_fd, 0, 1048576);
> write(sysfs_fd, "700", 3);
> system("/bin/cat /proc/lockdep > /tmp/lockdep-before-splat"); // Save before "zap on release" forgets the dependency.
> close(loop_fd_2);
> close(loop_fd_1); // Lockdep complains the circular dependency and turns off.
> close(file_fd);
> sleep(10);
> system("/bin/cat /proc/lockdep > /tmp/lockdep-after-splat"); // Save after "zap on release" forgot the dependency.
> return 0;
> }
> ----------------------------------------
>
> If we compare the content of /proc/lockdep before and after, we can confirm that
> the "(wq_completion)loop0" entry does not disappear even after loop device was
> destroyed. (The 'k' is POISON_FREE read out as a string.)
>
> ----------------------------------------
> # diff /tmp/lockdep-before-splat /tmp/lockdep-after-splat | tail | cat -v
> ---
>> ffffffffa02b7328 OPS: 3 FD: 1 BD: 15 +.+.: loop_validate_mutex
> 7403c7411
> < ffffffff826612d8 OPS: 4 FD: 337 BD: 1 .+.+: kn->active#135
> ---
>> ffffffff826612d8 OPS: 4 FD: 338 BD: 1 .+.+: kn->active#135
> 7411c7419
> < ffff88810422b528 OPS: 22 FD: 183 BD: 1 +.+.: (wq_completion)loop0
> ---
>> ffff88810422b528 OPS: 32 FD: 435 BD: 1 +.+.: kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkM-%M-;M-;M-;M-;M-;M-;M-;M-;
> ----------------------------------------
>
> Isn't this a bug that lockdep stopped erasing the dependency chain because
> lockdep was already turned off before start reading /proc/lockdep ?
That could be the case, I need to take a further look into the code.
>
>
> By the way, this "zap on destroy" behavior made it difficult to find a reproducer
> because "at least once during the lifetime of the kernel" part of
>
> The validator achieves perfect, mathematical 'closure' (proof of locking
> correctness) in the sense that for every simple, standalone single-task
> locking sequence that occurred at least once during the lifetime of the
> kernel, the validator proves it with a 100% certainty that no
> combination and timing of these locking sequences can cause any class of
> lock related deadlock. [*]
>
> in Documentation/locking/lockdep-design.txt became no longer applicable?
>
I believe that passage refers to validating usage of locks to make sure
that any deadlock can be detected. It is not really related to the
correctness of /proc/lockdep output which is a problem in this
particular case.
Cheers,
Longman
Powered by blists - more mailing lists