Date:   Tue, 18 May 2021 11:53:57 -0400
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Boqun Feng <boqun.feng@...il.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Laurent Dufour <ldufour@...ux.ibm.com>,
        Suren Baghdasaryan <surenb@...gle.com>
Subject: Re: Silencing false lockdep warning related to seq lock

On Mon, May 17, 2021 at 10:24 PM Boqun Feng <boqun.feng@...il.com> wrote:
> > [...]
> > > > After applying Laurent's SPF patchset [1], we're facing a large number
> > > > of (seemingly false-positive) lockdep reports related to circular
> > > > dependencies with seq locks.
> > > >
> > > >  lock(A); write_seqcount(B)
> > > >   vs.
> > > > write_seqcount(B); lock(A)
> > > >
> > >
> > > Two questions here:
> > >
> > > *   Could you provide the lockdep splats you saw? I wonder whether
> > >     it's similar to the one mentioned in patch #9[1].
> >
> > Sure, I have appended them to this email. Here is the tree with Laurent's
> > patches applied, in case you want to take a look:
> > https://chromium.googlesource.com/chromiumos/third_party/kernel/+/refs/heads/chromeos-5.4
> >
> > Yes, the splat is similar to the one mentioned in that patch #9; however, the
> > first splat I appended below shows a lockdep false positive - there is
> > clearly a problem with lockdep, where it thinks the following sequence is
> > bad when in fact it is not:
> >
> >     init process (skipped some functions):
> >     exec->
> >      flush_old_exec->
> >       exit_mmap ->
> >         free_pgtables->
> >           vm_write_begin(vma) // Y: acquires seq lock in write mode
> >              unlink_anon_vmas // Z: acquires anon_vma->rwsem
> >
> >     exec->
> >     load_elf_binary->
> >       -> setup_arg_pages
> >         -> expand_downwards (in the if (grow <=) block)
> >            mm->page_table_lock // X
> >            vm_write_begin(vma) // Y: acquires seq lock
> >
> >     exec->
> >      do_execve_file->
> >        ->get_user_pages
> >          -> handle_pte_fault
> >           -> anon_vma_prepare
> >               -> acquire anon_vma->rwsem  // Z
> >               -> acquire mm->page_table_lock // X
> >
> >     If vm_write_begin ever had to wait, then it could lock up if the
> >     following edges (each meaning "holding the left, trying to acquire
> >     the right") happened concurrently:
> >
> >     Y -> Z
> >     X -> Y
> >     Z -> X
> >
> >     But Y can never result in a wait since it is a sequence lock. So this is
> >     a lockdep false positive.
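> >
> >     To make the "never waits" point concrete: the seqcount write side is
> >     roughly the following (paraphrased from include/linux/seqlock.h; the
> >     exact helpers vary by kernel version):
> >
> >         static inline void raw_write_seqcount_begin(seqcount_t *s)
> >         {
> >                 s->sequence++;  /* odd count => writer in progress */
> >                 smp_wmb();      /* order the bump before the data writes */
> >         }
> >
> >     There is no spinning or sleeping anywhere, so taking Y cannot
> >     contribute a wait edge to the cycle above.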
> >
> > >
> > > *   What keeps write_seqcount(vm_seqcount) serialized? If it's only
> > >     one lock that serializes the writers, we can probably make it
> > >     the nest_lock argument for seqcount_acquire(), and that will
> > >     help prevent the false positives.
> >
> > I think the write seqcount is serialized by the mmap_sem in all the code
> > paths I audited in Laurent's patchset.
> >
> > I am not sure how making it the nest_lock argument will help, though? The
> > issue is that lockdep's understanding of seqcount needs to relate seqcount
> > readers to
>
> The thing is, lockdep will not report a deadlock for the following sequences:
>
>         T1:
>         lock(A);
>         lock_nest(B, A);
>         lock(C);
>
>         T2:
>         lock(A);
>         lock(C);
>         lock_nest(B, A);
>
> because with the nest_lock, lockdep would know A serializes B, so the
> following case cannot happen:
>
>         CPU 0                   CPU 1
>         ============            ============
>         lock_nest(B,A);
>                                 lock(C);
>         lock(C);
>                                 lock_nest(B, A);
>
> because either CPU 0 or CPU 1 will already hold A, so they are
> serialized. So nest_lock can solve part of the problem if all the
> writers of vm_seqcount are serialized somehow.
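>
> As a sketch, assuming plain mutexes for A, B and C (the real locks in
> the SPF case are a mix of rwsem/spinlock/seqcount), the annotation
> would look like:
>
>         struct mutex A, B, C;
>
>         /* T1 */
>         mutex_lock(&A);
>         mutex_lock_nest_lock(&B, &A);   /* tell lockdep: A serializes B */
>         mutex_lock(&C);
>
>         /* T2 */
>         mutex_lock(&A);
>         mutex_lock(&C);
>         mutex_lock_nest_lock(&B, &A);   /* same order inversion, no splat */
>
> Since either CPU must already hold A before taking B, the two critical
> sections cannot interleave as in the table above.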
>
> Yes, seqcount writers cannot block each other; however, they are
> supposed to be serialized with each other, right? So if we have
> reason to believe the above two-CPU case can happen, then although
> it's not a deadlock, it is a data race, right?
>
> I think the proper fix here is to annotate vm_seqcount with the correct
> locks serializing the writers.
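>
> For reference, recent mainline kernels have the seqcount_LOCKNAME_t
> API (from the seqlock rework that landed around v5.9), which expresses
> exactly this writer-serialization association; an illustrative use,
> assuming a mutex serializes the writers:
>
>         struct mutex lock;              /* serializes all writers */
>         seqcount_mutex_t seq;
>
>         seqcount_mutex_init(&seq, &lock);
>
>         mutex_lock(&lock);
>         write_seqcount_begin(&seq);     /* lockdep checks ->lock is held */
>         /* ... update the protected data ... */
>         write_seqcount_end(&seq);
>         mutex_unlock(&lock);
>
> The chromeos-5.4 tree predates that API, so the SPF series would need
> an equivalent annotation for vm_seqcount.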
>

I agree with you now and that's the best way forward. I will work on
something like this (unless you already did), thanks Boqun!

-Joel
