Message-ID: <CAM_iQpXHUjxTF6=heY9psJb5pFfm4tRz9KiLOB87CbwEA1K=3A@mail.gmail.com>
Date: Fri, 9 Jun 2017 14:48:49 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Krister Johansen <kjlx@...pleofstupid.com>
Cc: David Miller <davem@...emloft.net>,
Eric Dumazet <eric.dumazet@...il.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
Kaiwen Xu <kaiwen.xu@...u.com>
Subject: Re: [PATCH net] Fix an intermittent pr_emerg warning about lo
becoming free.

On Fri, Jun 9, 2017 at 11:43 AM, Krister Johansen
<kjlx@...pleofstupid.com> wrote:
> On Fri, Jun 09, 2017 at 11:18:44AM -0700, Cong Wang wrote:
>> On Thu, Jun 8, 2017 at 1:12 PM, Krister Johansen
>> <kjlx@...pleofstupid.com> wrote:
>> > The way this works is that if there's still a reference on the dst entry
>> > at the time we try to free it, it gets placed in the gc list by
>> > __dst_free and the dst_destroy() call is invoked by the gc task once the
>> > refcount is 0. If the gc task processes a tenth or less of its entries
>> > on a single pass, it increases the amount of time it waits between gc
>> > passes.
>> >
>> > Looking at the gc_task intervals, they started at 663ms when we invoked
>> > __dst_free(). After that, they increased to 1663, 3136, 5567, 8191,
>> > 10751, and 14848. The release that set the refcnt to 0 on our dst entry
>> > occurred after the gc_task had been enqueued for a 14-second interval,
>> > so we had to wait longer than the warning time in netdev_wait_allrefs()
>> > in order for the dst entry to get freed and the hold on 'lo' to be
>> > released.
>> >
>>
>> I am glad to see you don't have a dst leak here.
>>
>> But in my experience with a similar bug (a refcnt wait on lo), the
>> warning went on indefinitely rather than for just 14 seconds, so it
>> looked more like a real leak than just a gc delay. So in your case, this
>> annoying warning eventually disappears, right?
>
> That's correct. The problem occurs intermittently, and the warnings come
> less frequently than the warning interval in netdev_wait_allrefs(). At
> least when I observed it, it tended to coincide with our controlplane
> canary issuing an API call that led to a network namespace teardown on
> the dataplane.

Great! Then the bug I saw is different from this one and it is probably
a dst leak.

Thanks.
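
For reference, here is a minimal standalone sketch of the gc backoff
Krister describes above: if a pass frees a tenth or less of the delayed
entries, the wait before the next pass grows, and netdev_wait_allrefs()
keeps re-printing its warning in the meantime. This is illustrative
only, not the actual net/core/dst.c code; the constants (GC_MIN, GC_INC,
GC_MAX) and the gc_reschedule() helper are assumptions chosen to mimic
the reported behavior.

#include <stdio.h>

#define GC_MIN     100    /* ms: shortest wait between gc passes (assumed) */
#define GC_INC     500    /* ms: step added on each backoff (assumed)      */
#define GC_MAX  120000    /* ms: upper bound on the wait (assumed)         */

static unsigned long gc_interval = GC_MIN;
static unsigned long gc_step     = GC_INC;

/* Decide how long to wait before the next gc pass.  'delayed' is the
 * number of entries still waiting on a nonzero refcount; 'freed' is
 * how many entries this pass actually destroyed. */
static void gc_reschedule(int delayed, int freed)
{
	if (freed <= delayed / 10) {
		/* A tenth or less was processed: back off, and make
		 * the next backoff step larger. */
		gc_interval += gc_step;
		if (gc_interval > GC_MAX)
			gc_interval = GC_MAX;
		gc_step += GC_INC;
	} else {
		/* Progress was made: return to the minimum interval. */
		gc_interval = GC_MIN;
		gc_step     = GC_INC;
	}
	printf("next gc pass in %lu ms\n", gc_interval);
}

int main(void)
{
	int i;

	/* Simulate passes where one entry out of 100 is freed, as when
	 * a single dst still holds a reference: the wait ratchets
	 * upward, much like the 663ms -> 14848ms progression reported
	 * above. */
	for (i = 0; i < 7; i++)
		gc_reschedule(100, 1);
	return 0;
}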