Message-ID: <CAKFNMonQwnHj=QnaBGb3s-V8hL1zbxrogLU4yutPd1dMfoCBMA@mail.gmail.com>
Date:   Tue, 15 Mar 2022 18:11:13 +0900
From:   Ryusuke Konishi <konishi.ryusuke@...il.com>
To:     Dongliang Mu <mudongliangabcd@...il.com>
Cc:     Pavel Skripkin <paskripkin@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-nilfs <linux-nilfs@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Nanyong Sun <sunnanyong@...wei.com>,
        慕冬亮 <dzm91@...t.edu.cn>
Subject: Re: Fw:Re: [PATCH] fs: nilfs2: fix memory leak in nilfs sysfs create
 device group

Hi Dongliang,

On Tue, Mar 15, 2022 at 2:50 PM Dongliang Mu <mudongliangabcd@...il.com> wrote:
>
> On Tue, Mar 15, 2022 at 12:46 PM Ryusuke Konishi
> <konishi.ryusuke@...il.com> wrote:
> >
> > On Tue, Mar 15, 2022 at 10:59 AM Dongliang Mu <mudongliangabcd@...il.com> wrote:
> > >
> > > On Sun, Mar 13, 2022 at 9:35 PM Dongliang Mu <mudongliangabcd@...il.com> wrote:
> > > >
> > > > On Sun, Mar 13, 2022 at 12:01 AM Ryusuke Konishi
> > > > <konishi.ryusuke@...il.com> wrote:
> > > > >
> > > > > Hi Pavel and Dongliang,
> > > > >
> > > > > On Sun, Mar 13, 2022 at 12:16 AM Pavel Skripkin <paskripkin@...il.com> wrote:
> > > > > >
> > > > > > Hi Ryusuke,
> > > > > >
> > > > > > On 3/12/22 18:11, Ryusuke Konishi wrote:
> > > > > > >> In case of nilfs_attach_log_writer() error code jumps to
> > > > > > >> failed_checkpoint label and calls destroy_nilfs() which should call
> > > > > > >> nilfs_sysfs_delete_device_group().
> > > > > > >
> > > > > > > nilfs_sysfs_delete_device_group() is called in destroy_nilfs()
> > > > > > > if nilfs->ns_flags has THE_NILFS_INIT flag -- nilfs_init() inline
> > > > > > > function tests this flag.
> > > > > > >
> > > > > > > The flag is set after init_nilfs() succeeded at the beginning of
> > > > > > > nilfs_fill_super() because the set_nilfs_init() inline in init_nilfs() sets it.
> > > > > > >
> > > > > > > So,  nilfs_sysfs_delete_group() seems to be called in case of
> > > > > > > the above failure.   Am I missing something?
> > > > > > >
> > > > > >
> > > > > > Yeah, that's what I mean :) I can't see how reported issue is possible
> > > > > > with current code.
> > > > > >
> > > > > >
> > > > > > Sorry for not being clear
> > > > >
> > > > > Understood, thanks for the reply.
> > > > >
> > > > > If so,  the case where nilfs_sysfs_create_device_group() itself failed,
> > > > > is suspicious as mentioned in the previous mail.   A possible scenario
> > > > > I guess is :
> > > > >
> > > > > - nilfs_sysfs_create_device_group() on the first mount try fails and leaks
> > > > >   due to lack of kobject_del() in the error path.
> > > > > - Then, nilfs_sysfs_create_device_group() on the next mount try hits
> > > > >   the leak detector at kobject_init_and_add().
> > > > >
> > > > > So, if the leak bug is reproducible, I'd like to ask Dongliang to
> > > > > test the effect of the first patch.
> > > >
> > > > If my local syzkaller instance gets a reproducer, I will try to do this.
> > > >
> > > > >
> > > > > Regards,
> > > > > Ryusuke Konishi
> > >
> > > Hi Ryusuke,
> > >
> > > The crash still occurred in my newly set up syzkaller instance. It
> > > appears after two days' fuzzing.
> > >
> > > I remember you suggested me to add kobject_del just for testing,
> > > right? And let's see if this crash still occurs any more.
> >
> > You need a few days to reproduce it ?
> > If so, I think this confirmation method is uncertain.
> > In that case, I will try inserting an artificial error by changing
> > nilfs_sysfs_create_device_group() a bit to confirm if the same crash occurs.

I changed nilfs_sysfs_create_device_group() so that it fails on every
other call.
Even with that change, the leak bug did not reproduce.
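
For illustration only, an "every other call fails" injection can be sketched in user-space C as below. This is not the actual change made to nilfs2; the counter, the helper, and fake_create_device_group() are hypothetical names, and -ENOMEM is an assumed error code.

```c
#include <errno.h>

/* Hypothetical call counter; not part of nilfs2. */
static unsigned int fake_create_calls;

/* Returns -ENOMEM on the 1st, 3rd, 5th, ... call and 0 on the others. */
static int should_fail_every_other_call(void)
{
	return (fake_create_calls++ % 2 == 0) ? -ENOMEM : 0;
}

/* Stand-in for a create function with an injected failure. */
static int fake_create_device_group(void)
{
	int err = should_fail_every_other_call();

	if (err)
		return err;	/* real code would unwind partially built state here */
	return 0;
}
```

Calling fake_create_device_group() repeatedly alternates between failure and success, which forces the error path on every second mount attempt instead of waiting for the fuzzer to hit it.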

In addition, the kobject debug messages showed that the device name
("loop2" in your case) was properly freed through kobject_put() even in
the error case.

So, my previous guess was wrong.
There seems to be another cause for the leak of the device name.
It may not be a nilfs2 issue; I don't know.

> I am reproducing another bug [1] recently. If you can spare some time
> figuring out the underlying issue, that's really great. Or we can wait
> some time for the bug to disclose more, after all, it is only a rare
> memory leak.
>
> [1] https://syzkaller.appspot.com/bug?extid=045796dbe294d53147e6

According to the log, it looks like "erofs_put_super() ->
erofs_unregister_sysfs()" hits:

  kobject: '(null)' (ffff88807b550190): is not initialized, yet kobject_put() is being called.

kobject_put() prints this warning if the kobj argument does not have
'state_initialized' set:

  void kobject_put(struct kobject *kobj)
  {
          if (kobj) {
                  if (!kobj->state_initialized)
                          WARN(1, KERN_WARNING
                               "kobject: '%s' (%p): is not initialized, yet kobject_put() is being called.\n",
                               kobject_name(kobj), kobj);
                  kref_put(&kobj->kref, kobject_release);
          }
  }
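
As a rough illustration of the condition, here is a user-space model of the check, assuming the object was zero-filled and never passed through an init function. That is only one possible way to reach the warning and is not confirmed for the erofs case. struct fake_kobject and the fake_* helpers are hypothetical stand-ins, not kernel APIs, and the refcount handling is simplified to a plain decrement.

```c
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for struct kobject; only the fields used here. */
struct fake_kobject {
	const char *name;
	int refcount;
	unsigned int state_initialized:1;
};

/* Counts how many times the "not initialized" warning fired. */
static int fake_warnings;

static void fake_kobject_init(struct fake_kobject *kobj, const char *name)
{
	kobj->name = name;
	kobj->refcount = 1;
	kobj->state_initialized = 1;
}

static void fake_kobject_put(struct fake_kobject *kobj)
{
	if (!kobj)
		return;
	if (!kobj->state_initialized) {
		fake_warnings++;
		fprintf(stderr,
			"kobject: '%s' (%p): is not initialized, yet kobject_put() is being called.\n",
			kobj->name ? kobj->name : "(null)", (void *)kobj);
	}
	kobj->refcount--;	/* the kernel uses kref_put() with a release callback */
}

/* Puts a zero-filled, never-initialized object; returns warnings raised. */
static int put_without_init(void)
{
	struct fake_kobject kobj;
	int before = fake_warnings;

	memset(&kobj, 0, sizeof(kobj));
	fake_kobject_put(&kobj);
	return fake_warnings - before;
}
```

In this model, put_without_init() raises exactly one warning with the name shown as '(null)', while an object that went through fake_kobject_init() can be put without any warning.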

How about chasing this abnormal condition?
In any case, please ask the erofs maintainers and the linux-erofs mailing
list about this.

Regards,
Ryusuke Konishi
