linux-kernel - Re: [BUG 3.12.rc4] Oops: unable to handle kernel paging request during shutdown

Message-ID: <CA+55aFyPffR=kt4njmB7oG012kayrKJPsP-u5JR+s7rERjofCg@mail.gmail.com>
Date:	Fri, 25 Oct 2013 10:02:22 +0100
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Knut Petersen <Knut_Petersen@...nline.de>,
	Ingo Molnar <mingo@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>,
	Frédéric Weisbecker <fweisbec@...il.com>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:	Greg KH <greg@...ah.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	cpufreq@...r.kernel.org
Subject: Re: [BUG 3.12.rc4] Oops: unable to handle kernel paging request
 during shutdown

Adding more people, so quoting the whole email for them.

We definitely have some module unload issues. Guys, try the following
a few times to unload modules:

    lsmod | grep ' 0 '| cut -d' ' -f1 | xargs sudo rmmod

(a few times because unloading one module will then potentially make
other modules unloadable).
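
A minimal sketch of that repeated pass, not part of the original message. It selects modules by the use-count column of lsmod with awk instead of grepping for ' 0 ', and it stops when nothing unused is left or after a fixed number of passes. The function names `unused_modules` and `unload_unused_modules` and the default pass limit are assumptions made for illustration.

```shell
# Print the names of modules whose use count (third lsmod column) is 0.
# Reads lsmod-formatted text on stdin; the first line is the header.
unused_modules() {
    awk 'NR > 1 && $3 == 0 { print $1 }'
}

# Repeat the unload pass, since removing one module can drop the use
# count of others to 0. The pass limit (hypothetical default: 5) keeps
# the loop from spinning on modules that refuse to unload.
unload_unused_modules() {
    max_passes=${1:-5}
    pass=0
    while [ "$pass" -lt "$max_passes" ]; do
        mods=$(lsmod | unused_modules)
        [ -n "$mods" ] || break
        # One rmmod per module so a single failure does not stop the pass.
        echo "$mods" | xargs -n 1 sudo rmmod || true
        pass=$((pass + 1))
    done
}
```

Calling `unload_unused_modules 3` would run at most three passes. Only do this on a test machine, since the point of the exercise is to provoke unload bugs.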

On my machine, I can trigger this, for example:

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 3217 at fs/sysfs/file.c:498 sysfs_attr_ns+0x91/0xa0()
  sysfs: kobject (null) without dirent
  Modules linked in: fuse nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_$
  CPU: 0 PID: 3217 Comm: rmmod Not tainted 3.12.0-rc6-00284-ge6036c0b8896 #19
  Hardware name: Sony Corporation SVP11213CXB/VAIO, BIOS R0270V7 05/17/2013
   0000000000000009 ffff8800aca35df8 ffffffff8160aab5 ffff8800aca35e40
   ffff8800aca35e30 ffffffff810514b8 ffffffffa013f080 ffff8801194a6040
   0000000000000800 0000000000000000 0000000000c5b3e0 ffff8800aca35e90
  Call Trace:
   [<ffffffff8160aab5>] dump_stack+0x45/0x56
   [<ffffffff810514b8>] warn_slowpath_common+0x78/0xa0
   [<ffffffff81051527>] warn_slowpath_fmt+0x47/0x50
   [<ffffffff810b5960>] ? module_refcount+0xb0/0xb0
   [<ffffffff811e5c61>] sysfs_attr_ns+0x91/0xa0
   [<ffffffff811e5d2a>] sysfs_remove_file+0x1a/0x50
   [<ffffffff814c88a3>] cpufreq_sysfs_remove_file+0x13/0x30
   [<ffffffffa013d350>] acpi_cpufreq_exit+0x2e/0xcde [acpi_cpufreq]
   [<ffffffff810b7d1d>] SyS_delete_module+0x15d/0x2c0
   [<ffffffff81002929>] ? do_notify_resume+0x59/0x90
   [<ffffffff81618f62>] system_call_fastpath+0x16/0x1b
  ---[ end trace f887112caaa5c4ab ]---

so at least we have a cpufreq/sysfs interaction bug. There may be others.

This particular cpufreq issue may be triggered by the fact that
acpi-cpufreq isn't actually in use (pstate is). Or it might be some
generic cpufreq/sysfs bug. Rafael, Greg, ideas?
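
For anyone trying to reproduce this, a small sketch for checking which cpufreq driver is actually active, assuming the standard per-CPU `scaling_driver` sysfs attribute is present. The function name `active_cpufreq_drivers` and its optional root-directory argument are hypothetical and exist only to make the example easy to try against a scratch directory.

```shell
# Print the distinct cpufreq drivers reported by the per-CPU
# scaling_driver attribute. The optional argument overrides the
# sysfs CPU root (default: /sys/devices/system/cpu).
active_cpufreq_drivers() {
    cpu_root=${1:-/sys/devices/system/cpu}
    # Prints nothing if no CPU exposes a cpufreq directory.
    cat "$cpu_root"/cpu[0-9]*/cpufreq/scaling_driver 2>/dev/null | sort -u
}
```

If this prints something other than the acpi-cpufreq driver while the acpi_cpufreq module is loaded, the setup matches the "loaded but not in use" case described above.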

I don't see that this particular one would be the one that causes the
timer issues, but it's an example of the fact that module unload tends
to be special and not necessarily well tested.

                   Linus

On Fri, Oct 25, 2013 at 9:38 AM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> Hmm.. I just got a run_timer_softirq oops on my own laptop, slightly
> different. That was not during shutdown, although there was a "yum
> upgrade" finishing when that happened, so it's quite likely that there
> was a service shutdown (and then restart).
>
> I think it's related. But my oops has almost no information: the IP
> that was jumped to was bogus, and the callchain is just CPU idle
> followed by the softirq -> run_timer_softirq handling, so there's no
> real way to see *what* triggered it.
>
> The bad rip was ffffffffa051e250, which is not a valid code address.
> It *might* be a module address, though. So this might be triggered by
> rmmod on some module that doesn't remove all its timers...
>
> Ideas?
>
>                  Linus