netdev - Re: [BUG bisect] Missing Micrel driver on VF50 (net: phy: check return code when requesting PHY driver module)

Message-ID: <CAMuHMdUz6J2cG-NCYPTRyS=1QZHGBWUFZPqj4ShEdioQLqu9qw@mail.gmail.com>
Date:   Tue, 22 Jan 2019 14:51:58 +0100
From:   Geert Uytterhoeven <geert@...ux-m68k.org>
To:     Heiner Kallweit <hkallweit1@...il.com>
Cc:     Krzysztof Kozlowski <krzk@...nel.org>,
        Andrew Lunn <andrew@...n.ch>,
        "David S. Miller" <davem@...emloft.net>,
        Florian Fainelli <f.fainelli@...il.com>,
        netdev <netdev@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Stefan Agner <stefan@...er.ch>,
        Linux-Renesas <linux-renesas-soc@...r.kernel.org>
Subject: Re: [BUG bisect] Missing Micrel driver on VF50 (net: phy: check
 return code when requesting PHY driver module)

Hi Heiner,

On Fri, Jan 18, 2019 at 9:58 PM Heiner Kallweit <hkallweit1@...il.com> wrote:
> On 18.01.2019 09:48, Krzysztof Kozlowski wrote:
> > On Fri, 18 Jan 2019 at 09:39, Krzysztof Kozlowski <krzk@...nel.org> wrote:
> >> On today's next (next-20190118) my Colibri VF50 board fails to boot up
> >> from network (DHCP, NFSv4 root). Looks like missing network adapter.
> >> Expected:
> >> [ 3.041773] Micrel KSZ8041 400d1000.ethernet-1:00: attached PHY driver
> >> [Micrel KSZ8041] (mii_bus:phy_addr=400d1000.ethernet-1:00, irq=POLL)
> >>
> >> Result:
> >> [ 15.614964] Root-NFS: no NFS server address
> >> [ 15.619353] VFS: Unable to mount root fs via NFS, trying floppy.
> >> [ 15.626762] VFS: Cannot open root device "nfs" or unknown-block(2,0): error -6
> >> [ 15.634252] Please append a correct "root=" boot option; here are the
> >> available partitions:
> >> [ 15.642791] 0100 16384 ram0
> >> [ 15.642804] (driver?)
> >> [ 15.649076] Kernel panic - not syncing: VFS: Unable to mount root fs
> >> on unknown-block(2,0)
> >> [ 15.657423] ---[ end Kernel panic - not syncing: VFS: Unable to mount
> >> root fs on unknown-block(2,0) ]---
> >>
> >> Bisect pointed to:
> >>     net: phy: check return code when requesting PHY driver module
> >
> > I see now in the logs:
> > [ 2.436822] mdio_bus 400d1000.ethernet-1:00: error -2 loading PHY
> > driver module for ID 0x00221513
> > which is kind of misleading. There is no initramfs so there is no
> > usermod library at this point. It is not needed. This seems to break
> > all DHCP/NFS root boots without initrd/initramfs.
> >
> Thanks for the report. Could you please provide your kernel config
> and the syslog of a boot before or w/o the patch in question?
>
> If you boot via nfs then I'd expect that the PHY driver is built-in and
> not a module. Therefore it's not fully clear to me yet why
> request_module() returns -ENOENT.

I'm seeing the same failure when booting with nfsroot on several Renesas boards.
E.g. on r8a7791/koelsch:

    mdio_bus ee700000.ethernet-ffffffff:01: error -2 loading PHY driver module for ID 0x00221537
    sh-eth ee700000.ethernet: MDIO init failed: -2

This failure happens only when CONFIG_MODULES=y.
Reverting commit 13d0ab6750b20957 ("net: phy: check return code when
requesting PHY driver module") fixes the issue.

phy_request_driver_module() tries to load module
"mdio:00000000001000100001010100110111", which fails.
When CONFIG_MODULES=n, the error is ignored, and everything works fine.
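
The module name above is the 32-bit PHY ID written out bit by bit after the
"mdio:" prefix. A minimal userspace sketch of that mapping follows; the helper
name phy_id_to_mdio_alias() and the MDIO_ALIAS_LEN macro are made up for
illustration and are not kernel API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "mdio:" (5 chars) + 32 binary digits + terminating NUL.
 * Hypothetical macro, not taken from the kernel headers. */
#define MDIO_ALIAS_LEN (5 + 32 + 1)

/* Hypothetical helper: write "mdio:" followed by the 32 bits of phy_id,
 * most significant bit first, into buf. */
static void phy_id_to_mdio_alias(uint32_t phy_id, char *buf, size_t len)
{
	size_t pos = 5;
	int bit;

	assert(len >= MDIO_ALIAS_LEN);
	memcpy(buf, "mdio:", 5);
	for (bit = 31; bit >= 0; bit--)
		buf[pos++] = ((phy_id >> bit) & 1) ? '1' : '0';
	buf[pos] = '\0';
}
```

Calling phy_id_to_mdio_alias(0x00221537, buf, sizeof(buf)) with a buffer of
MDIO_ALIAS_LEN bytes yields the module name quoted above.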

0b00000000001000100001010100110111 == 0x00221537 == PHY_ID_KSZ8041RNLI,
which is served by drivers/net/phy/micrel.c.
Interestingly, CONFIG_MICREL_PHY=y, so I'm wondering why the PHY subsystem
tries to load a module for a driver which is already present in the first
place?

Oh, the following comment tries to explain:

        /* Request the appropriate module unconditionally; don't
         * bother trying to do so only if it isn't already loaded,
         * because that gets complicated. A hotplug event would have
         * done an unconditional modprobe anyway.

Hence request_module() failures are normal.

       ret = request_module(MDIO_MODULE_PREFIX MDIO_ID_FMT,
                            MDIO_ID_ARGS(phy_id));
       /* we only check for failures in executing the usermode binary,
        * not whether a PHY driver module exists for the PHY ID
        */
       if (IS_ENABLED(CONFIG_MODULES) && ret < 0) {
               phydev_err(dev, "error %d loading PHY driver module for ID 0x%08x\n",
                          ret, phy_id);
               return ret;
       }

However:

    /**
     * __request_module - try to load a kernel module
     * @wait: wait (or not) for the operation to complete
     * @fmt: printf style format string for the name of the module
     * @...: arguments as specified in the format string
     *
     * Load a module using the user mode module loader. The function returns
     * zero on success or a negative errno code or positive exit code from
     * "modprobe" on failure.

So perhaps the check should be for "ret > 0"?
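
For reference, the three cases named in that kernel-doc comment can be told
apart as sketched below. This is a userspace illustration only; the enum and
function names are hypothetical, and it is not a proposed patch:

```c
/* Hypothetical names for illustration; none of this is kernel API. */
enum modreq_outcome {
	MODREQ_OK,             /* ret == 0: success */
	MODREQ_NEGATIVE_ERRNO, /* ret <  0: negative errno code */
	MODREQ_MODPROBE_EXIT,  /* ret >  0: positive exit code from "modprobe" */
};

/* Classify a request_module()-style return value according to the
 * kernel-doc text quoted above. */
static enum modreq_outcome classify_request_module_ret(int ret)
{
	if (ret < 0)
		return MODREQ_NEGATIVE_ERRNO;
	if (ret > 0)
		return MODREQ_MODPROBE_EXIT;
	return MODREQ_OK;
}
```

The "error -2" in the logs above falls into the negative-errno case, which is
the one the current "ret < 0" check treats as fatal.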

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds