lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAA3_Gnqo37RxLi2McF0=oRPZSw_P3Kya_3m3JBA2s6c0vaf5sw@mail.gmail.com>
Date: Thu, 15 Jan 2026 19:33:36 +0900
From: 戸田晃太 <kota.toda@...-cybersecurity.com>
To: Eric Dumazet <edumazet@...gle.com>, pabeni@...hat.com, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Cc: 小池悠生 <yuki.koike@...-cybersecurity.com>, 
	戸田晃太 <kota.toda@...-cybersecurity.com>
Subject: Re: [PATCH net] bonding: Fix header_ops type confusion

Hello, Eric and other maintainers,

I hope you’re doing well. I’m following up on our email, sent during
the holiday season, in case it got buried.

When you have a moment, could you please let us know if you had a
chance to review it?

Thank you in advance, and I look forward to your response.

Best regards,
Kota Toda

2025年12月22日(月) 17:20 小池悠生 <yuki.koike@...-cybersecurity.com>:
>
> Hello, Eric and other maintainers,
>
> I'm deeply sorry to have left the patch suggestion for this long period.
> I became extremely busy, and that took its toll on my health, causing
> me to take sick leave for nearly half a year (And my colleague Kota
> had been waiting for me to come back).
> As fortunately I've recovered and returned to work, I hope to move
> forward with this matter as well.
>
> Recalling the issue Eric raised, I understand it was a concern about
> potential race conditions arising from the `bond_header_ops` and
> `header_slave_dev` added to the `struct bonding`. For example, one
> could imagine a situation where `header_slave_dev` is rewritten to a
> different type, and at that exact moment a function from the old
> `bond_header_ops` gets called, or vice versa.
>
> However, I am actually skeptical that this can happen. The reason is
> that `bond_setup_by_slave` is only called when there are no slaves at
> all:
>
> ```
> bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
>              struct netlink_ext_ack *extack)
> ...
>   if (!bond_has_slaves(bond)) {
> ...
>       if (slave_dev->type != ARPHRD_ETHER)
>         bond_setup_by_slave(bond_dev, slave_dev);
> ```
>
> In other words, in order to trigger a race condition, one would need
> to remove the slave once and make the slave list empty first. However,
> as shown below, in `bond_release_and_destroy`, when the slave list
> becomes empty, it appears that the bond interface itself is removed.
> This makes it seem impossible to "quickly remove a slave and
> re-register it.":
>
> ```
> static int bond_slave_netdev_event(unsigned long event,
>            struct net_device *slave_dev)
> ...
>   switch (event) {
>   case NETDEV_UNREGISTER:
>     if (bond_dev->type != ARPHRD_ETHER)
>       bond_release_and_destroy(bond_dev, slave_dev);
> ...
> }
> ...
> /* First release a slave and then destroy the bond if no more slaves are left.
>  * Must be under rtnl_lock when this function is called.
>  */
> static int bond_release_and_destroy(struct net_device *bond_dev,
>             struct net_device *slave_dev)
> {
>   struct bonding *bond = netdev_priv(bond_dev);
>   int ret;
>
>   ret = __bond_release_one(bond_dev, slave_dev, false, true);
>   if (ret == 0 && !bond_has_slaves(bond) &&
>       bond_dev->reg_state != NETREG_UNREGISTERING) {
>     bond_dev->priv_flags |= IFF_DISABLE_NETPOLL;
>     netdev_info(bond_dev, "Destroying bond\n");
>     bond_remove_proc_entry(bond);
>     unregister_netdevice(bond_dev);
>   }
>   return ret;
> }
> ```
>
> Moreover, as noted in the comments, these functions are executed under
> the netlink-side lock. Therefore, my conclusion is that a race
> condition cannot actually occur. I also think that the fact that, even
> before our patch, these code paths had almost no explicit locking
> anywhere serves as circumstantial evidence for this view. As Kota
> said, as far as I saw, the past syzkaller-bot's report is seemingly
> only NULL pointer dereference due to the root cause we reported, and
> this patch should fix them.
>
> That said, even so, I agree that the kind of countermeasures Eric
> suggests are worth applying if they do not cause problems in terms of
> execution speed or code size. However, I am concerned that addressing
> this with READ_ONCE or RCU would imply a somewhat large amount of
> rewriting.
> `header_ops` is designed to allow various types of devices to be
> handled in an object-oriented way, and as such it is used throughout
> many parts of the Linux kernel. Using READ_ONCE or RCU every time
> header_ops is accessed simply because we are worried about a race
> condition in bond’s header_ops seems to imply changes like the
> following, for example:
>
> ```
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 92dc1f1788de..d9aad38746ad 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -1538,7 +1538,7 @@ static void neigh_hh_init(struct neighbour *n)
>   * hh_cache entry.
>   */
>   if (!hh->hh_len)
> - dev->header_ops->cache(n, hh, prot);
> + READ_ONCE(dev->header_ops->cache)(n, hh, prot);
>
>   write_unlock_bh(&n->lock);
>  }
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3131,35 +3131,41 @@ static inline int dev_hard_header(struct
> sk_buff *skb, struct net_device *dev,
>    const void *daddr, const void *saddr,
>    unsigned int len)
>  {
> - if (!dev->header_ops || !dev->header_ops->create)
> + int (*create)(struct sk_buff *skb, struct net_device *dev,
> +      unsigned short type, const void *daddr,
> +      const void *saddr, unsigned int len);
> + if (!dev->header_ops || !(create = READ_ONCE(dev->header_ops->create)))
>   return 0;
>
> - return dev->header_ops->create(skb, dev, type, daddr, saddr, len);
> + return create(skb, dev, type, daddr, saddr, len);
>  }
>
>  static inline int dev_parse_header(const struct sk_buff *skb,
>     unsigned char *haddr)
>  {
> + int (*parse)(const struct sk_buff *skb, unsigned char *haddr);
>   const struct net_device *dev = skb->dev;
>
> - if (!dev->header_ops || !dev->header_ops->parse)
> + if (!dev->header_ops || !(parse = READ_ONCE(dev->header_ops->parse)))
>   return 0;
> - return dev->header_ops->parse(skb, haddr);
> + return parse(skb, haddr);
>  }
> ... (and so on)
> ```
>
> It looks like we would end up rewriting on the order of a dozen or so
> places with this kind of pattern, but from the perspective of the
> maintainers (or in terms of Linux kernel culture), would this be
> considered an acceptable change?
> If this differs from what you intended, please correct me.
>
> Best regards,
> Yuki Koike
>
> 2025年5月29日(木) 0:10 Eric Dumazet <edumazet@...gle.com>:
> >
> > On Wed, May 28, 2025 at 7:36 AM 戸田晃太 <kota.toda@...-cybersecurity.com> wrote:
> > >
> > > Thank you for your review.
> > >
> > > 2025年5月26日(月) 17:23 Eric Dumazet <edumazet@...gle.com>:
> > > >
> > > > On Sun, May 25, 2025 at 10:08 PM 戸田晃太 <kota.toda@...-cybersecurity.com> wrote:
> > > > >
> > > > > In bond_setup_by_slave(), the slave’s header_ops are unconditionally
> > > > > copied into the bonding device. As a result, the bonding device may invoke
> > > > > the slave-specific header operations on itself, causing
> > > > > netdev_priv(bond_dev) (a struct bonding) to be incorrectly interpreted
> > > > > as the slave's private-data type.
> > > > >
> > > > > This type-confusion bug can lead to out-of-bounds writes into the skb,
> > > > > resulting in memory corruption.
> > > > >
> > > > > This patch adds two members to struct bonding, bond_header_ops and
> > > > > header_slave_dev, to avoid type-confusion while keeping track of the
> > > > > slave's header_ops.
> > > > >
> > > > > Fixes: 1284cd3a2b740 (bonding: two small fixes for IPoIB support)
> > > > > Signed-off-by: Kota Toda <kota.toda@...-cybersecurity.com>
> > > > > Signed-off-by: Yuki Koike <yuki.koike@...-cybersecurity.com>
> > > > > Co-Developed-by: Yuki Koike <yuki.koike@...-cybersecurity.com>
> > > > > Reviewed-by: Paolo Abeni <pabeni@...hat.com>
> > > > > Reported-by: Kota Toda <kota.toda@...-cybersecurity.com>
> > > > > ---
> > > > >  drivers/net/bonding/bond_main.c | 61
> > > > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> > > > >  include/net/bonding.h           |  5 +++++
> > > > >  2 files changed, 65 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> > > > > index 8ea183da8d53..690f3e0971d0 100644
> > > > > --- a/drivers/net/bonding/bond_main.c
> > > > > +++ b/drivers/net/bonding/bond_main.c
> > > > > @@ -1619,14 +1619,65 @@ static void bond_compute_features(struct bonding *bond)
> > > > >      netdev_change_features(bond_dev);
> > > > >  }
> > > > >
> > > > > +static int bond_hard_header(struct sk_buff *skb, struct net_device *dev,
> > > > > +        unsigned short type, const void *daddr,
> > > > > +        const void *saddr, unsigned int len)
> > > > > +{
> > > > > +    struct bonding *bond = netdev_priv(dev);
> > > > > +    struct net_device *slave_dev;
> > > > > +
> > > > > +    slave_dev = bond->header_slave_dev;
> > > > > +
> > > > > +    return dev_hard_header(skb, slave_dev, type, daddr, saddr, len);
> > > > > +}
> > > > > +
> > > > > +static void bond_header_cache_update(struct hh_cache *hh, const
> > > > > struct net_device *dev,
> > > > > +        const unsigned char *haddr)
> > > > > +{
> > > > > +    const struct bonding *bond = netdev_priv(dev);
> > > > > +    struct net_device *slave_dev;
> > > > > +
> > > > > +    slave_dev = bond->header_slave_dev;
> > > >
> > > > I do not see any barrier ?
> > > >
> > > > > +
> > > > > +    if (!slave_dev->header_ops || !slave_dev->header_ops->cache_update)
> > > > > +        return;
> > > > > +
> > > > > +    slave_dev->header_ops->cache_update(hh, slave_dev, haddr);
> > > > > +}
> > > > > +
> > > > >  static void bond_setup_by_slave(struct net_device *bond_dev,
> > > > >                  struct net_device *slave_dev)
> > > > >  {
> > > > > +    struct bonding *bond = netdev_priv(bond_dev);
> > > > >      bool was_up = !!(bond_dev->flags & IFF_UP);
> > > > >
> > > > >      dev_close(bond_dev);
> > > > >
> > > > > -    bond_dev->header_ops        = slave_dev->header_ops;
> > > > > +    /* Some functions are given dev as an argument
> > > > > +     * while others not. When dev is not given, we cannot
> > > > > +     * find out what is the slave device through struct bonding
> > > > > +     * (the private data of bond_dev). Therefore, we need a raw
> > > > > +     * header_ops variable instead of its pointer to const header_ops
> > > > > +     * and assign slave's functions directly.
> > > > > +     * For the other case, we set the wrapper functions that pass
> > > > > +     * slave_dev to the wrapped functions.
> > > > > +     */
> > > > > +    bond->bond_header_ops.create = bond_hard_header;
> > > > > +    bond->bond_header_ops.cache_update = bond_header_cache_update;
> > > > > +    if (slave_dev->header_ops) {
> > > > > +        bond->bond_header_ops.parse = slave_dev->header_ops->parse;
> > > > > +        bond->bond_header_ops.cache = slave_dev->header_ops->cache;
> > > > > +        bond->bond_header_ops.validate = slave_dev->header_ops->validate;
> > > > > +        bond->bond_header_ops.parse_protocol =
> > > > > slave_dev->header_ops->parse_protocol;
> > > >
> > > > All these updates probably need WRITE_ONCE(), and corresponding
> > > > READ_ONCE() on reader sides, at a very minimum ...
> > > >
> > > > RCU would even be better later.
> > > >
> > > I believe that locking is not necessary in this patch. The update of
> > > `header_ops` only happens when a slave is newly enslaved to a bond.
> > > Under such circumstances, members of `header_ops` are not called in
> > > parallel with updating. Therefore, there is no possibility of race
> > > conditions occurring.
> >
> > bond_dev can certainly be live, and packets can flow.
> >
> > I have seen enough syzbot reports hinting at this precise issue.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ