Message-ID: <CANn89i+u5dXdYm_0_LwhXg5Nw+gHXx+nPUmbYhvT=k9P4+9JRQ@mail.gmail.com>
Date: Sun, 8 Oct 2023 09:18:17 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Yajun Deng <yajun.deng@...ux.dev>
Cc: davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com, 
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Alexander Lobakin <aleksander.lobakin@...el.com>
Subject: Re: [PATCH net-next v7] net/core: Introduce netdev_core_stats_inc()

On Sun, Oct 8, 2023 at 9:00 AM Yajun Deng <yajun.deng@...ux.dev> wrote:
>
>
> On 2023/10/8 14:45, Eric Dumazet wrote:
> > On Sat, Oct 7, 2023 at 8:34 AM Yajun Deng <yajun.deng@...ux.dev> wrote:
> >>
> >> On 2023/10/7 13:29, Eric Dumazet wrote:
> >>> On Sat, Oct 7, 2023 at 7:06 AM Yajun Deng <yajun.deng@...ux.dev> wrote:
> >>>> Although there is a kfree_skb_reason() helper function that can be used to
> >>>> find the reason why this skb is dropped, most callers don't increment
> >>>> any of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.
> >>>>
> >>> ...
> >>>
> >>>> +
> >>>> +void netdev_core_stats_inc(struct net_device *dev, u32 offset)
> >>>> +{
> >>>> +       /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> >>>> +       struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> >>>> +       unsigned long *field;
> >>>> +
> >>>> +       if (unlikely(!p))
> >>>> +               p = netdev_core_stats_alloc(dev);
> >>>> +
> >>>> +       if (p) {
> >>>> +               field = (unsigned long *)((void *)this_cpu_ptr(p) + offset);
> >>>> +               WRITE_ONCE(*field, READ_ONCE(*field) + 1);
> >>> This is broken...
> >>>
> >>> As I explained earlier, dev_core_stats_xxxx(dev) can be called from
> >>> many different contexts:
> >>>
> >>> 1) process contexts, where preemption and migration are allowed.
> >>> 2) interrupt contexts.
> >>>
> >>> Adding WRITE_ONCE()/READ_ONCE() is not solving potential races.
> >>>
> >>> I _think_ I already gave you how to deal with this ?
> >>
> >> Yes, I replied in v6.
> >>
> >> https://lore.kernel.org/all/e25b5f3c-bd97-56f0-de86-b93a3172870d@linux.dev/
> >>
> >>> Please try instead:
> >>>
> >>> +void netdev_core_stats_inc(struct net_device *dev, u32 offset)
> >>> +{
> >>> +       /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> >>> +       struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> >>> +       unsigned long __percpu *field;
> >>> +
> >>> +       if (unlikely(!p)) {
> >>> +               p = netdev_core_stats_alloc(dev);
> >>> +               if (!p)
> >>> +                       return;
> >>> +       }
> >>> +       field = (__force unsigned long __percpu *)((__force void *)p + offset);
> >>> +       this_cpu_inc(*field);
> >>> +}
> >>
> >> This wouldn't trace anything even though rx_dropped is increasing. It
> >> needs an extra operation, such as:
> > I honestly do not know what you are talking about.
> >
> > Have you even tried to change your patch to use
> >
> > field = (__force unsigned long __percpu *)((__force void *)p + offset);
> > this_cpu_inc(*field);
>
>
> Yes, I tested this code. But the following doesn't show anything even
> though rx_dropped is increasing.
>
> 'sudo python3 /usr/share/bcc/tools/trace netdev_core_stats_inc'

Well, I am not sure about this, "bpftrace" worked for me.

Make sure your toolchain generates something that looks like what I got:

000000000000ef20 <netdev_core_stats_inc>:
    ef20: f3 0f 1e fa             endbr64
    ef24: e8 00 00 00 00          call   ef29 <netdev_core_stats_inc+0x9>
                ef25: R_X86_64_PLT32 __fentry__-0x4
    ef29: 55                      push   %rbp
    ef2a: 48 89 e5                mov    %rsp,%rbp
    ef2d: 53                      push   %rbx
    ef2e: 89 f3                   mov    %esi,%ebx
    ef30: 48 8b 87 f0 01 00 00    mov    0x1f0(%rdi),%rax
    ef37: 48 85 c0                test   %rax,%rax
    ef3a: 74 0b                   je     ef47 <netdev_core_stats_inc+0x27>
    ef3c: 89 d9                   mov    %ebx,%ecx
    ef3e: 65 48 ff 04 08          incq   %gs:(%rax,%rcx,1)
    ef43: 5b                      pop    %rbx
    ef44: 5d                      pop    %rbp
    ef45: c3                      ret
    ef46: cc                      int3
    ef47: e8 00 00 00 00          call   ef4c <netdev_core_stats_inc+0x2c>
                ef48: R_X86_64_PLT32 .text.unlikely.+0x13c
    ef4c: 48 85 c0                test   %rax,%rax
    ef4f: 75 eb                   jne    ef3c <netdev_core_stats_inc+0x1c>
    ef51: eb f0                   jmp    ef43 <netdev_core_stats_inc+0x23>
    ef53: 66 66 66 66 2e 0f 1f    data16 data16 data16 cs nopw 0x0(%rax,%rax,1)
    ef5a: 84 00 00 00 00 00
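
The race discussed in the quoted review (a split load/store increment
being interrupted, versus a single incq as in the listing above) can be
modeled in plain userspace C. This is only a sketch under stated
assumptions: struct core_stats_model, stats_field(), model_irq() and the
two increment helpers are hypothetical names that do not exist in the
kernel, there is no real per-CPU data here, and the "interrupt" is an
ordinary function call injected at a fixed point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct net_device_core_stats. */
struct core_stats_model {
	unsigned long rx_dropped;
	unsigned long tx_dropped;
	unsigned long rx_nohandler;
	unsigned long rx_otherhost_dropped;
};

/* Resolve a counter from a byte offset, as the patch does with "offset". */
static unsigned long *stats_field(struct core_stats_model *s, size_t offset)
{
	return (unsigned long *)((char *)s + offset);
}

/* Models an interrupt handler that bumps the same counter. */
static void model_irq(struct core_stats_model *s, size_t offset)
{
	*stats_field(s, offset) += 1;
}

/*
 * Split read-modify-write, in the spirit of
 * WRITE_ONCE(*field, READ_ONCE(*field) + 1).
 * The modeled interrupt lands between the load and the store.
 */
static void split_inc_interrupted(struct core_stats_model *s, size_t offset)
{
	unsigned long *field = stats_field(s, offset);
	unsigned long old = *field;	/* load */

	model_irq(s, offset);		/* interrupt fires here */
	*field = old + 1;		/* store discards the interrupt's update */
}

/*
 * Models a single-instruction increment: the interrupt can only run
 * before or after it, never in the middle.
 */
static void single_inc_then_irq(struct core_stats_model *s, size_t offset)
{
	*stats_field(s, offset) += 1;
	model_irq(s, offset);
}

/* Two logical increments each; only the single-step variant keeps both. */
static void check_model(void)
{
	size_t off = offsetof(struct core_stats_model, rx_dropped);
	struct core_stats_model a = { 0 };
	struct core_stats_model b = { 0 };

	split_inc_interrupted(&a, off);
	single_inc_then_irq(&b, off);

	assert(a.rx_dropped == 1);	/* one update was lost */
	assert(b.rx_dropped == 2);	/* both updates survived */
}
```

Calling check_model() from a small test program exercises both paths.
The split variant ends with rx_dropped == 1 after two logical
increments, which is the lost update that READ_ONCE()/WRITE_ONCE() alone
cannot prevent.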
