Date:   Wed, 6 May 2020 07:33:41 -0700
From:   Eric Dumazet
To:     SeongJae Park
Cc:     "Paul E. McKenney",
        Eric Dumazet,
        David Miller,
        Al Viro,
        Jakub Kicinski,
        Greg Kroah-Hartman, netdev,
        LKML,
        SeongJae Park
Subject: Re: Re: Re: Re: Re: [PATCH net v2 0/2] Revert the 'socket_alloc' life
 cycle change

On Wed, May 6, 2020 at 5:59 AM SeongJae Park wrote:
> TL;DR: It was not the kernel's fault, but the benchmark program's.
>
> So, the problem is reproducible only with lebench[1].  I carefully read
> its code again.
>
> Before running the "poll big" sub test in which the problem occurred,
> lebench executes the "context switch" sub test.  For that test, it sets
> its own CPU affinity[2] and process priority[3] to '0' and '-20',
> respectively.  However, it does not restore the original values after the
> "context switch" sub test has finished.  For that reason, the "select big"
> sub test also runs bound to CPU 0 with the lowest nice value (that is, the
> highest priority).  Therefore, it can disturb the RCU callback thread for
> CPU 0, which processes the deferred deallocations of the sockets, and as a
> result it triggers the OOM.
>
> We confirmed that the problem disappears when the RCU callbacks are
> offloaded from CPU 0 using the rcu_nocbs=0 boot parameter, or when the
> affinity and/or priority is simply restored.
>
> Someone _might_ still argue that this is a kernel problem, because the
> problem did not occur on old kernels prior to Al's patches.  However,
> setting the affinity and priority was possible only because the program
> had been given the permission to do so.  Therefore, it would be reasonable
> to blame the system administrators rather than the kernel.
>
> So, please ignore this patchset, and I apologize for the confusion.  If
> you still have any doubts or need more tests, please let me know.
>
> [1]
> [2]
> [3]
> Thanks,
> SeongJae Park

No harm done, thanks for running more tests and root-causing the issue!
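
For illustration only, the save-and-restore pattern the benchmark was missing can be sketched as below. This is not lebench code (lebench is written in C); it is a Python sketch using the same Linux scheduling calls, and the helper name `pinned` is hypothetical. It assumes a Linux host. Lowering the nice value to -20 normally requires root or CAP_SYS_NICE, so the usage example leaves the priority alone.

```python
import os
from contextlib import contextmanager


@contextmanager
def pinned(cpu, nice=None):
    """Temporarily pin the calling process to `cpu` (hypothetical helper).

    Optionally changes the nice value too. Both settings are restored on
    exit, so later sub tests do not inherit them.
    """
    # Remember the settings in effect before the sub test starts.
    saved_affinity = os.sched_getaffinity(0)
    saved_nice = os.getpriority(os.PRIO_PROCESS, 0)
    try:
        os.sched_setaffinity(0, {cpu})
        if nice is not None:
            os.setpriority(os.PRIO_PROCESS, 0, nice)
        yield
    finally:
        # Restore the original values even if the sub test raised.
        os.sched_setaffinity(0, saved_affinity)
        os.setpriority(os.PRIO_PROCESS, 0, saved_nice)


if __name__ == "__main__":
    cpu = min(os.sched_getaffinity(0))  # any CPU we are allowed to run on
    with pinned(cpu):
        pass  # a "context switch"-style measurement would run here
    # Code after the block runs with the original affinity and priority.
    print(sorted(os.sched_getaffinity(0)))
```

With this shape, a later sub test that creates many sockets would no longer be confined to one CPU at the highest priority, which is the condition described above as disturbing the RCU callback thread.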
