Message-ID: <4744966C.900@jlab.org>
Date: Wed, 21 Nov 2007 15:34:52 -0500
From: Jie Chen <chen@...b.org>
To: linux-kernel@...r.kernel.org
CC: Jie Chen <chen@...b.org>
Subject: Possible bug from kernel 2.6.22 and above
Hi, there:
We have a simple pthread program that measures the synchronization
overhead of various synchronization mechanisms such as spin locks,
barriers (the barrier is implemented using a queue-based barrier
algorithm) and so on. We have a cluster of dual quad-core AMD Opteron
(Barcelona) machines running the 2.6.23.8 kernel at the moment, using
the Fedora Core 7 distribution. Before we moved to this kernel, we had
kernel 2.6.21. The two kernels are configured identically and compiled
with the same gcc 4.1.2 compiler. Under the old kernel, we observed
that the synchronization overhead increases as the number of threads
increases from 2 to 8. The following are the values of total time and
overhead for all threads acquiring a pthread spin lock and for all
threads executing a barrier synchronization call.
Kernel 2.6.21

Number of Threads               2           4           6           8
SpinLock  Time (microsec)   10.5618     10.58538    10.5915     10.643
          Overhead           0.073       0.05746     0.102805    0.154563
Barrier   Time (microsec)   11.020410   11.678125   11.9889     12.38002
          Overhead           0.531660    1.1502      1.500112    1.891617
Each thread is bound to a particular core using pthread_setaffinity_np.
Kernel 2.6.23.8

Number of Threads               2           4           6           8
SpinLock  Time (microsec)   14.849915   17.117603   14.4496     10.5990
          Overhead           4.345417    6.617207    3.949435    0.110985
Barrier   Time (microsec)   19.462255   20.285117   16.19395    12.37662
          Overhead           8.957755    9.784722    5.699590    1.869518
It is clear that the synchronization overhead increases as the number
of threads increases under kernel 2.6.21, but it actually decreases as
the number of threads increases under kernel 2.6.23.8 (we observed the
same behavior on kernel 2.6.22 as well). This certainly does not look
like correct behavior. The kernels are configured with CONFIG_SMP,
CONFIG_NUMA, CONFIG_SCHED_MC, CONFIG_PREEMPT_NONE and
CONFIG_DISCONTIGMEM set. The complete kernel configuration file is
attached to this e-mail.

From what we have read, a new scheduler (CFS) was introduced starting
with 2.6.22. We are not sure whether the above behavior is caused by
the new scheduler.
Finally, our machine cpu information is listed in the following:
processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 2
model name : Quad-Core AMD Opteron(tm) Processor 2347
stepping : 10
cpu MHz : 1909.801
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good pni cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy altmovcr8 abm sse4a misalignsse 3dnowprefetch osvw
bogomips : 3822.95
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
In addition, the schedstat and sched_debug files are available under
/proc.
Thank you for any help in solving this puzzle. If you need more
information, please let us know.
P.S. I would like to be cc'ed on the discussions related to this problem.
###############################################
Jie Chen
Scientific Computing Group
Thomas Jefferson National Accelerator Facility
12000, Jefferson Ave.
Newport News, VA 23606
(757)269-5046 (office) (757)269-6248 (fax)
chen@...b.org
###############################################
View attachment "kernel-2.6.23.8-config" of type "text/plain" (19971 bytes)