lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241101110810.R3AnEqdu@linutronix.de>
Date: Fri, 1 Nov 2024 12:08:10 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: linux-kernel@...r.kernel.org
Cc: André Almeida <andrealmeid@...lia.com>,
	Darren Hart <dvhart@...radead.org>,
	Davidlohr Bueso <dave@...olabs.net>, Ingo Molnar <mingo@...hat.com>,
	Juri Lelli <juri.lelli@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Valentin Schneider <vschneid@...hat.com>,
	Waiman Long <longman@...hat.com>
Subject: Re: [RFC v2 PATCH 0/4] futex: Add support task local hash maps.

On 2024-10-31 18:47:40 [+0100], To linux-kernel@...r.kernel.org wrote:
> On 2024-10-31 16:56:43 [+0100], To linux-kernel@...r.kernel.org wrote:

Since all of this can be scripted and I can have one kernel with …
so I hooked various hash algorithms to see where we get to.
240 threads, same box.
+---------+------------------+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+
| buckets | jhash2 (regular) | jhash2 (addr+offs) |      xxhash |   hash_long |      crc32c |       crc32 |     siphash |    hsiphash |
+---------+------------------+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+
|       2 |          9,172.4 |            9,175.8 |     9,116.4 |     9,497.2 |     9,317.6 |     9,564.0 |     9,091.8 |     9,217.8 |
|       4 |         23,370.8 |           22,611.0 |    20,917.2 |    17,780.6 |    18,185.6 |    17,305.4 |    20,415.0 |    20,885.4 |
|       8 |         44,378.2 |           44,898.4 |    44,713.8 |    42,943.8 |    45,151.8 |    45,149.6 |    44,601.4 |    44,739.4 |
|      16 |         84,567.2 |           84,190.0 |    84,645.2 |    84,737.4 |    86,970.2 |    85,036.8 |    83,142.0 |    85,485.0 |
|      32 |        131,059.2 |          127,895.4 |   127,953.8 |   126,631.2 |   132,293.0 |   125,622.2 |   127,038.4 |   126,322.8 |
|      64 |        285,339.0 |          284,488.8 |   288,109.2 |   268,630.4 |   289,783.8 |   285,281.0 |   285,111.2 |   288,104.4 |
|     128 |        510,550.0 |          515,596.6 |   526,738.0 |   557,349.6 |   508,871.6 |   524,447.0 |   512,482.8 |   513,963.0 |
|     256 |      1,038,348.6 |        1,034,837.4 | 1,042,341.4 | 1,060,650.4 | 1,039,328.6 | 1,098,865.8 | 1,042,759.4 | 1,026,998.6 |
|     512 |      1,626,287.8 |        1,640,112.0 | 1,622,828.8 | 1,637,973.4 | 1,677,108.6 | 1,707,027.2 | 1,588,240.6 | 1,628,800.8 |
|    1024 |      1,827,878.6 |        1,849,074.4 | 1,836,483.8 | 1,776,366.4 | 1,884,670.8 | 1,842,734.2 | 1,765,815.0 | 1,822,137.8 |
|    2048 |      1,905,406.4 |        1,928,399.2 | 1,903,506.0 | 1,822,750.8 | 1,946,141.6 | 1,907,584.6 | 1,830,906.8 | 1,887,678.2 |
|    4096 |      1,912,522.6 |        1,929,667.4 | 1,907,121.6 | 1,847,231.6 | 1,949,908.0 | 1,927,728.6 | 1,834,648.0 | 1,893,792.2 |
|    8192 |      1,912,352.6 |        1,935,078.4 | 1,915,500.4 | 1,853,232.2 | 1,973,339.2 | 1,958,150.4 | 1,840,190.8 | 1,896,981.6 |
|   16384 |      1,917,836.8 |        1,941,917.0 | 1,910,106.0 | 1,863,751.4 | 1,955,101.4 | 1,947,673.2 | 1,836,488.2 | 1,898,002.0 |
|   32768 |      1,919,074.6 |        1,937,200.2 | 1,914,704.8 | 1,872,348.0 | 1,974,182.2 | 1,959,147.2 | 1,837,694.6 | 1,896,566.6 |
|   65536 |      1,930,988.0 |        1,959,076.0 | 1,926,927.6 | 1,873,267.6 | 1,914,420.8 | 1,951,292.4 | 1,849,658.6 | 1,910,334.6 |
|  131072 |      2,023,509.4 |        2,050,380.4 | 2,037,104.6 | 1,990,559.6 | 2,003,758.4 | 1,978,931.2 | 1,946,145.2 | 2,007,205.6 |
+---------+------------------+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+

Intel(R) Xeon(R) CPU E7-8890 v3, 144 CPUs, 4 nodes.
Test using 140 threads, 0 buckets means global hash:
+---------+-------------+
| buckets |     ops/sec |
+---------+-------------+
|       0 | 2,644,742.8 |
|       2 |    21,750.2 |
|       4 |    37,537.2 |
|       8 |    69,950.6 |
|      16 |   127,722.0 |
|      32 |   225,479.2 |
|      64 |   401,335.6 |
|     128 |   753,714.8 |
|     256 | 1,376,116.0 |
|     512 | 2,008,764.2 |
|    1024 | 2,386,441.2 |
|    2048 | 2,564,764.0 |
|    4096 | 2,851,801.2 |
|    8192 | 2,862,999.6 |
|   16384 | 2,521,325.0 |
|   32768 | 2,421,839.2 |
|   65536 | 2,483,676.0 |
|  131072 | 2,733,504.2 |
+---------+-------------+

Binding the test to individual NUMA node, 34 threads:
+---------+-------------+-------------+-------------+-------------+
| buckets |      node 0 |      node 1 |      node 2 |      node 3 |
+---------+-------------+-------------+-------------+-------------+
|       0 | 4,149,878.4 | 4,149,079.8 | 4,148,085.2 | 4,149,420.6 |
|       2 |   194,714.4 |   197,382.8 |   191,967.0 |   193,510.6 |
|       4 |   363,778.6 |   360,700.2 |   364,293.6 |   361,830.2 |
|       8 |   681,770.4 |   673,973.0 |   658,601.6 |   662,212.0 |
|      16 | 1,201,256.4 | 1,177,681.0 | 1,195,749.4 | 1,181,200.2 |
|      32 | 2,002,673.2 | 1,989,139.0 | 1,988,264.4 | 1,981,004.8 |
|      64 | 2,963,416.0 | 2,962,292.0 | 2,957,491.6 | 2,964,479.6 |
|     128 | 3,499,580.0 | 3,495,971.2 | 3,495,537.6 | 3,499,902.8 |
|     256 | 3,713,251.2 | 3,711,806.4 | 3,716,935.4 | 3,715,458.2 |
|     512 | 3,800,606.4 | 3,801,960.4 | 3,813,903.4 | 3,809,076.6 |
|    1024 | 3,840,679.0 | 3,839,486.4 | 3,841,558.6 | 3,838,641.4 |
|    2048 | 3,867,732.8 | 3,866,216.2 | 3,858,603.4 | 3,848,031.6 |
|    4096 | 3,806,776.8 | 3,819,237.8 | 3,813,381.4 | 3,800,440.2 |
|    8192 | 3,815,358.4 | 3,806,204.2 | 3,804,171.2 | 3,795,476.2 |
|   16384 | 3,865,728.6 | 3,883,038.4 | 3,871,992.0 | 3,857,763.4 |
|   32768 | 4,017,227.0 | 4,025,249.8 | 4,022,779.4 | 4,009,740.8 |
|   65536 | 4,188,410.0 | 4,186,900.8 | 4,195,128.4 | 4,190,580.8 |
|  131072 | 4,334,937.0 | 4,335,978.8 | 4,327,250.2 | 4,332,567.8 |
+---------+-------------+-------------+-------------+-------------+

140 threads, all nodes for the algorithms test:
+---------+------------------+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+
| buckets | jhash2 (regular) | jhash2 (addr+offs) |      xxhash |   hash_long |      crc32c |       crc32 |     siphash |    hsiphash |
+---------+------------------+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+
|       2 |         21,346.0 |           21,321.8 |    20,598.4 |    23,403.0 |    23,336.6 |    21,232.8 |    21,011.4 |    20,661.0 |
|       4 |         38,220.0 |           37,712.0 |    37,421.6 |    39,206.4 |    39,086.2 |    40,098.2 |    37,144.2 |    37,209.8 |
|       8 |         68,470.8 |           68,994.4 |    69,373.6 |    73,973.0 |    70,306.8 |    70,396.0 |    68,950.8 |    69,366.6 |
|      16 |        126,612.2 |          127,433.2 |   128,121.2 |   133,981.8 |   127,268.0 |   130,204.4 |   126,594.4 |   127,812.8 |
|      32 |        224,943.0 |          224,695.2 |   222,879.6 |   227,023.8 |   220,036.4 |   217,311.2 |   224,100.0 |   223,442.8 |
|      64 |        406,235.6 |          399,020.2 |   407,580.6 |   413,988.6 |   404,817.4 |   394,156.0 |   411,282.8 |   389,992.6 |
|     128 |        758,259.0 |          759,423.2 |   755,778.8 |   774,913.8 |   765,497.8 |   763,987.8 |   748,676.8 |   749,303.6 |
|     256 |      1,381,720.6 |        1,380,707.6 | 1,372,685.0 | 1,357,849.0 | 1,331,275.2 | 1,430,867.4 | 1,377,411.6 | 1,374,432.2 |
|     512 |      2,001,912.4 |        2,011,120.8 | 1,993,617.8 | 2,331,041.0 | 2,097,737.0 | 2,079,965.6 | 1,971,513.8 | 1,989,508.6 |
|    1024 |      2,378,279.6 |        2,412,139.6 | 2,371,655.4 | 2,650,416.8 | 2,477,507.8 | 2,456,023.8 | 2,309,010.4 | 2,353,854.2 |
|    2048 |      2,560,923.0 |        2,604,756.2 | 2,544,586.6 | 2,658,535.8 | 2,631,261.0 | 2,628,532.0 | 2,459,461.2 | 2,523,348.0 |
|    4096 |      2,855,199.2 |        2,942,364.8 | 2,822,369.8 | 2,998,159.4 | 2,936,124.2 | 2,919,140.6 | 2,694,488.8 | 2,794,201.4 |
|    8192 |      2,868,792.8 |        2,953,256.8 | 2,834,506.0 | 2,993,257.8 | 2,924,754.2 | 2,941,119.0 | 2,705,526.4 | 2,806,921.2 |
|   16384 |      2,527,784.0 |        2,595,100.2 | 2,498,789.8 | 2,610,646.8 | 2,540,535.4 | 2,550,376.0 | 2,398,098.4 | 2,475,184.4 |
|   32768 |      2,427,199.8 |        2,492,474.2 | 2,408,768.4 | 2,486,733.6 | 2,381,828.0 | 2,425,293.0 | 2,312,774.0 | 2,384,687.6 |
|   65536 |      2,489,441.8 |        2,554,741.4 | 2,465,692.0 | 2,666,031.8 | 2,419,651.8 | 2,515,099.8 | 2,368,451.8 | 2,438,185.6 |
|  131072 |      2,745,458.4 |        2,820,823.0 | 2,720,660.6 | 3,282,233.0 | 2,625,217.6 | 2,466,424.0 | 2,597,005.2 | 2,680,356.4 |
+---------+------------------+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+

And now something smaller, Intel(R) Xeon(R) CPU E5-2650 0, 32CPUs in
total.
28 threads used for the test:
+---------+-------------+
| buckets |     ops/sec |
+---------+-------------+
|       0 | 2,344,905.8 |
|       2 |    91,881.2 |
|       4 |   168,243.0 |
|       8 |   310,982.2 |
|      16 |   550,534.4 |
|      32 |   884,066.0 |
|      64 | 1,475,389.4 |
|     128 | 1,949,364.6 |
|     256 | 2,142,025.8 |
|     512 | 2,234,222.2 |
|    1024 | 2,267,931.8 |
|    2048 | 2,287,753.4 |
|    4096 | 2,315,330.4 |
|    8192 | 2,337,878.2 |
|   16384 | 2,444,502.2 |
+---------+-------------+

14 Threads limited to a node:
+---------+-------------+-------------+
| buckets |      node 0 |      node 1 |
+---------+-------------+-------------+
|       0 | 2,761,709.8 | 2,765,630.0 |
|       2 |   397,527.8 |   397,126.8 |
|       4 |   718,205.0 |   719,615.2 |
|       8 | 1,350,627.4 | 1,305,201.4 |
|      16 | 1,992,643.4 | 1,989,499.2 |
|      32 | 2,365,813.6 | 2,357,618.6 |
|      64 | 2,554,185.8 | 2,555,256.8 |
|     128 | 2,646,479.0 | 2,654,572.6 |
|     256 | 2,679,394.4 | 2,698,002.4 |
|     512 | 2,713,385.6 | 2,723,413.6 |
|    1024 | 2,719,330.6 | 2,733,464.6 |
|    2048 | 2,730,376.6 | 2,738,581.6 |
|    4096 | 2,704,520.6 | 2,720,546.4 |
|    8192 | 2,773,213.4 | 2,782,565.6 |
|   16384 | 2,863,843.2 | 2,858,963.2 |
+---------+-------------+-------------+

And now algorithms, 28 Threads.
+---------+------------------+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+
| buckets | jhash2 (regular) | jhash2 (addr+offs) |      xxhash |   hash_long |      crc32c |       crc32 |     siphash |    hsiphash |
+---------+------------------+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+
|       2 |         92,557.8 |           92,815.2 |    93,172.2 |   103,097.6 |    97,403.2 |    92,629.6 |    94,030.8 |    91,847.2 |
|       4 |        165,385.2 |          167,200.0 |   168,681.2 |   177,600.2 |   172,851.2 |   173,423.6 |   167,814.6 |   168,136.8 |
|       8 |        319,044.0 |          317,291.6 |   318,322.0 |   342,179.4 |   318,252.6 |   323,456.6 |   319,079.6 |   317,106.2 |
|      16 |        555,103.6 |          556,075.0 |   563,529.0 |   595,052.8 |   537,199.2 |   557,180.4 |   554,498.8 |   550,170.4 |
|      32 |        896,751.8 |          908,569.4 |   908,687.4 |   852,593.2 |   892,222.6 |   919,105.0 |   874,487.8 |   920,554.6 |
|      64 |      1,488,013.0 |        1,500,952.6 | 1,467,258.8 | 1,528,428.2 | 1,530,458.6 | 1,526,439.6 | 1,459,185.2 | 1,480,434.0 |
|     128 |      1,944,216.0 |        1,974,618.6 | 1,927,277.6 | 1,748,598.4 | 1,989,212.0 | 1,975,526.2 | 1,839,080.4 | 1,903,844.4 |
|     256 |      2,142,823.0 |        2,185,436.6 | 2,126,787.8 | 2,194,752.2 | 2,189,521.2 | 2,164,454.2 | 1,987,121.0 | 2,081,487.0 |
|     512 |      2,232,887.4 |        2,279,553.4 | 2,215,265.8 | 2,274,402.6 | 2,278,595.6 | 2,262,156.4 | 2,047,572.8 | 2,169,430.8 |
|    1024 |      2,269,308.2 |        2,312,200.0 | 2,250,841.0 | 2,278,423.2 | 2,328,832.6 | 2,288,490.0 | 2,075,494.2 | 2,190,907.8 |
|    2048 |      2,281,539.0 |        2,336,340.6 | 2,255,446.8 | 2,221,195.4 | 2,374,069.2 | 2,330,833.2 | 2,083,151.4 | 2,196,610.0 |
|    4096 |      2,315,628.8 |        2,367,224.6 | 2,284,841.4 | 2,397,385.8 | 2,373,043.2 | 2,394,276.6 | 2,104,235.0 | 2,233,600.0 |
|    8192 |      2,341,296.8 |        2,401,307.8 | 2,320,777.4 | 2,336,329.0 | 2,331,216.4 | 2,391,361.6 | 2,122,129.4 | 2,250,452.6 |
|   16384 |      2,435,181.6 |        2,509,588.2 | 2,422,407.8 | 2,378,702.8 | 2,514,325.8 | 2,552,565.8 | 2,200,619.0 | 2,350,706.0 |
+---------+------------------+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+

Sebastian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ