2018-01-28 08:57:19 mpirun.openmpi --allow-run-as-root -np 88 hpcc
########################################################################
This is the DARPA/DOE HPC Challenge Benchmark version 1.4.2 October 2012
Produced by Jack Dongarra and Piotr Luszczek
Innovative Computing Laboratory
University of Tennessee Knoxville and Oak Ridge National Laboratory
See the source files for authors of specific codes.
Compiled on Apr 27 2015 at 11:04:18
Current time (1517101040) is Sun Jan 28 08:57:20 2018
Hostname: 'lkp-bdw-ep6'
########################################################################
================================================================================
HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================
An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.
The following parameter values will be used:
N      : 1000
NB     : 80
PMAP   : Row-major process mapping
P      : 2
Q      : 2
PFACT  : Right
NBMIN  : 4
NDIV   : 2
RFACT  : Crout
BCAST  : 1ringM
DEPTH  : 1
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words
--------------------------------------------------------------------------------
- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0
Begin of MPIRandomAccess section.
Running on 88 processors
Total Main table size = 2^24 = 16777216 words
PE Main table size = (2^24)/88 = 190651 words/PE MAX
Default number of updates (RECOMMENDED) = 67108864
Number of updates EXECUTED = 67108864 (for a TIME BOUND of 60.00 secs)
CPU time used = 0.765464 seconds
Real time used = 1.345382 seconds
0.049880904 Billion(10^9) Updates per second [GUP/s]
0.000566828 Billion(10^9) Updates/PE per second [GUP/s]
Verification: CPU time used = 0.140699 seconds
Verification: Real time used = 0.193454 seconds
Found 0 errors in 16777216 locations (passed).
Current time (1517101041) is Sun Jan 28 08:57:21 2018
End of MPIRandomAccess section.
Begin of StarRandomAccess section.
Main table size = 2^17 = 131072 words
Number of updates = 524288
CPU time used = 0.005035 seconds
Real time used = 0.005043 seconds
0.103965617 Billion(10^9) Updates per second [GUP/s]
Found 0 errors in 131072 locations (passed).
Node(s) with error 0
Minimum GUP/s 0.100912
Average GUP/s 0.106046
Maximum GUP/s 0.204813
Current time (1517101041) is Sun Jan 28 08:57:21 2018
End of StarRandomAccess section.
Begin of SingleRandomAccess section.
Node(s) with error 0
Node selected 84
Single GUP/s 0.194759
Current time (1517101041) is Sun Jan 28 08:57:21 2018
End of SingleRandomAccess section.
Begin of MPIRandomAccess_LCG section.
Running on 88 processors
Total Main table size = 2^24 = 16777216 words
PE Main table size = (2^24)/88 = 190651 words/PE MAX
Default number of updates (RECOMMENDED) = 67108864
Number of updates EXECUTED = 67108864 (for a TIME BOUND of 60.00 secs)
CPU time used = 0.678260 seconds
Real time used = 1.141630 seconds
0.058783358 Billion(10^9) Updates per second [GUP/s]
0.000667993 Billion(10^9) Updates/PE per second [GUP/s]
Verification: CPU time used = 0.079081 seconds
Verification: Real time used = 0.085036 seconds
Found 0 errors in 16777216 locations (passed).
Current time (1517101043) is Sun Jan 28 08:57:23 2018
End of MPIRandomAccess_LCG section.
Begin of StarRandomAccess_LCG section.
Main table size = 2^17 = 131072 words
Number of updates = 524288
CPU time used = 0.005839 seconds
Real time used = 0.005840 seconds
0.089776542 Billion(10^9) Updates per second [GUP/s]
Found 0 errors in 131072 locations (passed).
Node(s) with error 0
Minimum GUP/s 0.081472
Average GUP/s 0.091158
Maximum GUP/s 0.110772
Current time (1517101043) is Sun Jan 28 08:57:23 2018
End of StarRandomAccess_LCG section.
Begin of SingleRandomAccess_LCG section.
Node(s) with error 0
Node selected 15
Single GUP/s 0.225912
Current time (1517101043) is Sun Jan 28 08:57:23 2018
End of SingleRandomAccess_LCG section.
Begin of PTRANS section.
M: 500
N: 500
MB: 80
NB: 80
P: 2
Q: 2
TIME   M     N    MB  NB  P   Q     TIME   CHECK   GB/s   RESID
---- ----- ----- --- --- --- --- -------- ------ -------- -----
WALL   500   500  80  80   2   2     0.00 PASSED    2.398  0.00
CPU    500   500  80  80   2   2     0.00 PASSED    2.407  0.00
WALL   500   500  80  80   2   2     0.00 PASSED    2.398  0.00
CPU    500   500  80  80   2   2     0.00 PASSED    3.231  0.00
WALL   500   500  80  80   2   2     0.00 PASSED    2.398  0.00
CPU    500   500  80  80   2   2     0.00 PASSED    3.273  0.00
WALL   500   500  80  80   2   2     0.00 PASSED    2.398  0.00
CPU    500   500  80  80   2   2     0.00 PASSED    3.247  0.00
WALL   500   500  80  80   2   2     0.00 PASSED    2.398  0.00
CPU    500   500  80  80   2   2     0.00 PASSED    3.210  0.00
Finished 5 tests, with the following results:
5 tests completed and passed residual checks.
0 tests completed and failed residual checks.
0 tests skipped because of illegal input values.
END OF TESTS.
Current time (1517101043) is Sun Jan 28 08:57:23 2018
End of PTRANS section.
Begin of StarDGEMM section.
Scaled residual: 0.0245299
Node(s) with error 0
Minimum Gflop/s 4.084233
Average Gflop/s 4.441628
Maximum Gflop/s 5.139282
Current time (1517101043) is Sun Jan 28 08:57:23 2018
End of StarDGEMM section.
Begin of SingleDGEMM section.
Node(s) with error 0
Node selected 15
Single DGEMM Gflop/s 6.449176
Current time (1517101043) is Sun Jan 28 08:57:23 2018
End of SingleDGEMM section.
Begin of StarSTREAM section.
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 83333, Offset = 0
Total memory required = 0.0019 GiB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 787 microseconds.
   (= 787 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (GB/s)   Avg time     Min time     Max time
Copy:           1.4742       0.0010       0.0009       0.0011
Scale:          1.2655       0.0011       0.0011       0.0012
Add:            1.2963       0.0016       0.0015       0.0017
Triad:          1.9337       0.0014       0.0010       0.0015
-------------------------------------------------------------
Results Comparison:
        Expected  : 96108014003906256.000000 19221602800781248.000000 25628803734375000.000000
        Observed  : 96108014003770832.000000 19221602800770784.000000 25628803734375000.000000
Solution Validates
-------------------------------------------------------------
Node(s) with error 0
Minimum Copy GB/s 1.188474
Average Copy GB/s 1.554185
Maximum Copy GB/s 1.857919
Minimum Scale GB/s 1.182443
Average Scale GB/s 1.542264
Maximum Scale GB/s 2.019594
Minimum Add GB/s 1.221413
Average Add GB/s 1.357279
Maximum Add GB/s 1.801379
Minimum Triad GB/s 1.190427
Average Triad GB/s 1.396629
Maximum Triad GB/s 2.452287
Current time (1517101043) is Sun Jan 28 08:57:23 2018
End of StarSTREAM section.
Begin of SingleSTREAM section.
Node(s) with error 0
Node selected 15
Single STREAM Copy GB/s 19.879945
Single STREAM Scale GB/s 18.976516
Single STREAM Add GB/s 23.389804
Single STREAM Triad GB/s 22.202151
Current time (1517101043) is Sun Jan 28 08:57:23 2018
End of SingleSTREAM section.
Begin of MPIFFT section.
Number of nodes: 64
Vector size: 1048576
Generation time: 0.001
Tuning: 0.001
Computing: 0.003
Inverse FFT: 0.003
max(|x-x0|): 1.421e-15
Gflop/s: 32.881
Current time (1517101043) is Sun Jan 28 08:57:23 2018
End of MPIFFT section.
Begin of StarFFT section.
Vector size: 32768
Generation time: 0.001
Tuning: 0.000
Computing: 0.002
Inverse FFT: 0.002
max(|x-x0|): 1.226e-15
Node(s) with error 0
Minimum Gflop/s 1.003556
Average Gflop/s 1.509217
Maximum Gflop/s 1.629109
Current time (1517101043) is Sun Jan 28 08:57:23 2018
End of StarFFT section.
Begin of SingleFFT section.
Node(s) with error 0
Node selected 15
Single FFT Gflop/s 1.915509
Current time (1517101043) is Sun Jan 28 08:57:23 2018
End of SingleFFT section.
Begin of LatencyBandwidth section.
------------------------------------------------------------------
Latency-Bandwidth-Benchmark R1.5.1 (c) HLRS, University of Stuttgart
Written by Rolf Rabenseifner, Gerrit Schulz, and Michael Speck, Germany
Details - level 2
-----------------
MPI_Wtime granularity.
Max. MPI_Wtick is 0.000000 sec
wtick is set to 0.000001 sec
Message Length: 8
Latency min / avg / max: 0.001083 / 0.001083 / 0.001083 msecs
Bandwidth min / avg / max: 7.386 / 7.386 / 7.386 MByte/s
MPI_Wtime granularity is ok.
message size: 8
max time : 10.000000 secs
latency for msg: 0.001083 msecs
estimation for ping pong: 0.097476 msecs
max number of ping pong pairs = 102589
max client pings = max server pongs = 320
stride for latency = 1
Message Length: 8
Latency min / avg / max: 0.000471 / 0.000927 / 0.001204 msecs
Bandwidth min / avg / max: 6.647 / 8.865 / 17.003 MByte/s
Message Length: 2000000
Latency min / avg / max: 0.476711 / 0.476711 / 0.476711 msecs
Bandwidth min / avg / max: 4195.410 / 4195.410 / 4195.410 MByte/s
MPI_Wtime granularity is ok.
message size: 2000000
max time : 30.000000 secs
latency for msg: 0.476711 msecs
estimation for ping pong: 3.813692 msecs
max number of ping pong pairs = 7866
max client pings = max server pongs = 88
stride for latency = 1
Message Length: 2000000
Latency min / avg / max: 0.224100 / 0.360378 / 0.504365 msecs
Bandwidth min / avg / max: 3965.378 / 6274.838 / 8924.587 MByte/s
Message Size: 8 Byte
Natural Order Latency: 0.001285 msec
Natural Order Bandwidth: 6.225196 MB/s
Avg Random Order Latency: 0.001563 msec
Avg Random Order Bandwidth: 5.118673 MB/s
Message Size: 2000000 Byte
Natural Order Latency: 7.525214 msec
Natural Order Bandwidth: 265.773173 MB/s
Avg Random Order Latency: 7.261226 msec
Avg Random Order Bandwidth: 275.435580 MB/s
Execution time (wall clock) = 30.162 sec on 88 processes
 - for cross ping_pong latency = 1.004 sec
 - for cross ping_pong bandwidth = 26.151 sec
 - for ring latency = 0.019 sec
 - for ring bandwidth = 2.988 sec
------------------------------------------------------------------
Latency-Bandwidth-Benchmark R1.5.1 (c) HLRS, University of Stuttgart
Written by Rolf Rabenseifner, Gerrit Schulz, and Michael Speck, Germany
Major Benchmark results:
------------------------
Max Ping Pong Latency: 0.001204 msecs
Randomly Ordered Ring Latency: 0.001563 msecs
Min Ping Pong Bandwidth: 3965.378282 MB/s
Naturally Ordered Ring Bandwidth: 265.773173 MB/s
Randomly Ordered Ring Bandwidth: 275.435580 MB/s
------------------------------------------------------------------
Detailed benchmark results:
Ping Pong:
Latency min / avg / max: 0.000471 / 0.000927 / 0.001204 msecs
Bandwidth min / avg / max: 3965.378 / 6274.838 / 8924.587 MByte/s
Ring:
On naturally ordered ring: latency= 0.001285 msec, bandwidth= 265.773173 MB/s
On randomly ordered ring: latency= 0.001563 msec, bandwidth= 275.435580 MB/s
------------------------------------------------------------------
Benchmark conditions:
 The latency measurements were done with 8 bytes
 The bandwidth measurements were done with 2000000 bytes
 The ring communication was done in both directions on 88 processes
 The Ping Pong measurements were done on
 - 7656 pairs of processes for latency benchmarking, and
 - 7656 pairs of processes for bandwidth benchmarking,
 out of 88*(88-1) = 7656 possible combinations on 88 processes.
 (1 MB/s = 10**6 byte/sec)
------------------------------------------------------------------
Current time (1517101073) is Sun Jan 28 08:57:53 2018
End of LatencyBandwidth section.
Begin of HPL section.
================================================================================
HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================
An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.
The following parameter values will be used:
N      : 1000
NB     : 80
PMAP   : Row-major process mapping
P      : 2
Q      : 2
PFACT  : Right
NBMIN  : 4
NDIV   : 2
RFACT  : Crout
BCAST  : 1ringM
DEPTH  : 1
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words
--------------------------------------------------------------------------------
- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR11C2R4        1000    80     2     2               0.03              2.102e+01
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0049770 ...... PASSED
================================================================================
Finished 1 tests with the following results:
         1 tests completed and passed residual checks,
         0 tests completed and failed residual checks,
         0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------
End of Tests.
================================================================================
Current time (1517101073) is Sun Jan 28 08:57:53 2018
End of HPL section.
Begin of Summary section.
VersionMajor=1
VersionMinor=4
VersionMicro=2
VersionRelease=f
LANG=C
Success=1
sizeof_char=1
sizeof_short=2
sizeof_int=4
sizeof_long=8
sizeof_void_ptr=8
sizeof_size_t=8
sizeof_float=4
sizeof_double=8
sizeof_s64Int=8
sizeof_u64Int=8
sizeof_struct_double_double=16
CommWorldProcs=88
MPI_Wtick=1.000000e-09
HPL_Tflops=0.0210153
HPL_time=0.0317943
HPL_eps=1.11022e-16
HPL_RnormI=1.64846e-12
HPL_Anorm1=263.865
HPL_AnormI=262.773
HPL_Xnorm1=2619.63
HPL_XnormI=11.3513
HPL_BnormI=0.499776
HPL_N=1000
HPL_NB=80
HPL_nprow=2
HPL_npcol=2
HPL_depth=1
HPL_nbdiv=2
HPL_nbmin=4
HPL_cpfact=R
HPL_crfact=C
HPL_ctop=1
HPL_order=R
HPL_dMACH_EPS=1.110223e-16
HPL_dMACH_SFMIN=2.225074e-308
HPL_dMACH_BASE=2.000000e+00
HPL_dMACH_PREC=2.220446e-16
HPL_dMACH_MLEN=5.300000e+01
HPL_dMACH_RND=1.000000e+00
HPL_dMACH_EMIN=-1.021000e+03
HPL_dMACH_RMIN=2.225074e-308
HPL_dMACH_EMAX=1.024000e+03
HPL_dMACH_RMAX=1.797693e+308
HPL_sMACH_EPS=5.960464e-08
HPL_sMACH_SFMIN=1.175494e-38
HPL_sMACH_BASE=2.000000e+00
HPL_sMACH_PREC=1.192093e-07
HPL_sMACH_MLEN=2.400000e+01
HPL_sMACH_RND=1.000000e+00
HPL_sMACH_EMIN=-1.250000e+02
HPL_sMACH_RMIN=1.175494e-38
HPL_sMACH_EMAX=1.280000e+02
HPL_sMACH_RMAX=3.402823e+38
dweps=1.110223e-16
sweps=5.960464e-08
HPLMaxProcs=4
HPLMinProcs=4
DGEMM_N=288
StarDGEMM_Gflops=4.44163
SingleDGEMM_Gflops=6.44918
PTRANS_GBs=2.39843
PTRANS_time=0.00062525
PTRANS_residual=1
PTRANS_n=500
PTRANS_nb=80
PTRANS_nprow=2
PTRANS_npcol=2
MPIRandomAccess_LCG_N=16777216
MPIRandomAccess_LCG_time=1.14163
MPIRandomAccess_LCG_CheckTime=0.0850355
MPIRandomAccess_LCG_Errors=0
MPIRandomAccess_LCG_ErrorsFraction=0
MPIRandomAccess_LCG_ExeUpdates=67108864
MPIRandomAccess_LCG_GUPs=0.0587834
MPIRandomAccess_LCG_TimeBound=60
MPIRandomAccess_LCG_Algorithm=0
MPIRandomAccess_N=16777216
MPIRandomAccess_time=1.34538
MPIRandomAccess_CheckTime=0.193454
MPIRandomAccess_Errors=0
MPIRandomAccess_ErrorsFraction=0
MPIRandomAccess_ExeUpdates=67108864
MPIRandomAccess_GUPs=0.0498809
MPIRandomAccess_TimeBound=60
MPIRandomAccess_Algorithm=0
RandomAccess_LCG_N=131072
StarRandomAccess_LCG_GUPs=0.0911576
SingleRandomAccess_LCG_GUPs=0.225912
RandomAccess_N=131072
StarRandomAccess_GUPs=0.106046
SingleRandomAccess_GUPs=0.194759
STREAM_VectorSize=83333
STREAM_Threads=1
StarSTREAM_Copy=1.55419
StarSTREAM_Scale=1.54226
StarSTREAM_Add=1.35728
StarSTREAM_Triad=1.39663
SingleSTREAM_Copy=19.8799
SingleSTREAM_Scale=18.9765
SingleSTREAM_Add=23.3898
SingleSTREAM_Triad=22.2022
FFT_N=32768
StarFFT_Gflops=1.50922
SingleFFT_Gflops=1.91551
MPIFFT_N=1048576
MPIFFT_Gflops=32.8808
MPIFFT_maxErr=1.42104e-15
MPIFFT_Procs=64
MaxPingPongLatency_usec=1.20362
RandomlyOrderedRingLatency_usec=1.56291
MinPingPongBandwidth_GBytes=3.96538
NaturallyOrderedRingBandwidth_GBytes=0.265773
RandomlyOrderedRingBandwidth_GBytes=0.275436
MinPingPongLatency_usec=0.4705
AvgPingPongLatency_usec=0.926591
MaxPingPongBandwidth_GBytes=8.92459
AvgPingPongBandwidth_GBytes=6.27484
NaturallyOrderedRingLatency_usec=1.2851
FFTEnblk=16
FFTEnp=8
FFTEl2size=1048576
M_OPENMP=-1
omp_get_num_threads=0
omp_get_max_threads=0
omp_get_num_procs=0
MemProc=-1
MemSpec=-1
MemVal=-1
MPIFFT_time0=3.33e-07
MPIFFT_time1=0.000889437
MPIFFT_time2=0.000353926
MPIFFT_time3=0.000598774
MPIFFT_time4=0.000624967
MPIFFT_time5=0.000687074
MPIFFT_time6=1.1e-07
CPS_HPCC_FFT_235=0
CPS_HPCC_FFTW_ESTIMATE=0
CPS_HPCC_MEMALLCTR=0
CPS_HPL_USE_GETPROCESSTIMES=0
CPS_RA_SANDIA_NOPT=0
CPS_RA_SANDIA_OPT2=0
CPS_USING_FFTW=0
End of Summary section.
########################################################################
End of HPC Challenge tests.
Current time (1517101073) is Sun Jan 28 08:57:53 2018
########################################################################