| 311 | |
| 312 | |
| 313 | == Obtaining statistics == |
| 314 | |
| 315 | The visualization of the task execution can be further complemented with statistics. Running the following command: |
| 316 | |
| 317 | `NANOS6=stats taskset -c 12-23 ./03.multisaxpy_task 196608 8192 1` |
| 318 | |
| 319 | prints the information below: |
| 320 | {{{ |
| 321 | $ NANOS6=stats taskset -c 12-23 ./03.multisaxpy_task 196608 8192 1 |
| 322 | size: 196608, bs: 8192, iterations: 1, time: 0.000241, performance: 0.815801 |
| 323 | STATS Total CPUs 12 |
| 324 | STATS Total time 2.42573e+07 ns |
| 325 | STATS Total threads 12 |
| 326 | STATS Mean threads per CPU 1 |
| 327 | STATS Mean tasks per thread 2.08333 |
| 328 | |
| 329 | STATS Mean thread lifetime 3.65355e+09 % |
| 330 | STATS Mean thread running time 100 % |
| 331 | STATS Mean effective parallelism 0.123268 |
| 332 | |
| 333 | STATS All Tasks instances 25 |
| 334 | STATS All Tasks mean instantiation time 1445 ns 0.885064 % |
| 335 | STATS All Tasks mean pending time 0 ns 0 % |
| 336 | STATS All Tasks mean ready time 32446 ns 19.8732 % |
| 337 | STATS All Tasks mean execution time 119605 ns 73.2582 % |
| 338 | STATS All Tasks mean blocked time 3702 ns 2.26748 % |
| 339 | STATS All Tasks mean zombie time 6067 ns 3.71604 % |
| 340 | STATS All Tasks mean lifetime 163265 ns |
| 341 | |
| 342 | STATS 03.multisaxpy_task.cpp:3:13 instances 24 |
| 343 | STATS 03.multisaxpy_task.cpp:3:13 mean instantiation time 1251 ns 1.75051 % |
| 344 | STATS 03.multisaxpy_task.cpp:3:13 mean pending time 0 ns 0 % |
| 345 | STATS 03.multisaxpy_task.cpp:3:13 mean ready time 32944 ns 46.0981 % |
| 346 | STATS 03.multisaxpy_task.cpp:3:13 mean execution time 31079 ns 43.4884 % |
| 347 | STATS 03.multisaxpy_task.cpp:3:13 mean blocked time 0 ns 0 % |
| 348 | STATS 03.multisaxpy_task.cpp:3:13 mean zombie time 6191 ns 8.66298 % |
| 349 | STATS 03.multisaxpy_task.cpp:3:13 mean lifetime 71465 ns |
| 350 | |
| 351 | STATS main instances 1 |
| 352 | STATS main mean instantiation time 6089 ns 0.2573 % |
| 353 | STATS main mean pending time 0 ns 0 % |
| 354 | STATS main mean ready time 20505 ns 0.866471 % |
| 355 | STATS main mean execution time 2244241 ns 94.8339 % |
| 356 | STATS main mean blocked time 92553 ns 3.91097 % |
| 357 | STATS main mean zombie time 3108 ns 0.131333 % |
| 358 | STATS main mean lifetime 2366496 ns |
| 359 | |
| 360 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 instances 24 |
| 361 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean instantiation time 1251 ns 1.75051 % |
| 362 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean pending time 0 ns 0 % |
| 363 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean ready time 32944 ns 46.0981 % |
| 364 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean execution time 31079 ns 43.4884 % |
| 365 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean blocked time 0 ns 0 % |
| 366 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean zombie time 6191 ns 8.66298 % |
| 367 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean lifetime 71465 ns |
| 368 | |
| 369 | STATS Phase 1 instances 24 |
| 370 | STATS Phase 1 mean instantiation time 1251 ns 1.75051 % |
| 371 | STATS Phase 1 mean pending time 0 ns 0 % |
| 372 | STATS Phase 1 mean ready time 32944 ns 46.0981 % |
| 373 | STATS Phase 1 mean execution time 31079 ns 43.4884 % |
| 374 | STATS Phase 1 mean blocked time 0 ns 0 % |
| 375 | STATS Phase 1 mean zombie time 6191 ns 8.66298 % |
| 376 | STATS Phase 1 mean lifetime 71465 ns |
| 377 | STATS Phase 1 effective parallelism 0.165278 |
| 378 | }}} |
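The percentage printed next to each per-state time appears to be that state's share of the mean task lifetime. This is inferred from the numbers above rather than taken from the Nanos6 documentation. The following minimal sketch recomputes the shares from the nanosecond values of the "All Tasks" group; the helper name `state_shares` and the regular expressions are illustrative assumptions, not part of Nanos6.

```python
import re

# Lines copied from the "All Tasks" group of the output above.
SAMPLE = """\
STATS All Tasks mean instantiation time 1445 ns 0.885064 %
STATS All Tasks mean pending time 0 ns 0 %
STATS All Tasks mean ready time 32446 ns 19.8732 %
STATS All Tasks mean execution time 119605 ns 73.2582 %
STATS All Tasks mean blocked time 3702 ns 2.26748 %
STATS All Tasks mean zombie time 6067 ns 3.71604 %
STATS All Tasks mean lifetime 163265 ns
"""

# Hypothetical patterns that match only the line layout shown above.
STATE_LINE = re.compile(r"STATS All Tasks mean (\w+) time (\d+) ns")
LIFETIME_LINE = re.compile(r"STATS All Tasks mean lifetime (\d+) ns")


def state_shares(text):
    """Return {state: percent of mean lifetime}, recomputed from the ns values."""
    times = {m.group(1): int(m.group(2)) for m in STATE_LINE.finditer(text)}
    lifetime = int(LIFETIME_LINE.search(text).group(1))
    return {state: 100.0 * ns / lifetime for state, ns in times.items()}


shares = state_shares(SAMPLE)
for state, pct in shares.items():
    print(f"{state:>13}: {pct:8.4f} %")
```

In this run the six per-state times add up to the mean lifetime, so the recomputed shares sum to 100 %. A large ready share relative to the execution share, as in the `03.multisaxpy_task.cpp:3:13` group, indicates tasks that wait in the ready queue for about as long as they run.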
| 379 | |
| 380 | Additionally, you can obtain hardware counter information through PAPI. To do so, first load the PAPI module: |
| 381 | |
| 382 | `module load PAPI/5.6.0` |
| 383 | |
| 384 | and execute: |
| 385 | |
| 386 | `NANOS6=stats-papi taskset -c 12-23 ./03.multisaxpy_task 196608 8192 1` |
| 387 | |
| 388 | to get the following information: |
| 389 | {{{ |
| 390 | $ NANOS6=stats-papi taskset -c 12-23 ./03.multisaxpy_task 196608 8192 1 |
| 391 | size: 196608, bs: 8192, iterations: 1, time: 0.000236, performance: 0.833085 |
| 392 | STATS Total CPUs 12 |
| 393 | STATS Total time 3.06985e+07 ns |
| 394 | STATS Total threads 12 |
| 395 | STATS Mean threads per CPU 1 |
| 396 | STATS Mean tasks per thread 2.08333 |
| 397 | |
| 398 | STATS Mean thread lifetime 2.88807e+09 % |
| 399 | STATS Mean thread running time 100 % |
| 400 | STATS Mean effective parallelism 0.13271 |
| 401 | |
| 402 | STATS All Tasks instances 25 |
| 403 | STATS All Tasks mean instantiation time 2708 ns 1.52238 % |
| 404 | STATS All Tasks mean pending time 0 ns 0 % |
| 405 | STATS All Tasks mean ready time 9032 ns 5.07761 % |
| 406 | STATS All Tasks mean execution time 162959 ns 91.6123 % |
| 407 | STATS All Tasks mean blocked time 1105 ns 0.621209 % |
| 408 | STATS All Tasks mean zombie time 2075 ns 1.16652 % |
| 409 | STATS All Tasks mean lifetime 177879 ns |
| 410 | STATS All Tasks Real frequency 0.658047 GHz |
| 411 | STATS All Tasks Virtual frequency 0.782649 GHz |
| 412 | STATS All Tasks IPC 1.66625 |
| 413 | STATS All Tasks L2 data cache miss ratio 3.203 |
| 414 | STATS All Tasks Real nsecs 3804026 nsecs |
| 415 | STATS All Tasks Virtual nsecs 3198406 nsecs |
| 416 | STATS All Tasks Instructions 4171011 instructions |
| 417 | STATS All Tasks Total cycles 2503229 |
| 418 | STATS All Tasks Instr completed 4171011 |
| 419 | STATS All Tasks L2D cache accesses 16754 |
| 420 | STATS All Tasks L2D cache misses 53663 |
| 421 | STATS All Tasks Reference cycles 2054784 |
| 422 | |
| 423 | STATS 03.multisaxpy_task.cpp:3:13 instances 24 |
| 424 | STATS 03.multisaxpy_task.cpp:3:13 mean instantiation time 2498 ns 4.60435 % |
| 425 | STATS 03.multisaxpy_task.cpp:3:13 mean pending time 0 ns 0 % |
| 426 | STATS 03.multisaxpy_task.cpp:3:13 mean ready time 8237 ns 15.1826 % |
| 427 | STATS 03.multisaxpy_task.cpp:3:13 mean execution time 41452 ns 76.405 % |
| 428 | STATS 03.multisaxpy_task.cpp:3:13 mean blocked time 0 ns 0 % |
| 429 | STATS 03.multisaxpy_task.cpp:3:13 mean zombie time 2066 ns 3.80808 % |
| 430 | STATS 03.multisaxpy_task.cpp:3:13 mean lifetime 54253 ns |
| 431 | STATS 03.multisaxpy_task.cpp:3:13 Real frequency 3.16748 GHz |
| 432 | STATS 03.multisaxpy_task.cpp:3:13 Virtual frequency 3.18873 GHz |
| 433 | STATS 03.multisaxpy_task.cpp:3:13 IPC 1.72954 |
| 434 | STATS 03.multisaxpy_task.cpp:3:13 L2 data cache miss ratio 3.96831 |
| 435 | STATS 03.multisaxpy_task.cpp:3:13 Real nsecs 755566 nsecs |
| 436 | STATS 03.multisaxpy_task.cpp:3:13 Virtual nsecs 750532 nsecs |
| 437 | STATS 03.multisaxpy_task.cpp:3:13 Instructions 4139211 instructions |
| 438 | STATS 03.multisaxpy_task.cpp:3:13 Total cycles 2393243 |
| 439 | STATS 03.multisaxpy_task.cpp:3:13 Instr completed 4139211 |
| 440 | STATS 03.multisaxpy_task.cpp:3:13 L2D cache accesses 13316 |
| 441 | STATS 03.multisaxpy_task.cpp:3:13 L2D cache misses 52842 |
| 442 | STATS 03.multisaxpy_task.cpp:3:13 Reference cycles 1964416 |
| 443 | |
| 444 | STATS main instances 1 |
| 445 | STATS main mean instantiation time 7755 ns 0.246588 % |
| 446 | STATS main mean pending time 0 ns 0 % |
| 447 | STATS main mean ready time 28131 ns 0.894488 % |
| 448 | STATS main mean execution time 3079121 ns 97.9076 % |
| 449 | STATS main mean blocked time 27636 ns 0.878749 % |
| 450 | STATS main mean zombie time 2284 ns 0.0726249 % |
| 451 | STATS main mean lifetime 3144927 ns |
| 452 | STATS main Real frequency 0.0360792 GHz |
| 453 | STATS main Virtual frequency 0.0449312 GHz |
| 454 | STATS main IPC 0.289128 |
| 455 | STATS main L2 data cache miss ratio 0.238802 |
| 456 | STATS main Real nsecs 3048460 nsecs |
| 457 | STATS main Virtual nsecs 2447874 nsecs |
| 458 | STATS main Instructions 31800 instructions |
| 459 | STATS main Total cycles 109986 |
| 460 | STATS main Instr completed 31800 |
| 461 | STATS main L2D cache accesses 3438 |
| 462 | STATS main L2D cache misses 821 |
| 463 | STATS main Reference cycles 90368 |
| 464 | |
| 465 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 instances 24 |
| 466 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean instantiation time 2498 ns 4.60435 % |
| 467 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean pending time 0 ns 0 % |
| 468 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean ready time 8237 ns 15.1826 % |
| 469 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean execution time 41452 ns 76.405 % |
| 470 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean blocked time 0 ns 0 % |
| 471 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean zombie time 2066 ns 3.80808 % |
| 472 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 mean lifetime 54253 ns |
| 473 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 Real frequency 3.16748 GHz |
| 474 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 Virtual frequency 3.18873 GHz |
| 475 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 IPC 1.72954 |
| 476 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 L2 data cache miss ratio 3.96831 |
| 477 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 Real nsecs 755566 nsecs |
| 478 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 Virtual nsecs 750532 nsecs |
| 479 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 Instructions 4139211 instructions |
| 480 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 Total cycles 2393243 |
| 481 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 Instr completed 4139211 |
| 482 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 L2D cache accesses 13316 |
| 483 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 L2D cache misses 52842 |
| 484 | STATS Phase 1 03.multisaxpy_task.cpp:3:13 Reference cycles 1964416 |
| 485 | |
| 486 | STATS Phase 1 instances 24 |
| 487 | STATS Phase 1 mean instantiation time 2498 ns 4.60435 % |
| 488 | STATS Phase 1 mean pending time 0 ns 0 % |
| 489 | STATS Phase 1 mean ready time 8237 ns 15.1826 % |
| 490 | STATS Phase 1 mean execution time 41452 ns 76.405 % |
| 491 | STATS Phase 1 mean blocked time 0 ns 0 % |
| 492 | STATS Phase 1 mean zombie time 2066 ns 3.80808 % |
| 493 | STATS Phase 1 mean lifetime 54253 ns |
| 494 | STATS Phase 1 Real frequency 3.16748 GHz |
| 495 | STATS Phase 1 Virtual frequency 3.18873 GHz |
| 496 | STATS Phase 1 IPC 1.72954 |
| 497 | STATS Phase 1 L2 data cache miss ratio 3.96831 |
| 498 | STATS Phase 1 Real nsecs 755566 nsecs |
| 499 | STATS Phase 1 Virtual nsecs 750532 nsecs |
| 500 | STATS Phase 1 Instructions 4139211 instructions |
| 501 | STATS Phase 1 Total cycles 2393243 |
| 502 | STATS Phase 1 Instr completed 4139211 |
| 503 | STATS Phase 1 L2D cache accesses 13316 |
| 504 | STATS Phase 1 L2D cache misses 52842 |
| 505 | STATS Phase 1 Reference cycles 1964416 |
| 506 | STATS Phase 1 effective parallelism 0.217033 |
| 507 | }}} |
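The derived metrics in this output (IPC, frequencies, miss ratio) appear to be simple quotients of the raw counters listed below them. This relationship is inferred from the printed values, not from the Nanos6 or PAPI documentation. The sketch below recomputes them for the "All Tasks" group; the dictionary keys and the function name `derived_metrics` are hypothetical names chosen for illustration.

```python
# Raw counters copied from the "All Tasks" group of the stats-papi output above.
counters = {
    "real_ns": 3804026,
    "virtual_ns": 3198406,
    "instructions": 4171011,
    "total_cycles": 2503229,
    "l2d_accesses": 16754,
    "l2d_misses": 53663,
}


def derived_metrics(c):
    """Recompute the derived values from the raw counters (assumed formulas)."""
    return {
        # Instructions completed per cycle.
        "ipc": c["instructions"] / c["total_cycles"],
        # Cycles per nanosecond is numerically the frequency in GHz.
        "real_ghz": c["total_cycles"] / c["real_ns"],
        "virtual_ghz": c["total_cycles"] / c["virtual_ns"],
        # Misses divided by accesses, as the printed ratio suggests.
        "l2d_miss_ratio": c["l2d_misses"] / c["l2d_accesses"],
    }


metrics = derived_metrics(counters)
for name, value in metrics.items():
    print(f"{name:>15}: {value:.6f}")
```

The recomputed values agree with the printed ones to the displayed precision. Note that the L2 data cache miss ratio is above 1 because the reported misses exceed the reported accesses, which suggests that the two counters do not cover the same set of events, so this ratio should be interpreted with care.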
| 508 | |
| 509 | == Tracing with Extrae == |
| 510 | |
| 511 | |
| 512 | |