| 27 | \\ |
| 28 | \\ |
| 29 | |
| 30 | = 5 things to consider when using KNL = |
| 31 | 1. Make sure to use the fast MCDRAM: |
| 32 | * When MCDRAM is in cache mode: |
| 33 | * No changes are needed. |
| 34 | * When MCDRAM is in flat mode: |
| 35 | * If the total memory footprint of the application is smaller than the size of MCDRAM: numactl -m 1 ./my_application.out (allocations that do not fit into MCDRAM make the application fail). |
| 36 | * If the total memory footprint of the application is larger than the size of MCDRAM: numactl -p 1 ./my_application.out (allocations that do not fit into MCDRAM spill over to DDR). |
| 37 | * To choose manually what is allocated in MCDRAM, use the memkind library.\\ |
| 38 | |
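As a rough illustration of the manual route, the sketch below uses the hbwmalloc interface that ships with memkind (`hbw_check_available`, `hbw_malloc`, `hbw_free`). The `fast_buffer` type, the two helper functions and the `USE_HBW` build switch are hypothetical names introduced only for this example. Without `-DUSE_HBW` the code falls back to plain `malloc`, so it also builds on machines without memkind.

```c
#include <stdlib.h>
#ifdef USE_HBW                 /* hypothetical build switch, not part of memkind */
#include <hbwmalloc.h>         /* hbw_check_available, hbw_malloc, hbw_free */
#endif

/* Buffer that remembers which allocator produced it (hypothetical helper type). */
typedef struct {
    double *data;
    size_t  n;
    int     in_hbw;            /* 1 if the memory came from hbw_malloc */
} fast_buffer;

static fast_buffer fast_buffer_alloc(size_t n)
{
    fast_buffer b = { NULL, n, 0 };
#ifdef USE_HBW
    /* hbw_check_available() returns 0 when high-bandwidth memory can be used. */
    if (hbw_check_available() == 0) {
        b.data = hbw_malloc(n * sizeof *b.data);
        b.in_hbw = (b.data != NULL);
    }
#endif
    if (b.data == NULL)        /* fall back to ordinary DDR memory */
        b.data = malloc(n * sizeof *b.data);
    return b;
}

static void fast_buffer_free(fast_buffer *b)
{
#ifdef USE_HBW
    if (b->in_hbw) {           /* hbw_malloc memory must be released with hbw_free */
        hbw_free(b->data);
        b->data = NULL;
        return;
    }
#endif
    free(b->data);
    b->data = NULL;
}
```

To actually place the buffer in MCDRAM, the program would presumably be compiled with `-DUSE_HBW` and linked against memkind (`-lmemkind`). Only the bandwidth-critical arrays need to go through such a helper; everything else can stay in DDR.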
| 39 | 2. Verify that the pinning is as you wish: |
| 40 | * Start job on KNL node(s). |
| 41 | * Log in on KNL. |
| 42 | * Invoke htop. |
| 43 | * Check the load distribution. |
| 44 | * Remark: Each core can execute 1, 2 or 4 threads. Unlike on KNC, a single thread per core can already give optimal performance on KNL.\\ |
| 45 | |
| 46 | 3. Use VTune/Advisor to analyse the performance: |
| 47 | * Start job on KNL node(s). |
| 48 | * Log in on KNL. |
| 49 | * 'module load VTune / Advisor'. |
| 50 | * Run amplxe-gui / advixe-gui. |
| 51 | * Follow instructions. |
| 52 | * Remark: If you run into errors such as “sepdk not available”, please contact the administrator. Both tools rely on a kernel module to access the hardware counters.\\ |
| 53 | |
| 54 | 4. Provide hints to the compiler: |
| 55 | * Check the *.optrpt files for information on vectorisation. |
| 56 | * If the report mentions “unaligned...”, make sure the data is aligned and tell the compiler so by adding "#pragma vector aligned" before the loop. |
| 57 | * If a loop does not vectorise although it clearly should, you can add "#pragma simd" before the loop. |
| 58 | * Re-check *.optrpt. |
| 59 | * Re-check in VTune / Advisor.\\ |
| 60 | |
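The alignment hint could look roughly like the sketch below. The function names `axpy_aligned` and `alloc_aligned64` are hypothetical and only serve the example. The 64-byte alignment matches the 512-bit vector width of KNL. `#pragma vector aligned` is an Intel compiler hint; other compilers are expected to ignore it, possibly with a warning. The pragma is only a promise to the compiler, so the data really must be aligned.

```c
#include <stdlib.h>

/* Allocate n doubles on a 64-byte boundary (hypothetical helper). */
static double *alloc_aligned64(size_t n)
{
    size_t bytes = n * sizeof(double);
    bytes = (bytes + 63) / 64 * 64;    /* aligned_alloc expects a multiple of the alignment */
    return aligned_alloc(64, bytes);   /* C11; release with free() */
}

/* y[i] += a * x[i]; x and y must both be 64-byte aligned. */
static void axpy_aligned(size_t n, double a,
                         const double *restrict x, double *restrict y)
{
    /* Promise the Intel compiler that every access in this loop is aligned. */
#pragma vector aligned
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

After rebuilding with the optimisation report enabled, the "unaligned" remark for this loop should no longer appear if the hint was accepted.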
| 61 | 5. Verify the performance via benchmarks: |
| 62 | * Set up JUBE for your code. |
| 63 | * Benchmark the various versions with proper timing. |
| 64 | * Be aware: VTune / Advisor sometimes give estimates that are a little off. It's imperative to check the actual performance. |
| 65 | |