
[HPC] What Is Core Pinning?

아나엘 2023. 8. 24. 17:44

What is Core Pinning? (CPU affinity)

https://www.nas.nasa.gov/hecc/support/kb/processthread-pinning-overview_259.html

Pinning is the binding of a process or thread to a specific core. It can improve the performance of your code by increasing the percentage of local memory accesses. In other words, in a multicore environment you restrict which CPUs are allowed to run a process in order to improve software performance.

 

Once your code runs and produces correct results on a system, the next step is performance improvement.

(In practice, I was running an experiment to measure overhead across different DBs, and once I had confirmed that the code ran, I was told to apply core pinning.) For a code that uses multiple cores, the placement of processes and/or threads can play a significant role in performance.

 

Given a set of processor cores in a PBS job, the Linux kernel usually does a reasonably good job of mapping processes/threads to physical cores, although the kernel may also migrate processes/threads. Some OpenMP runtime libraries and MPI libraries may also perform certain placements by default. In cases where the placements by the kernel or the MPI or OpenMP libraries are not optimal, you can try several methods to control the placement in order to improve the performance of your code. Using the same placement from run to run also has the added benefit of reducing runtime variability.

 

 

Pay attention to maximizing data locality while minimizing latency and resource contention, and have a clear understanding of the characteristics of your own code and the machine that the code is running on.

 

 

taskset

taskset is a Linux command that supports CPU pinning. It can show or set the pinning (CPU affinity) of a process that is already running. https://man7.org/linux/man-pages/man1/taskset.1.html

taskset [options] mask command [argument...]
taskset [options] -p [mask] pid

#taskset -pc 13131
#->pid 13131's current affinity list: 0-3

#taskset -pc 0-4 13131
#->pid 13131's current affinity list: 0-3
#->pid 13131's new affinity list: 0-4

OPTIONS    

       -a, --all-tasks
           Set or retrieve the CPU affinity of all the tasks (threads)
           for a given PID.

       -c, --cpu-list
           Interpret mask as numerical list of processors instead of a
           bitmask. Numbers are separated by commas and may include
           ranges. For example: 0,5,8-11.

       -p, --pid
           Operate on an existing PID and do not launch a new task.

       -h, --help
           Display help text and exit.

       -V, --version
           Print version and exit.
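To make the difference between a bitmask and the -c CPU list concrete, here is a minimal Python sketch that converts between the two forms. The function names cpu_list_to_mask and mask_to_cpus are my own (hypothetical), and the parser only handles the comma and range syntax described above, so treat it as an illustration rather than a reimplementation of taskset.

```python
def cpu_list_to_mask(cpu_list: str) -> int:
    """Convert a CPU list such as "0,5,8-11" into the equivalent affinity bitmask."""
    mask = 0
    for part in cpu_list.split(","):
        part = part.strip()
        if "-" in part:
            first, last = (int(x) for x in part.split("-"))
        else:
            first = last = int(part)
        for cpu in range(first, last + 1):
            mask |= 1 << cpu  # bit N set means CPU N is allowed
    return mask


def mask_to_cpus(mask: int) -> list:
    """Return the CPU numbers whose bits are set in the bitmask."""
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


if __name__ == "__main__":
    mask = cpu_list_to_mask("0,5,8-11")
    print(hex(mask))           # 0xf21
    print(mask_to_cpus(mask))  # [0, 5, 8, 9, 10, 11]
```

So "taskset -c 0,5,8-11 command" and "taskset 0xf21 command" should describe the same set of CPUs.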

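Questions

What should I do if I want to pin a process before it starts running?

I ran it with multiprocessing(32), so would that be taskset -pc 0-31 ### ?

Ah, I found some notes I looked up and wrote down a long time ago.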

https://anaelle.tistory.com/43

 

[Linux] How to run a shell script in the background and pin it to cores

Three ways to run a shell script in the background: 1) Append & to the command, e.g. ./startcol.sh& 2) Use the nohup command, e.g. nohup /scratch/s5104a11/jwpyo/collect/collect_master_5sec.sh > /dev/null 2>&1 & Even after the session ends, the backg...

anaelle.tistory.com
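For the multiprocessing(32) question, one option (a sketch, not necessarily what the linked notes do) is to set the affinity from inside the Python program before creating the worker pool. On Linux, child processes inherit the parent's CPU affinity, so pinning the parent first also constrains the workers. os.sched_getaffinity and os.sched_setaffinity are standard-library calls that exist only on Linux; pin_current_process and worker_affinity are hypothetical helper names, and the pool size of 2 is only for the demo.

```python
import os
from multiprocessing import Pool


def pin_current_process(cpus):
    """Pin the calling process to those requested CPUs that it is allowed to use."""
    allowed = os.sched_getaffinity(0)   # pid 0 means "the calling process"
    target = allowed & set(cpus)
    if not target:
        raise ValueError("none of the requested CPUs are available")
    os.sched_setaffinity(0, target)
    return os.sched_getaffinity(0)


def worker_affinity(_):
    """Report the affinity that a pool worker actually sees."""
    return sorted(os.sched_getaffinity(0))


if __name__ == "__main__":
    pinned = pin_current_process(range(32))   # CPUs 0-31, as in the question
    with Pool(processes=2) as pool:           # workers inherit the parent's affinity
        for cpus in pool.map(worker_affinity, range(2)):
            print(cpus == sorted(pinned))
```

The other route is the first synopsis form of taskset, which launches a new command with the given affinity, for example taskset -c 0-31 followed by the command to run.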

 

numactl

numactl runs a process with a given NUMA policy. In a multiprocessing environment it can also set the policy for shared memory or files.

 

https://linux.die.net/man/8/numactl

 

numactl(8) - Linux man page

Name numactl - Control NUMA policy for processes or shared memory Synopsis numactl [ --interleave nodes ] [ --preferred node ] [ --membind nodes ] [ --cpunodebind nodes ] [ --physcpubind cpus ] [ --localalloc ] [--] command {arg

linux.die.net

https://knight76.tistory.com/entry/numactl-커맨드

 

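The numactl command

The NUMA (Non-Uniform Memory Access) architecture is one of the computer memory designs used in multiprocessor systems, in which the time to access memory depends on the relative location of the memory and the processor...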

knight76.tistory.com
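As a rough illustration of how numactl is typically invoked, the sketch below builds a command line that binds both the CPUs and the memory of a program to a single NUMA node, using the --cpunodebind and --membind options from the synopsis above. The helper name numactl_command and the script name experiment.py are hypothetical. The script only prints the command, and additionally shows the node layout with numactl --hardware if numactl happens to be installed.

```python
import shlex
import shutil
import subprocess


def numactl_command(node, command):
    """Build a numactl invocation that keeps CPUs and memory on one NUMA node."""
    return ["numactl", f"--cpunodebind={node}", f"--membind={node}", "--", *command]


if __name__ == "__main__":
    # experiment.py is a placeholder for whatever program you want to run.
    cmd = numactl_command(0, ["python3", "experiment.py"])
    print(shlex.join(cmd))

    # Show the NUMA layout of this machine, if numactl is available.
    if shutil.which("numactl"):
        subprocess.run(["numactl", "--hardware"], check=False)
```

Binding CPU and memory to the same node is what keeps memory accesses local, which is the main motivation for pinning described at the top of this post.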

 

 

 

numactl vs taskset

There are two major tools out there for controlling NUMA access through command line (bash script). In this article, we focus on NUMACTL, but in practice, “taskset” is also often used.

https://yunmingzhang.wordpress.com/2015/07/22/numactl-notes-and-tutorialnumactl-localalloc-physcpubind04812162024283236404448525660646872768084889296100104108112116120124/

 

lscpu

To figure out which numbers correspond to which cores, the easiest way is to use “lscpu”, which gives the following information (among much other information):

NUMA node0 CPU(s):     0-11,24-35

NUMA node1 CPU(s):     12-23,36-47

This shows that 0-11 and 24-35 are the 24 hardware threads of the 12 cores in NUMA node 0. 24-35 can be thought of as the hyper threads.

“taskset -c 0-11 command” uses the 12 cores in socket 1 (NUMA node 0) without using their hyper threads.
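The lscpu lines above can also be turned into a node-to-CPU mapping programmatically. The following is a minimal sketch that parses the two sample lines quoted above. The names SAMPLE, expand, and numa_nodes are hypothetical, and real lscpu output may differ in spacing, so this is only meant to show the idea.

```python
import re

# The two lines quoted above; real lscpu output contains many more fields.
SAMPLE = """\
NUMA node0 CPU(s):     0-11,24-35
NUMA node1 CPU(s):     12-23,36-47
"""


def expand(cpu_list):
    """Expand a list such as "0-11,24-35" into individual CPU numbers."""
    cpus = []
    for part in cpu_list.split(","):
        first, _, last = part.partition("-")
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus


def numa_nodes(lscpu_text):
    """Map each NUMA node number to the CPU numbers that lscpu lists for it."""
    pattern = r"^\s*NUMA node(\d+) CPU\(s\):\s*(\S+)"
    return {
        int(m.group(1)): expand(m.group(2))
        for m in re.finditer(pattern, lscpu_text, re.MULTILINE)
    }


if __name__ == "__main__":
    for node, cpus in numa_nodes(SAMPLE).items():
        print(f"node {node}: {len(cpus)} hardware threads, first={cpus[0]}, last={cpus[-1]}")
```

For the sample, each node comes out with 24 hardware threads, which matches the description above.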

 

 

 

 

This is hard...

 

 
