TensorFlow Deep Learning Setup Notes (with CUDA)

cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
Shows the installed cuDNN version. (Use the first command for cuDNN 8 or later, the second for versions below 8.)
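
The same check can be done from Python. This is a minimal sketch that assumes the header paths used in the commands above; the function names parse_cudnn_version and find_cudnn_version are made up for illustration and are not part of any library.

```python
import re
from pathlib import Path

# Header locations taken from the commands above (assumed install prefix).
# cuDNN 8+ keeps the version macros in cudnn_version.h, older releases in cudnn.h.
HEADER_CANDIDATES = [
    Path("/usr/local/cuda/include/cudnn_version.h"),
    Path("/usr/local/cuda/include/cudnn.h"),
]


def parse_cudnn_version(header_text):
    """Return 'MAJOR.MINOR.PATCH' from header text, or None if a macro is missing."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#define\s+{key}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


def find_cudnn_version():
    """Try each candidate header and return the first version found, else None."""
    for path in HEADER_CANDIDATES:
        if path.is_file():
            version = parse_cudnn_version(path.read_text(errors="ignore"))
            if version:
                return version
    return None


if __name__ == "__main__":
    print(find_cudnn_version() or "cuDNN header not found")
```

If cuDNN is installed somewhere else, the paths in HEADER_CANDIDATES need to be adjusted.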

nvidia-smi
Shows the current GPU usage. (One-time snapshot.)

watch -n 1 nvidia-smi
Monitors GPU usage continuously. (Refreshes every second.)
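
To log the same numbers from a script, nvidia-smi can be asked for machine-readable output. The sketch below assumes nvidia-smi is on the PATH and uses its --query-gpu and --format=csv,noheader,nounits options; the helper names are hypothetical, and fields that the driver reports as non-numeric (for example "[N/A]") are stored as None.

```python
import shutil
import subprocess

QUERY = "utilization.gpu,memory.used,memory.total"


def _to_int(field):
    """Convert a CSV field to int, or None when it is not a plain number."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_gpu_csv(output):
    """Parse 'util, used, total' rows (no header, no units) into dicts."""
    rows = []
    for line in output.strip().splitlines():
        util, used, total = (_to_int(field) for field in line.split(","))
        rows.append({"util_percent": util, "used_mib": used, "total_mib": total})
    return rows


def query_gpus():
    """Run nvidia-smi once and return one dict per GPU."""
    cmd = ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    if shutil.which("nvidia-smi"):
        for index, gpu in enumerate(query_gpus()):
            print(index, gpu)
    else:
        print("nvidia-smi not found")
```

Calling query_gpus() in a loop with time.sleep(1) would mimic the one-second refresh of the watch command.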

htop
Monitors CPU and memory usage in real time.
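
For a quick reading without an interactive tool, a rough equivalent can be sketched with the standard library only. This assumes a Linux system that exposes /proc/meminfo with MemTotal and MemAvailable fields; the function names are illustrative.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: kiB} dict."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_used_percent(info):
    """Percentage of RAM in use, based on MemTotal and MemAvailable."""
    total = info["MemTotal"]
    available = info["MemAvailable"]
    return 100.0 * (total - available) / total


if __name__ == "__main__":
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as handle:
            info = parse_meminfo(handle.read())
        print(f"memory used: {memory_used_percent(info):.1f}%")
        # 1-, 5- and 15-minute load averages (Unix only).
        print("load average:", os.getloadavg())
    else:
        print("/proc/meminfo not available on this system")
```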

Python code

List the hardware devices that can run TensorFlow code.

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

Print the TensorFlow version and check that the GPU is working.

import tensorflow as tf
print(tf.__version__)
# Deprecated in TensorFlow 2.x in favor of tf.config.list_physical_devices('GPU').
print(tf.test.is_gpu_available())
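
Since tf.test.is_gpu_available() is deprecated in TensorFlow 2.x, the check can also be written with tf.config.list_physical_devices. The sketch below assumes TensorFlow 2.1 or later; gpu_report is a hypothetical helper name, and it returns an empty result instead of failing when TensorFlow is not installed.

```python
def gpu_report():
    """Summarize the TensorFlow version and the GPUs it can see."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return {"tensorflow": None, "built_with_cuda": False, "gpus": []}
    gpus = tf.config.list_physical_devices("GPU")
    return {
        "tensorflow": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [gpu.name for gpu in gpus],
    }


if __name__ == "__main__":
    print(gpu_report())
```

An empty "gpus" list together with "built_with_cuda": True usually points to a driver or library path problem rather than a CPU-only build.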

If is_gpu_available() returns False,
add the line below to '~/.bash_profile' and run 'source ~/.bash_profile'. (Based on CUDA 10.0.)
If the error persists, see [link].

export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
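
To confirm that the setting is visible to the Python process, the variable can be inspected directly. This is a small sketch; library_path_contains is a made-up helper, and the CUDA 10.0 path is the one assumed in this post.

```python
import os

CUDA_LIB_DIR = "/usr/local/cuda-10.0/lib64"  # path assumed from the export line above


def library_path_contains(directory, ld_library_path=None):
    """Return True if `directory` is one of the entries in LD_LIBRARY_PATH."""
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    wanted = os.path.normpath(directory)
    entries = [os.path.normpath(p) for p in ld_library_path.split(":") if p]
    return wanted in entries


if __name__ == "__main__":
    if library_path_contains(CUDA_LIB_DIR):
        print("CUDA library directory is on LD_LIBRARY_PATH")
    else:
        print("CUDA library directory is missing from LD_LIBRARY_PATH")
```

Note that a Jupyter server started before editing '~/.bash_profile' keeps its old environment, so it has to be restarted from a fresh shell.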

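Enable memory growth so that TensorFlow allocates VRAM on demand instead of reserving nearly all of it at startup, which makes 'Out of Memory' errors less frequent.
-> [Recommended] Insert at the very top of the source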

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)
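
As an alternative sketch, TensorFlow also reads the TF_FORCE_GPU_ALLOW_GROWTH environment variable, which turns on memory growth without touching tf.config. The variable has to be set before TensorFlow initializes the GPUs, so the simplest approach is to set it before the first import; the helper name below is hypothetical.

```python
import os


def enable_gpu_memory_growth_via_env():
    """Set TF_FORCE_GPU_ALLOW_GROWTH unless the user already chose a value."""
    os.environ.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")
    return os.environ["TF_FORCE_GPU_ALLOW_GROWTH"]


# Call this at the very top of the script, before importing TensorFlow.
enable_gpu_memory_growth_via_env()
# import tensorflow as tf  # import TensorFlow only after the variable is set
```

Because setdefault is used, a value exported in the shell takes precedence over the one set here.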

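Shut down the process to free VRAM. (The snippet goes through the IPython kernel, so it applies to Jupyter notebooks.)
-> [Recommended] Insert at the very end of the source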

import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)
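
Outside a notebook, a similar effect can be had by running the GPU work in a child process, since the driver releases a process's VRAM when that process exits. The following is only a sketch of that idea: run_isolated is a hypothetical helper, and the code string passed to it stands in for a real training script.

```python
import subprocess
import sys


def run_isolated(code):
    """Run Python source in a child interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child fails
    )
    return result.stdout


if __name__ == "__main__":
    # Placeholder workload; a real script would build and train a model here.
    print(run_isolated("print('training finished')"), end="")
```

The parent process never imports TensorFlow in this setup, so it holds no GPU memory of its own.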


SHA Computing

How Sunghyun handles computer
