TensorFlow Basics: Understanding the Module, Layer, and Model Class Structure


방황하는 데이터불도저 2024. 3. 27. 19:54

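TensorFlow is one of the most widely used deep learning frameworks. In this post I look at how it is structured and how it can be used, summarizing what I found in the official documentation.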

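In a neural network model, a layer is a function with a known mathematical structure. In deep learning, a layer holds trainable variables such as weights and biases, and training is the process of moving them toward suitable values. TensorFlow provides functionality for reusing such layers as functions and for saving and loading their variables.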

The TensorFlow class that supports this is tf.Module. Let's look at a simple usage example.

A class that inherits from tf.Module can be defined with the following two functions.

__init__ function : declares the variables that the computation will use
__call__ function : defines the computation, combining the variables declared in __init__ with the input passed to the call

import tensorflow as tf

class SimpleModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)
    self.a_variable = tf.Variable(5.0, name="train_me")
    self.non_trainable_variable = tf.Variable(5.0, trainable=False, name="do_not_train_me")
  def __call__(self, x):
    return self.a_variable * x + self.non_trainable_variable

simple_module = SimpleModule(name="simple")

simple_module(tf.constant(5.0))

# result
# <tf.Tensor: shape=(), dtype=float32, numpy=30.0>

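Modules can also take their sizes as constructor arguments and be combined with one another. The example below defines a Dense module whose input and output sizes are passed to __init__, then builds a two-layer model from it.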

class Dense(tf.Module):
  def __init__(self, in_features, out_features, name=None):
    super().__init__(name=name)
    self.w = tf.Variable(
      tf.random.normal([in_features, out_features]), name='w')
    self.b = tf.Variable(tf.zeros([out_features]), name='b')
  def __call__(self, x):
    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)
    
    
class SequentialModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)

    self.dense_1 = Dense(in_features=3, out_features=3)
    self.dense_2 = Dense(in_features=3, out_features=2)

  def __call__(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)


# You have made a model!
my_model = SequentialModule(name="the_model")

# Call it, with random results
print("Model results:", my_model(tf.constant([[2.0, 2.0, 2.0]])))

# result
# Model results: tf.Tensor([[0.       3.415034]], shape=(1, 2), dtype=float32)
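To make the computation inside Dense concrete, here is a minimal sketch that overwrites the random weights with hand-picked values and evaluates relu(x·w + b). The numbers are my own assumptions, chosen only so the result can be checked by hand.

```python
import tensorflow as tf

class Dense(tf.Module):
  def __init__(self, in_features, out_features, name=None):
    super().__init__(name=name)
    self.w = tf.Variable(
      tf.random.normal([in_features, out_features]), name='w')
    self.b = tf.Variable(tf.zeros([out_features]), name='b')
  def __call__(self, x):
    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)

dense = Dense(in_features=2, out_features=2)

# Hypothetical weights, picked for easy mental arithmetic.
dense.w.assign([[1.0, -1.0], [0.5, -2.0]])
dense.b.assign([0.5, 0.0])

out = dense(tf.constant([[1.0, 2.0]]))
# x @ w = [1*1 + 2*0.5, 1*(-1) + 2*(-2)] = [2.0, -5.0]
# adding b gives [2.5, -5.0]; relu clips the negative entry to 0.0
print(out.numpy())
```

With fixed weights the output is deterministic, unlike the random results shown above.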

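tf.Variable, used here, is a tensor whose value can be changed by running operations on it, and it is an important class in TensorFlow's design. It stores the model's parameters and offers various operations for updating its value. More details can be found in the official tf.Variable documentation.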
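As a small illustration of those update operations, the sketch below uses assign, assign_add, and assign_sub, which are methods of tf.Variable. The variable name and values are made up for the example.

```python
import tensorflow as tf

v = tf.Variable([1.0, 2.0], name="v")

v.assign([3.0, 4.0])      # overwrite the stored value
v.assign_add([1.0, 1.0])  # in-place addition: now [4.0, 5.0]
v.assign_sub([0.5, 0.5])  # in-place subtraction: now [3.5, 4.5]

result = v.numpy()
print(result)
```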

A tf.Module instance automatically collects every tf.Variable assigned to it, so it can be used to manage the tf.Modules and tf.Variables inside a model. This makes features like the following possible.

# All trainable variables
print("trainable variables:", simple_module.trainable_variables)
# Every variable
print("all variables:", simple_module.variables)

# result
# trainable variables: (<tf.Variable 'train_me:0' shape=() dtype=float32, numpy=5.0>,)
# all variables: (<tf.Variable 'train_me:0' shape=() dtype=float32, numpy=5.0>, <tf.Variable 'do_not_train_me:0' shape=() dtype=float32, numpy=5.0>)

print("Submodules:", my_model.submodules)
# Submodules: (<__main__.Dense object at 0x7fc324244640>, <__main__.Dense object at 0x7fc21807c8e0>)

for var in my_model.variables:
  print(var, "\n")
# result
# <tf.Variable 'b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)> 
# 
# <tf.Variable 'w:0' shape=(3, 3) dtype=float32, numpy=
# array([[-0.22808331,  0.29274654,  0.6080226 ],
#       [-1.1041229 , -0.5975617 , -0.7721161 ],
#       [ 0.4206435 ,  0.31748644,  0.34665376]], dtype=float32)> 
# 
# <tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)> 
# 
# <tf.Variable 'w:0' shape=(3, 2) dtype=float32, numpy=
# array([[-1.2076675 ,  1.7211282 ],
#       [-0.22784151,  0.5912422 ],
#       [-0.9780839 , -1.3760505 ]], dtype=float32)>
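One common use of the collected trainable_variables is to hand them to tf.GradientTape. The following is a minimal sketch of a single manual gradient-descent step on SimpleModule; the target value, the squared-error loss, and the learning rate are arbitrary assumptions for illustration, not part of the original example.

```python
import tensorflow as tf

class SimpleModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)
    self.a_variable = tf.Variable(5.0, name="train_me")
    self.non_trainable_variable = tf.Variable(
      5.0, trainable=False, name="do_not_train_me")
  def __call__(self, x):
    return self.a_variable * x + self.non_trainable_variable

module = SimpleModule(name="simple")
x = tf.constant(2.0)
target = tf.constant(11.0)  # hypothetical target value

with tf.GradientTape() as tape:
  # module(x) = 5 * 2 + 5 = 15, so the loss is (15 - 11)^2 = 16
  loss = tf.square(module(x) - target)

# Only the trainable variable receives a gradient: 2 * (15 - 11) * x = 16
grads = tape.gradient(loss, module.trainable_variables)

learning_rate = 0.1  # hypothetical step size
for var, grad in zip(module.trainable_variables, grads):
  var.assign_sub(learning_rate * grad)

print(module.a_variable.numpy(), module.non_trainable_variable.numpy())
```

The variable created with trainable=False is left untouched, because it does not appear in trainable_variables.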

In addition, a tf.Module can be saved either as a checkpoint, which stores all of its parameters (= weights = tf.Variable objects), or as a SavedModel, which also stores the model's computation structure.
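A checkpoint round trip can be sketched roughly as follows, using tf.train.Checkpoint. The Scale module and the temporary path are hypothetical and exist only for this example.

```python
import os
import tempfile

import tensorflow as tf

class Scale(tf.Module):  # hypothetical module for illustration
  def __init__(self, name=None):
    super().__init__(name=name)
    self.w = tf.Variable(2.0, name="w")
  def __call__(self, x):
    return self.w * x

module = Scale(name="scale")
ckpt_path = os.path.join(tempfile.mkdtemp(), "my_checkpoint")

# Write the current variable values to disk.
checkpoint = tf.train.Checkpoint(model=module)
checkpoint.write(ckpt_path)

# Change the variable, then restore the saved value.
module.w.assign(100.0)
checkpoint.restore(ckpt_path)

print(module.w.numpy())
```

Only the variable values are stored here; the Python class is still needed to rebuild the module before restoring.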

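A SavedModel is what lets a model run without the original Python objects, as TensorFlow Serving and TensorFlow Lite do, and it is also the form in which pretrained models are downloaded from TensorFlow Hub; tf.Module is the base class that makes this possible. (This relies on the concept of graphs and the @tf.function decorator. The details are in "Introduction to graphs and tf.function", and I will cover them in the next post.)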
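As a rough sketch of the SavedModel path, the example below exports a module with tf.saved_model.save and loads it back with tf.saved_model.load. The Scale module, the input_signature, and the export directory are my assumptions for illustration.

```python
import os
import tempfile

import tensorflow as tf

class Scale(tf.Module):  # hypothetical module for illustration
  def __init__(self, name=None):
    super().__init__(name=name)
    self.w = tf.Variable(3.0, name="w")

  # The signature (a 1-D float32 tensor) is an assumption; it lets the
  # graph be traced at save time without calling the module first.
  @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
  def __call__(self, x):
    return self.w * x

module = Scale(name="scale")
export_dir = os.path.join(tempfile.mkdtemp(), "scale_saved_model")
tf.saved_model.save(module, export_dir)

# The loaded object is rebuilt from the saved graph, not from the Scale class.
restored = tf.saved_model.load(export_dir)
result = restored(tf.constant([1.0, 2.0]))
print(result.numpy())
```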

In TensorFlow, Keras provides model layers as a high-level API built on top of the tf.Module class.

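The Keras API supports not only the many kinds of layers commonly used in deep learning (dense, convolution, pooling, normalization, and so on) but also tools such as optimizers, metrics, and losses. These can be saved and loaded as well, and a layer can tell whether it is running in a training step or an inference step, which makes reuse easy. See the Keras API documentation.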

* Keras and Sonnet are libraries that provide neural network models for a variety of purposes; importing their modules makes steps such as training and inference more convenient.
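The training/inference distinction mentioned above can be seen with a built-in layer. This is a minimal sketch using tf.keras.layers.Dropout and the training argument of the layer call; the rate and the input are arbitrary choices for illustration.

```python
import tensorflow as tf

dropout = tf.keras.layers.Dropout(rate=0.5)
x = tf.ones([1, 8])

# Inference: dropout is a no-op, so the input passes through unchanged.
inference_out = dropout(x, training=False)

# Training: each element is either dropped (0.0) or scaled by
# 1 / (1 - rate) = 2.0 so that the expected sum stays the same.
training_out = dropout(x, training=True)

print(inference_out.numpy())
print(training_out.numpy())
```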

tf.keras.layers.Layer is the base class of the layers Keras provides, and it inherits from tf.Module. You can turn a module into a Keras layer by changing the parent class and renaming __call__ to call.

class MyDense(tf.keras.layers.Layer):

  def __init__(self, in_features, out_features, **kwargs):
    super().__init__(**kwargs)
    self.w = tf.Variable(
      tf.random.normal([in_features, out_features]), name='w')
    self.b = tf.Variable(tf.zeros([out_features]), name='b')
    
  def call(self, x):
    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)

simple_layer = MyDense(name="simple", in_features=3, out_features=3)

simple_layer([[2.0, 2.0, 2.0]])

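In particular, Keras layers provide a build function. It is called only once, the first time the layer is called, so the variables can be created from the shape of the incoming input and are not declared again on later calls.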

class FlexibleDense(tf.keras.layers.Layer):
  # Note the added `**kwargs`, as Keras supports many arguments
  def __init__(self, out_features, **kwargs):
    super().__init__(**kwargs)
    self.out_features = out_features

  def build(self, input_shape):  # Create the state of the layer (weights)
    self.w = tf.Variable(
      tf.random.normal([input_shape[-1], self.out_features]), name='w')
    self.b = tf.Variable(tf.zeros([self.out_features]), name='b')

  def call(self, inputs):  # Defines the computation from inputs to outputs
    return tf.matmul(inputs, self.w) + self.b

# Create the instance of the layer
flexible_dense = FlexibleDense(out_features=3)

Before build

flexible_dense.variables
# []

After build

# Call it, with predictably random results
print("Model results:", flexible_dense(tf.constant([[2.0, 2.0, 2.0], [3.0, 3.0, 3.0]])))

# Model results: tf.Tensor(
# [[-2.8120208  2.4438493  1.9408028]
#  [-4.2180314  3.6657739  2.9112043]], shape=(2, 3), dtype=float32)

flexible_dense.variables
# [<tf.Variable 'flexible_dense/w:0' shape=(3, 3) dtype=float32, numpy=
#  array([[-0.93394387,  0.07000035,  0.50755775],
#         [ 0.27968523,  0.8088689 ,  0.70035136],
#         [-0.7517518 ,  0.34305546, -0.23750767]], dtype=float32)>,
#  <tf.Variable 'flexible_dense/b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]

After the first call the layer is not built again, so passing an input with a different shape raises an error.

try:
  print("Model results:", flexible_dense(tf.constant([[2.0, 2.0, 2.0, 2.0]])))
except tf.errors.InvalidArgumentError as e:
  print("Failed:", e)
  
# Failed: Exception encountered when calling layer 'flexible_dense' (type FlexibleDense).
# 
# { {function_node __wrapped__MatMul_device_/job:localhost/replica:0/task:0/device:GPU:0} } Matrix size-incompatible: In[0]: [1,4], In[1]: [3,3] [Op:MatMul]
# 
# Call arguments received by layer 'flexible_dense' (type FlexibleDense):
#   • inputs=tf.Tensor(shape=(1, 4), dtype=float32)
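To check the "built only once" behaviour directly, here is a minimal sketch of a layer that counts how many times build runs. The CountingDense class and its build_calls counter are hypothetical. I also create the weights with add_weight rather than raw tf.Variable, on the assumption that this is tracked consistently across Keras versions.

```python
import tensorflow as tf

class CountingDense(tf.keras.layers.Layer):  # hypothetical layer
  def __init__(self, out_features, **kwargs):
    super().__init__(**kwargs)
    self.out_features = out_features
    self.build_calls = 0  # illustration-only counter

  def build(self, input_shape):
    self.build_calls += 1
    # The weight shapes are derived from the last dimension of the first input.
    self.w = self.add_weight(
      name="w", shape=(input_shape[-1], self.out_features),
      initializer="random_normal")
    self.b = self.add_weight(
      name="b", shape=(self.out_features,), initializer="zeros")

  def call(self, inputs):
    return tf.matmul(inputs, self.w) + self.b

layer = CountingDense(out_features=4)
out_1 = layer(tf.ones([2, 3]))  # first call triggers build
out_2 = layer(tf.ones([5, 3]))  # same feature size, so build is skipped

print(layer.build_calls, tuple(layer.w.shape), tuple(out_2.shape))
```

The batch size may change between calls; only the feature dimension is fixed by the first build.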

tf.keras.Model is the class for using several Keras layers together as a model; it makes training (including on multiple devices), evaluation, loading, and saving convenient. It inherits from tf.keras.layers.Layer. See the API documentation.

Starting from code almost identical to the SequentialModule defined above, you can create a model class by changing only the parent class and the call function.

class MySequentialModel(tf.keras.Model):
  def __init__(self, name=None, **kwargs):
    # Forward `name` so that "the_model" is not silently dropped.
    super().__init__(name=name, **kwargs)

    self.dense_1 = FlexibleDense(out_features=3)
    self.dense_2 = FlexibleDense(out_features=2)
  def call(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)

# You have made a Keras model!
my_sequential_model = MySequentialModel(name="the_model")

# Call it on a tensor, with random results
print("Model results:", my_sequential_model(tf.constant([[2.0, 2.0, 2.0]])))
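The training and evaluation conveniences of tf.keras.Model can be sketched as follows. This is only an illustration under my own assumptions: the TinyModel class, the built-in tf.keras.layers.Dense layers (used instead of FlexibleDense), the random data, and the optimizer and loss settings are all made up for the example.

```python
import numpy as np
import tensorflow as tf

class TinyModel(tf.keras.Model):  # hypothetical model
  def __init__(self, name=None, **kwargs):
    super().__init__(name=name, **kwargs)
    self.dense_1 = tf.keras.layers.Dense(3, activation="relu")
    self.dense_2 = tf.keras.layers.Dense(1)

  def call(self, x):
    return self.dense_2(self.dense_1(x))

# Made-up regression data: the target is the sum of the three features.
x_train = np.random.rand(32, 3).astype("float32")
y_train = x_train.sum(axis=1, keepdims=True)

model = TinyModel(name="tiny_model")
model.compile(optimizer="sgd", loss="mse")

history = model.fit(x_train, y_train, epochs=2, batch_size=8, verbose=0)
eval_loss = model.evaluate(x_train, y_train, verbose=0)
preds = model.predict(x_train[:2], verbose=0)

print(history.history["loss"], eval_loss, preds.shape)
```

compile, fit, evaluate, and predict all come from tf.keras.Model, so the subclass only has to define its layers and call.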

You can also define a model in a very Pythonic way, with the Keras functional API. This is the style of model declaration you will see most often in TensorFlow.

inputs = tf.keras.Input(shape=[3,])

x = FlexibleDense(3)(inputs)
x = FlexibleDense(2)(x)

my_functional_model = tf.keras.Model(inputs=inputs, outputs=x)

my_functional_model.summary()
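As a quick check on what summary() reports, the sketch below builds the same two-layer shape with the built-in tf.keras.layers.Dense (my substitution for FlexibleDense, to keep the example self-contained) and counts the parameters by hand.

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(3,))
x = tf.keras.layers.Dense(3)(inputs)
outputs = tf.keras.layers.Dense(2)(x)

model = tf.keras.Model(inputs=inputs, outputs=outputs)

# First layer: 3*3 weights + 3 biases = 12
# Second layer: 3*2 weights + 2 biases = 8
n_params = model.count_params()

out = model(tf.ones([4, 3]))
print(n_params, tuple(out.shape))
```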

That covers the three basic classes used to build models in TensorFlow: tf.Module, tf.keras.layers.Layer, and tf.keras.Model. More detailed and concrete usage can be found at the following pages.

Making new layers and models via subclassing | TensorFlow Core (www.tensorflow.org): Complete guide to writing Layer and Model objects from scratch.

Keras: The high-level API for TensorFlow | TensorFlow Core (www.tensorflow.org): tf.keras is TensorFlow's high-level API for building and training deep learning models.
