2022.07.22 - [Studying/Machine Learning] - [Machine Learning] Implementing a CNN model with PyTorch (CIFAR-10 dataset)
In the previous post we implemented a CNN model from scratch; in this post we will load a pretrained model and use it for classification.
VGG-19
The VGG-19 architecture stacks convolutional blocks followed by fully connected layers. Let's check the exact layer structure by loading the model and printing it.
Code
First, import the required packages and load the dataset.
import torch
import torch.nn as nn
import torchvision.datasets as datasets
import torchvision.transforms as transforms
def CIFAR10_DATA(root='./data/', download=True, batch_size=32, num_workers=1):
    print("[+] Get the CIFAR10 DATA")
    # 50,000 training images
    train_dataset = datasets.CIFAR10(root=root, train=True, transform=transforms.ToTensor(), download=download)
    # 10,000 test images
    test_dataset = datasets.CIFAR10(root=root, train=False, transform=transforms.ToTensor())
    train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers)
    test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False, num_workers=num_workers)
    print("[+] Finished loading data & preprocessing")
    return train_dataset, test_dataset, train_loader, test_loader
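The call that actually creates the loaders isn't shown in the post; the training loop below assumes they were created along these lines (the batch size matches the hyperparameter set later):
# Not shown in the original post: create the datasets and loaders used below.
train_dataset, test_dataset, train_loader, test_loader = CIFAR10_DATA(batch_size=32)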
Model
import torchvision.models as models

pre_vgg = models.vgg19_bn(pretrained=True)  # load the pretrained model
pre_vgg
# VGG(
# (features): Sequential(
# (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (2): ReLU(inplace=True)
# (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (5): ReLU(inplace=True)
# (6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (7): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (8): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (9): ReLU(inplace=True)
# (10): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (11): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (12): ReLU(inplace=True)
# (13): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (14): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (15): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (16): ReLU(inplace=True)
# (17): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (18): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (19): ReLU(inplace=True)
# (20): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (21): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (22): ReLU(inplace=True)
# (23): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (24): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (25): ReLU(inplace=True)
# (26): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (27): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (28): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (29): ReLU(inplace=True)
# (30): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (31): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (32): ReLU(inplace=True)
# (33): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (34): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (35): ReLU(inplace=True)
# (36): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (37): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (38): ReLU(inplace=True)
# (39): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# (40): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (41): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (42): ReLU(inplace=True)
# (43): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (44): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (45): ReLU(inplace=True)
# (46): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (47): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (48): ReLU(inplace=True)
# (49): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
# (50): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
# (51): ReLU(inplace=True)
# (52): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
# )
# (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
# (classifier): Sequential(
# (0): Linear(in_features=25088, out_features=4096, bias=True)
# (1): ReLU(inplace=True)
# (2): Dropout(p=0.5, inplace=False)
# (3): Linear(in_features=4096, out_features=4096, bias=True)
# (4): ReLU(inplace=True)
# (5): Dropout(p=0.5, inplace=False)
# (6): Linear(in_features=4096, out_features=1000, bias=True)
# )
# )
Looking at the printed structure, each conv layer in the usual VGG-19 diagram is actually a Conv2d-BatchNorm2d-ReLU block (vgg19_bn is the batch-norm variant). The network consists of 16 convolutional layers and 3 fully connected layers, 19 weight layers in total.
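As an aside, the `pretrained=True` flag was deprecated starting with torchvision 0.13; on recent versions the same ImageNet weights are requested through the `weights` argument:
# torchvision >= 0.13: select the ImageNet weights explicitly.
pre_vgg = models.vgg19_bn(weights=models.VGG19_BN_Weights.IMAGENET1K_V1)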
Now let's adapt the model to the CIFAR-10 data. Since CIFAR-10 images are 32×32, the five max-pooling stages in `features` reduce them to 1×1×512 feature maps, so the new classifier takes 512 input features (instead of 25088) and must output 10 classes (instead of 1000).
class pretrained_vgg(nn.Module):
    def __init__(self, pre_vgg):
        super(pretrained_vgg, self).__init__()
        # reuse all of the pretrained convolutional layers
        self.features = nn.Sequential(
            *list(pre_vgg.features.children())
        )
        # replace the FC layers to match the 10 CIFAR-10 classes
        self.linear = nn.Sequential(
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 128),
            nn.ReLU(),
            nn.Linear(128, 10),
        )

    def forward(self, x):
        out = self.features(x)
        out = torch.flatten(out, 1)  # [N, 512, 1, 1] -> [N, 512]; safer than squeeze() when N == 1
        out = self.linear(out)
        return out
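A quick sanity check (my addition, not in the original post) that the adapted model maps a CIFAR-10-sized batch to one logit per class:
# A random batch of 32x32 RGB images should yield [N, 10] logits.
dummy = torch.randn(4, 3, 32, 32)
print(pretrained_vgg(pre_vgg)(dummy).shape)  # torch.Size([4, 10])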
Training
# set hyperparameters
batch_size = 32
learning_rate = 0.0001
num_epochs = 4

vgg19 = pretrained_vgg(pre_vgg)
vgg19 = vgg19.cuda()  # move the model to the GPU for faster training
criterion = nn.CrossEntropyLoss()  # loss function: cross-entropy loss
optimizer = torch.optim.Adam(vgg19.parameters(), lr=learning_rate)
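If you only want to train the new classifier head, you can optionally freeze the pretrained features first. This is a minimal sketch of that variation, my addition rather than what the post itself does (the post fine-tunes the whole network):
# Optional (not in the original post): freeze the pretrained feature
# extractor so only the new classifier layers are updated.
for param in vgg19.features.parameters():
    param.requires_grad = False
# Pass only the trainable parameters to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in vgg19.parameters() if p.requires_grad), lr=learning_rate)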
accuracy_list = []
for epoch in range(num_epochs):
    vgg19.train()
    for i, (images, labels) in enumerate(train_loader):
        images = images.cuda()
        labels = labels.cuda()
        # Forward + Backward + Optimize
        optimizer.zero_grad()
        outputs = vgg19(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        if (i+1) % 100 == 0:
            print('Epoch [%d/%d], Iter [%d/%d] Loss: %.4f'
                  % (epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.item()))
Checking the output, the loss is clearly low given how little training was done. When the target images are similar to the images the model was pretrained on like this, you can reuse a pretrained model after adjusting only a small part of its structure.
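The `accuracy_list` above is declared but the evaluation step isn't shown in the post; here is a minimal sketch of it, assuming the `test_loader` from earlier:
# Evaluation sketch (my addition): classification accuracy on the test set.
vgg19.eval()
correct, total = 0, 0
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.cuda(), labels.cuda()
        predicted = vgg19(images).argmax(dim=1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Test accuracy: %.2f%%' % (100.0 * correct / total))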
Models other than VGG can be loaded the same way, but depending on your machine's specs a very large model will take a long time to train, so it's best to pick one of a reasonable size.
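For instance, a lighter pretrained model can be swapped in the same way. A sketch of my own, not from the post:
# A smaller alternative: ResNet-18 pretrained on ImageNet.
# Its final fc layer likewise needs replacing for the 10 CIFAR-10 classes.
resnet = models.resnet18(pretrained=True)
resnet.fc = nn.Linear(resnet.fc.in_features, 10)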