Day 13 Study Notes - Convolutional Neural Networks

Boostcamp AI Tech 1st cohort [T1209 최보미]/U stage

B1001101 2021. 2. 3. 23:52

Lecture review

1. CNN

1) Convolution

Continuous convolution

Discrete convolution

2D image convolution
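
As a rough illustration of 2D image convolution, the sketch below slides a small kernel over an image and sums the element-wise products at each position. It assumes the deep-learning convention (no kernel flip, stride 1, no padding); the names conv2d, image and kernel are made up for this example.

```python
def conv2d(image, kernel):
    """Valid 2D convolution, CNN-style: no kernel flip, stride 1, no padding."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(h - kh + 1):
        row = []
        for j in range(w - kw + 1):
            # Sum of element-wise products between the kernel and the window at (i, j)
            total = 0
            for p in range(kh):
                for q in range(kw):
                    total += image[i + p][j + q] * kernel[p][q]
            row.append(total)
        out.append(row)
    return out


# Toy 4x4 image and 2x2 kernel (illustrative values only)
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d(image, kernel)
print(feature_map)  # a 3x3 feature map
```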

2) Convolutional Neural Networks

convolutional layer: each filter produces one feature map, and these feature maps are stacked along the channel dimension
pooling layer: reduces the spatial dimensions
- Max pooling layer: the maximum of the pixels inside the window
- Average pooling layer: the mean of the pixels inside the window
fully connected layer: decision making (e.g. classification)
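
The two pooling variants above can be sketched as follows. This is a minimal version that assumes non-overlapping windows (stride equal to the window size); pool2d and fmap are hypothetical names.

```python
def pool2d(feature_map, size, mode="max"):
    """Non-overlapping pooling: stride equals the window size."""
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for i in range(0, h - size + 1, size):
        row = []
        for j in range(0, w - size + 1, size):
            # Collect the pixels inside the current window
            window = [feature_map[i + p][j + q]
                      for p in range(size) for q in range(size)]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        out.append(row)
    return out


fmap = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [3, 4, 5, 6],
]
max_pooled = pool2d(fmap, 2, mode="max")
avg_pooled = pool2d(fmap, 2, mode="avg")
```

Both calls turn the 4x4 map into a 2x2 map, which is the dimension reduction mentioned above.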

3) Convolution Arithmetic

Stride

Padding
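
Stride and padding together determine the output size of a convolution. A small sketch, assuming a square input of side n, no dilation, and the usual formula floor((n + 2*padding - kernel) / stride) + 1; conv_output_size is a name chosen for this example.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension (no dilation)."""
    return (n + 2 * padding - kernel) // stride + 1


# 3x3 kernel, stride 1, padding 1 keeps the size unchanged
same_size = conv_output_size(32, kernel=3, stride=1, padding=1)
# stride 2 without padding roughly halves the size
halved = conv_output_size(7, kernel=3, stride=2, padding=0)
```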

Number of parameters: K²CM (bias terms not counted)
- K: kernel size
- C: number of input channels
- M: number of output channels
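
A quick check of the K²CM formula, ignoring biases. The layer sizes below are only example values, and conv_params is a hypothetical helper.

```python
def conv_params(k, c_in, c_out):
    """Weights in a k x k convolution layer: K^2 * C * M (biases ignored)."""
    return k * k * c_in * c_out


# 3x3 kernel, 128 input channels, 64 output channels
example = conv_params(3, 128, 64)
# 11x11 kernel on an RGB input with 96 output channels
first_layer = conv_params(11, 3, 96)
```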

4) 1x1 Convolution

Dimension reduction
To reduce the number of parameters while increasing the depth
e.g. bottleneck architecture
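
To see why a 1x1 convolution reduces parameters, the sketch below compares a direct 3x3 convolution (128 → 128 channels) with a bottleneck that first shrinks to 32 channels with a 1x1 convolution. The channel counts are assumed example values, and biases are ignored.

```python
def conv_params(k, c_in, c_out):
    """Weights in a k x k convolution layer (biases ignored)."""
    return k * k * c_in * c_out


# Direct: one 3x3 convolution, 128 -> 128 channels
direct = conv_params(3, 128, 128)

# Bottleneck: 1x1 convolution 128 -> 32, then 3x3 convolution 32 -> 128
bottleneck = conv_params(1, 128, 32) + conv_params(3, 32, 128)

print(direct, bottleneck)
```

The bottleneck version is one layer deeper but needs well under a third of the weights.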

2. Modern CNN

1) ILSVRC

ImageNet Large-Scale Visual Recognition Challenge
Classification / Detection / Localization / Segmentation
1000 different categories
Over 1 million images
Training set: 456,567 images

2) AlexNet

Architecture
- 11x11x3 filters
- 5 convolutional layers + 3 dense layers
Number of parameters: 60M

Key ideas
- ReLU activation
- GPU implementation (2 GPUs)
- Local response normalization, Overlapping pooling
- Data augmentation
- Dropout

3) VGGNet

increasing depth with 3x3 convolution filters (with stride 1)
1x1 convolution for fully connected layers
Dropout (p=0.5)
VGG16, VGG19
Number of parameters (19 layers): 110M
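
One way to see the appeal of stacking 3x3 filters: two stacked 3x3 layers (stride 1) cover the same 5x5 receptive field as one 5x5 layer, with fewer weights. The sketch assumes 128 channels throughout, which is just an example value; the function names are made up.

```python
def conv_params(k, c_in, c_out):
    """Weights in a k x k convolution layer (biases ignored)."""
    return k * k * c_in * c_out


def receptive_field(num_layers, k):
    """Receptive field of num_layers stacked k x k convolutions with stride 1."""
    return 1 + num_layers * (k - 1)


channels = 128  # assumed channel count for the comparison
two_3x3 = 2 * conv_params(3, channels, channels)
one_5x5 = conv_params(5, channels, channels)
rf_two_3x3 = receptive_field(2, 3)
```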

4) GoogLeNet

22 layers
Number of parameters: 4M
network-in-network + inception blocks
Advantage of the inception block: fewer parameters (thanks to 1x1 convolution)

5) ResNet

Add an identity map after nonlinear activations
- skip connection: f(x) → x + f(x)
Batch normalization after convolutions
Bottleneck architecture
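
The skip connection x + f(x) can be sketched on plain Python lists. This is only a toy: toy_branch stands in for the real convolution + batch norm + activation branch and is entirely hypothetical.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def residual_block(x, f):
    """Return x + f(x); the branch f must keep the vector length unchanged."""
    fx = f(x)
    if len(fx) != len(x):
        raise ValueError("branch output must match the input shape")
    return [xi + fi for xi, fi in zip(x, fx)]


def toy_branch(x):
    # Hypothetical stand-in for the learned branch: scale by 0.5, then ReLU
    return relu([0.5 * xi for xi in x])


x = [1.0, -2.0, 3.0]
y = residual_block(x, toy_branch)
# If the branch outputs zeros, the block reduces to the identity map
identity = residual_block(x, lambda v: [0.0] * len(v))
```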

6) DenseNet

Uses concatenation instead of addition

Dense Block
- Each layer concatenates the feature maps of all preceding layers
- The number of channels increases geometrically
Transition Block
- BatchNorm → 1x1 Conv → 2x2 AvgPooling
- Dimension reduction

3. Computer Vision Applications

1) Semantic Segmentation

2) Fully Convolutional Network

CNN	Fully Convolutional Network

Convolutionalization: CNN → Fully Convolutional Network
Replacing the fully connected layers with convolution layers makes the network output a heatmap
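
Convolutionalization does not change the number of weights: a fully connected layer on a flattened H x W x C input has as many weights as a convolution whose kernel spans the whole H x W input. A sketch with assumed sizes (4x4x16 input, 10 outputs); the helper names are hypothetical.

```python
def fc_params(h, w, c, m):
    """Weights of a dense layer on a flattened h x w x c input with m outputs."""
    return (h * w * c) * m


def conv_params(kh, kw, c, m):
    """Weights of a kh x kw convolution with c input and m output channels."""
    return kh * kw * c * m


def heatmap_size(n, k):
    """Output side length when a k x k kernel slides over an n x n input (stride 1)."""
    return n - k + 1


dense = fc_params(4, 4, 16, 10)
conv = conv_params(4, 4, 16, 10)
# On the original 4x4 input the convolution yields a single value per class;
# on a larger 6x6 input it yields a 3x3 heatmap per class.
original = heatmap_size(4, 4)
larger = heatmap_size(6, 4)
```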

3) Deconvolution (conv transpose)

Convolution	Deconvolution

The output dimension of an FCN is usually reduced by subsampling → deconvolution is used to upsample it
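
The upsampling effect can be checked with the usual output-size formulas. The sketch assumes no dilation and no output padding; both function names are chosen for this example.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output size of a convolution along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


def deconv_output_size(n, kernel, stride=1, padding=0):
    """Output size of a transposed convolution (no dilation, no output padding)."""
    return (n - 1) * stride - 2 * padding + kernel


# A stride-2 convolution shrinks 8 -> 4 ...
down = conv_output_size(8, kernel=4, stride=2, padding=1)
# ... and a transposed convolution with the same settings restores 4 -> 8
up = deconv_output_size(down, kernel=4, stride=2, padding=1)
```

Only the spatial size is restored; the transposed convolution is not an exact inverse of the convolution's values.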

4) Detection

R-CNN

input image
Extract 2000 region proposals with selective search
Compute features for each proposal with AlexNet
Classify with linear SVMs

SPPNet: runs the CNN only once per image

Fast R-CNN

input image, bounding boxes
Generated convolutional feature map
For each region, get a fixed-length feature with ROI pooling
output: class, bounding-box regressor

Faster R-CNN: Region Proposal Network + Fast R-CNN

Region Proposal Network

Anchor boxes: detection boxes with predefined sizes
9: 3 region sizes x 3 ratios
4: bounding box regression parameters
2: box classification
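
Putting the three numbers above together gives the number of values the Region Proposal Network predicts at each location, 9 x (4 + 2). A tiny sketch; the function and parameter names are made up.

```python
def rpn_outputs_per_location(num_sizes=3, num_ratios=3, box_params=4, class_scores=2):
    """Values predicted at each feature-map location by the RPN."""
    num_anchors = num_sizes * num_ratios  # 9 anchor boxes
    return num_anchors * (box_params + class_scores)


per_location = rpn_outputs_per_location()
```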

YOLO

Predicts multiple bounding boxes and class probabilities simultaneously
The image is divided into an SxS grid
Each cell predicts B bounding boxes and C class probabilities
- Each bounding box predicts box refinement (x/y/w/h) and a confidence
Final result: a tensor of size SxSx(B*5+C)
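
As a sanity check on the output size, the sketch below plugs in S=7, B=2, C=20, the settings commonly quoted for the original YOLO on PASCAL VOC (an assumption here, since the notes do not give concrete values).

```python
def yolo_output_shape(s, b, c):
    """Shape of the YOLO output tensor: S x S x (B*5 + C)."""
    # Each box contributes 5 values: x, y, w, h and a confidence
    return (s, s, b * 5 + c)


shape = yolo_output_shape(s=7, b=2, c=20)
num_values = shape[0] * shape[1] * shape[2]
```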

4. Working with Datasets

1) Classifying dog breeds

Download the dog data: select Images at vision.stanford.edu/aditya86/ImageNetDogs/main.html

2) Building your own dataset

Downloading images from Google
- Install: pip3 install --upgrade git+https://github.com/Joeclinton1/google-images-download
- Usage: googleimagesdownload --keywords "keyword1, keyword2, ..." --limit NUM_IMAGES --format FORMAT --output_directory OUTPUT_DIR
To save 100 or more images at once
- Download the chromedriver that matches your Chrome version from chromedriver.chromium.org/ and unzip it
- On macOS, move the downloaded chromedriver to /usr/local/bin/ (mv chromedriver /usr/local/bin/)
- When downloading images, pass the chromedriver location with the -cd or --chromedriver option
  googleimagesdownload --keywords "keyword1, keyword2, ..." --limit NUM_IMAGES --format FORMAT --output_directory OUTPUT_DIR -cd /usr/local/bin/chromedriver

Peer session

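In today's peer session I presented on the different types of convolution. Until I researched it, I had no idea there were so many. Deconvolution, one of the types I had looked into, came up in today's lecture, which was a nice surprise.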

Comments

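Today I learned about CNNs. Up to last week we only covered Python basics and math, so it did not feel like it yet, but this week we are learning deep learning theory and running code ourselves, so it finally feels like I am really studying deep learning.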