Lecture review

What I learned

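Data is biased.
- Images are taken from particular compositions that look good to the photographer.
- Training data is whatever a camera captured, so it differs from real data.
- A model trained this way may fail to handle the difference between bright and dark images.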
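Apply small transformations to the data (data augmentation).
- Cropping, brightness changes, geometric transforms, rotation, etc.
- NumPy provides various methods for this.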
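A minimal sketch of the transformations listed above, written with plain NumPy on a random stand-in image. The helper names (`random_crop`, `adjust_brightness`, `rotate90`, `hflip`) are made up for illustration and are not from the lecture.

```python
import numpy as np

def random_crop(img, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return img[top:top + crop_h, left:left + crop_w]

def adjust_brightness(img, offset):
    """Add a constant to every pixel, clipping to the valid uint8 range."""
    out = img.astype(np.int16) + offset  # widen first to avoid uint8 overflow
    return np.clip(out, 0, 255).astype(np.uint8)

def rotate90(img, k=1):
    """Rotate by k * 90 degrees in the height/width plane."""
    return np.rot90(img, k=k, axes=(0, 1))

def hflip(img):
    """Mirror the image left to right."""
    return img[:, ::-1]

rng = np.random.default_rng(0)
# Random H x W x C image used only as a placeholder for real data.
image = rng.integers(0, 256, size=(32, 48, 3), dtype=np.uint8)

cropped = random_crop(image, 24, 24, rng)
brighter = adjust_brightness(image, 40)
rotated = rotate90(image)
flipped = hflip(image)
```

Each function returns a new view or array with the same dtype, so the calls can be chained in any order.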
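Finding the best augmentation
- Apply augmentations at random and search for the ones that give good performance.
  - RandAugment
- Two things to decide: which augmentations to apply, and how strongly to apply them.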
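The two choices above (which operations, how strong) can be sketched as a RandAugment-style policy with two knobs: `num_ops` and `magnitude`. This is a simplified illustration in NumPy, not the paper's implementation; the operation set and the magnitude scaling are assumptions chosen for brevity.

```python
import numpy as np

def identity(img, m):
    return img

def brightness(img, m):
    # Assumed scaling: magnitude 1.0 adds 100 gray levels.
    return np.clip(img.astype(np.int16) + int(100 * m), 0, 255).astype(np.uint8)

def contrast(img, m):
    # Stretch pixel values away from the image mean by a factor of (1 + m).
    mean = img.mean()
    out = (img.astype(np.float32) - mean) * (1.0 + m) + mean
    return np.clip(out, 0, 255).astype(np.uint8)

def translate_x(img, m):
    # Shift along the width axis; np.roll wraps pixels around the border.
    shift = int(m * img.shape[1] * 0.3)
    return np.roll(img, shift, axis=1)

OPS = [identity, brightness, contrast, translate_x]  # hypothetical op set

def rand_augment(img, num_ops, magnitude, rng):
    """Apply `num_ops` randomly chosen operations, all at the same magnitude."""
    for idx in rng.integers(0, len(OPS), size=num_ops):
        img = OPS[idx](img, magnitude)
    return img

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
augmented = rand_augment(image, num_ops=2, magnitude=0.5, rng=rng)
```

With only two parameters, the search reduces to trying a small grid of `(num_ops, magnitude)` pairs and keeping the pair with the best validation score.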

A lot of data is needed, and labels are needed too.
- Hard to collect in a short time.
- Labeling is done by people, so the results may not be what you wanted.
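Use previously learned knowledge to learn a related new task (transfer learning).
- Train on one dataset and reuse the result on another dataset.
- There is common knowledge in one dataset that carries over to another.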
Method 1
- Pretrain a model that outputs 10 dimensions.
- Freeze the convolution layers, replace the fully connected layer, and train a new model that outputs 100 dimensions.
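A minimal PyTorch sketch of Method 1. `SmallCNN` is a hypothetical stand-in for the pretrained network (here it is only randomly initialized), and the random batch replaces a real dataset.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Hypothetical stand-in for a pretrained model."""
    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x):
        return self.fc(self.features(x))

model = SmallCNN(num_classes=10)  # assume this was trained on the 10-class task

# Freeze the convolutional part so its weights receive no gradients.
for p in model.features.parameters():
    p.requires_grad = False

# Swap in a new head that outputs 100 dimensions.
model.fc = nn.Linear(model.fc.in_features, 100)

# Only the parameters that still require gradients go to the optimizer.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.01
)

# One training step on random placeholder data.
x = torch.randn(4, 3, 16, 16)
y = torch.randint(0, 100, (4,))
conv_before = model.features[0].weight.clone()
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

After the step, the convolution weights are unchanged and only the new `fc` layer has been updated.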
Method 2
- Pretraining is the same as in Method 1. Then fine-tune the whole model, giving the convolution layers a low learning rate and the fully connected layer a high learning rate.
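Method 2 can be sketched with optimizer parameter groups, which let each part of the model have its own learning rate. The tiny model and the specific rates (`1e-4` and `1e-2`) are illustrative assumptions, not values from the lecture.

```python
import torch
import torch.nn as nn

# Hypothetical pretrained backbone and a new 100-way head.
features = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
classifier = nn.Linear(8, 100)
model = nn.Sequential(features, classifier)

# One parameter group per part, each with its own learning rate.
optimizer = torch.optim.SGD(
    [
        {"params": features.parameters(), "lr": 1e-4},    # small updates
        {"params": classifier.parameters(), "lr": 1e-2},  # larger updates
    ],
    momentum=0.9,
)

# One training step on random placeholder data.
x = torch.randn(2, 3, 16, 16)
y = torch.randint(0, 100, (2,))
logits = model(x)
loss = nn.functional.cross_entropy(logits, y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Unlike Method 1, nothing is frozen here: the convolution layers still learn, just more slowly than the head.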

Only a small part of the whole dataset is labeled.
The goal is to use the unlabeled data purposefully.
Semi-supervised learning trains on both labeled and unlabeled data.
- Train on the labeled data.
- Use the trained model to turn the unlabeled data into pseudo-labeled data.
- Train again using both the pseudo-labeled data and the labeled data.
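The three steps above, sketched on a two-class toy problem in PyTorch. The data generator, the linear model, and the choice to retrain from scratch in step 3 are all assumptions made for this illustration; the notes do not say whether training restarts or continues.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_blobs(n):
    """Toy data: two Gaussian blobs centered at -2 and +2."""
    y = torch.randint(0, 2, (n,))
    x = torch.randn(n, 2) + (y.float().unsqueeze(1) * 4.0 - 2.0)
    return x, y

def train(model, x, y, epochs=100):
    """Plain full-batch gradient descent with cross-entropy loss."""
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.cross_entropy(model(x), y).backward()
        opt.step()
    return model

x_lab, y_lab = make_blobs(20)   # small labeled set
x_unlab, _ = make_blobs(200)    # larger set whose labels we pretend not to have

# Step 1: train on the labeled data only.
model = train(nn.Linear(2, 2), x_lab, y_lab)

# Step 2: predict pseudo-labels for the unlabeled data.
with torch.no_grad():
    pseudo = model(x_unlab).argmax(dim=1)

# Step 3: train again on labeled + pseudo-labeled data together.
x_all = torch.cat([x_lab, x_unlab])
y_all = torch.cat([y_lab, pseudo])
model = train(nn.Linear(2, 2), x_all, y_all)
```

In practice one might keep only high-confidence pseudo-labels; that filtering is left out here to mirror the three steps as written.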

Yun et al., CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features, ICCV 2019
Cubuk et al., Randaugment: Practical automated data augmentation with a reduced search space, CVPRW 2020
Ahmed et al., Fusion of local and global features for effective image extraction, Applied Intelligence 2017
Oquab et al., Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks, CVPR 2015
Hinton et al., Distilling the Knowledge in a Neural Network, NIPS deep learning workshop 2015
Li & Hoiem, Learning without Forgetting, TPAMI 2018
Lee, Pseudo-label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks, ICML Workshop 2013
Xie et al., Self-training with Noisy Student improves ImageNet classification, CVPR 2020

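Difference between size and shape
- In NumPy, size is the flattened count, i.e. the total number of elements.
- In torch the two are the same.
  - size() and shape
  - shape seems to be an alias that points to size()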
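A quick check of the difference, using an array and a tensor of the same dimensions:

```python
import numpy as np
import torch

a = np.zeros((2, 3, 4))
t = torch.zeros(2, 3, 4)

# NumPy: shape is the tuple of dimensions, size is the total element count.
print(a.shape)    # (2, 3, 4)
print(a.size)     # 24

# torch: size() and shape return the same torch.Size object contents.
print(t.shape)    # torch.Size([2, 3, 4])
print(t.size())   # torch.Size([2, 3, 4])
print(t.size(1))  # 3, the length of one dimension

# The torch counterpart of NumPy's size is numel().
print(t.numel())  # 24
```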
PyTorch has hooks.
- They are registered on the model (`nn.Module`).
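As a small example of a hook, the sketch below registers a forward hook on one layer to capture its output. The model and the `save_output` function are made up for illustration.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
activations = {}

def save_output(module, inputs, output):
    """Forward hook: called after the module's forward pass."""
    activations["relu"] = output.detach()

# Attach the hook to the ReLU layer; the handle lets us remove it later.
handle = model[1].register_forward_hook(save_output)

out = model(torch.randn(3, 4))  # the hook fires during this call
handle.remove()                 # detach the hook when it is no longer needed
```

After the forward pass, `activations["relu"]` holds the intermediate output of the ReLU layer without any change to the model's `forward` code.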
Assignment walkthrough at 6:30
- Assignment work from 3:00 to 5:15
accuracy
- The instructor referred to the MLP example for this.
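The notes do not record how accuracy was computed in the MLP example. One common way for classification, shown here as an assumption, is to compare the argmax of the logits with the targets:

```python
import torch

def accuracy(logits, targets):
    """Fraction of samples whose highest-scoring class matches the target."""
    preds = logits.argmax(dim=1)
    return (preds == targets).float().mean().item()

# Hand-made example: predictions are [0, 1, 0, 1], targets are [0, 1, 1, 1].
logits = torch.tensor([[2.0, 0.1], [0.2, 1.5], [0.9, 0.3], [0.1, 0.4]])
targets = torch.tensor([0, 1, 1, 1])
acc = accuracy(logits, targets)
print(acc)  # 0.75
```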