Lecture review notes
multimodal
visual data and text
joint embedding
image tagging
metric learning
cross modal translation
image captioning
Show and Tell
Show, Attend and Tell
text to image by generative model
cross modal reasoning
visual question answering
visual data and audio
audio processing
spectrogram
Fourier transform
STFT (short-time Fourier transform)
mel spectrogram
MFC
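To make the spectrogram and STFT entries above concrete, here is a minimal sketch in pure Python. It assumes a Hann window and a naive DFT per frame. The function name `stft_magnitude` and the toy 8 Hz tone are illustrative choices, not something taken from the lecture. Real code would normally use an FFT library instead.

```python
import cmath
import math


def stft_magnitude(signal, frame_size, hop_size):
    """Return one magnitude spectrum (bins 0..frame_size//2) per frame."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop_size):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        spectrum = []
        for k in range(frame_size // 2 + 1):
            # Naive DFT of the windowed frame (O(N^2), fine for a toy example).
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Toy input (assumption): an 8 Hz sine sampled at 64 Hz.
sample_rate = 64
tone = [math.sin(2 * math.pi * 8 * t / sample_rate) for t in range(128)]
spec = stft_magnitude(tone, frame_size=32, hop_size=16)

# Each bin spans sample_rate / frame_size = 2 Hz, so the tone falls in bin 4.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Stacking the per-frame spectra over time gives the spectrogram. A mel spectrogram would additionally pool these bins with a mel-scaled filter bank, which is omitted here.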
joint embedding
SoundNet
cross modal translation
Speech2Face
image-to-speech
sound source localization
Looking to Listen at the Cocktail Party
lip movements generation
3D
data formats
multi-view images
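volumetric (treats 3D space as a grid of voxels, analogous to pixels)
part assembly (a collection of small parts)
point cloud (a set of 3D points)
mesh (used with graph CNNs; triangles defined by vertices and edges)
implicit shape (expressed as a high-dimensional function)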
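As a small illustration of how two of the formats above relate, the following sketch converts a point cloud into a volumetric occupancy grid. It assumes all coordinates are already normalized to the range [0, 1). The function name `voxelize` and the sample points are hypothetical.

```python
def voxelize(points, grid_size):
    """Map points in [0, 1)^3 to a grid_size x grid_size x grid_size occupancy grid."""
    grid = [[[0] * grid_size for _ in range(grid_size)]
            for _ in range(grid_size)]
    for x, y, z in points:
        # Clamp so that a coordinate of exactly 1.0 still lands in the last cell.
        i = min(int(x * grid_size), grid_size - 1)
        j = min(int(y * grid_size), grid_size - 1)
        k = min(int(z * grid_size), grid_size - 1)
        grid[i][j][k] = 1  # mark the voxel as occupied
    return grid


# Toy point cloud (assumption): two nearby points and one distant point.
cloud = [(0.1, 0.1, 0.1), (0.12, 0.11, 0.13), (0.9, 0.5, 0.2)]
grid = voxelize(cloud, grid_size=4)
occupied = sum(v for plane in grid for row in plane for v in row)
```

The two nearby points fall into the same voxel, which shows the resolution trade-off of the volumetric format: memory grows cubically with `grid_size`, while fine detail is lost at coarse resolutions.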
datasets
ShapeNet
PartNet
SceneNet
ScanNet
outdoor 3D scene datasets
3D tasks
3D recognition
3D semantic segmentation
conditional 3D generation
Mesh R-CNN
Lecture review notes
object detection
R-CNN
Fast R-CNN
Faster R-CNN
YOLO
SSD
focal loss
RetinaNet
DETR
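The focal loss entry above (used by RetinaNet) can be sketched for a single binary prediction as follows. This is a minimal version of the commonly cited form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The defaults gamma=2.0 and alpha=0.25 are the values usually quoted for it and are assumptions here, not settings taken from the notes.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one binary prediction.

    p: predicted probability of the positive class, strictly between 0 and 1.
    y: ground-truth label, 1 (positive) or 0 (negative).
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t)^gamma down-weights examples that are already classified well.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)  # confident and correct: small loss
hard = focal_loss(0.1, 1)  # confident and wrong: much larger loss
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a convenient sanity check.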
CNN visualization
dimensionality reduction
analysis of model behaviors
Nearest neighbors
t-SNE
layer activation
maximally activating patches
class visualization
gradient ascent
model decision explanation
saliency test
via backpropagation
rectified unit
guided backpropagation
CAM (class activation mapping)
global average pooling (GAP)
Grad-CAM
Autograd
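For the CAM and GAP entries above, a minimal sketch of the computation: the class activation map is the sum of the last convolutional feature maps, each weighted by the classifier weight for the target class. The nested-list tensors, the function names, and the toy values are illustrative assumptions.

```python
def class_activation_map(feature_maps, class_weights):
    """Sum the feature maps (channels x H x W), weighted per channel."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for r in range(height):
            for c in range(width):
                cam[r][c] += weight * fmap[r][c]
    return cam


def gap(fmap):
    """Global average pooling of one H x W feature map."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


# Toy example (assumption): two 2x2 feature maps and their class weights.
features = [[[1.0, 0.0], [0.0, 0.0]],
            [[0.0, 0.0], [0.0, 2.0]]]
weights = [1.0, 0.5]

cam = class_activation_map(features, weights)
# Class score computed the usual way: GAP each channel, then a linear layer.
score = sum(w * gap(f) for w, f in zip(weights, features))
```

Because GAP and the linear layer are both linear, the class score equals the spatial mean of the CAM. That is why the map can be read as a per-location contribution to the score.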
Lecture review notes
image classification
annotation data efficient learning
data transformation
augmentation
pre-trained information
learn efficiently by reusing what was learned in advance
transfer learning
reuses features already learned by the convolutional layers
knowledge distillation
trains another (student) model using a teacher model
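One common way to realize the teacher-student idea is to match the student's softened output distribution to the teacher's. The sketch below assumes a temperature-scaled softmax and a KL-divergence loss. The temperature of 2.0 and the toy logits are arbitrary illustrative values.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between the softened distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0)


teacher = [4.0, 1.0, 0.5]
matched = distillation_loss([4.0, 1.0, 0.5], teacher)     # student agrees
mismatched = distillation_loss([0.5, 1.0, 4.0], teacher)  # student disagrees
```

In practice this term is usually combined with the ordinary cross-entropy on the ground-truth labels when labels are available.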
learning from unlabeled data
semi-supervised
self training
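The pseudo-labeling step at the core of self-training can be sketched as follows: a model trained on labeled data predicts on unlabeled samples, and only confident predictions are kept as new labels. The function name `select_pseudo_labels`, the 0.9 threshold, and the toy probabilities are assumptions made for illustration.

```python
def select_pseudo_labels(predictions, threshold=0.9):
    """Return (sample_index, predicted_class) for confident predictions only.

    predictions: one list of class probabilities per unlabeled sample.
    """
    selected = []
    for index, probs in enumerate(predictions):
        confidence = max(probs)
        if confidence >= threshold:
            selected.append((index, probs.index(confidence)))
    return selected


# Toy model outputs (assumption) for three unlabeled samples.
unlabeled_predictions = [[0.95, 0.05], [0.60, 0.40], [0.08, 0.92]]
pseudo_labels = select_pseudo_labels(unlabeled_predictions)
```

The selected samples would then be merged into the training set and the model retrained, and the cycle can be repeated. The threshold trades the amount of extra data against the risk of reinforcing wrong labels.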