ML/DL Experiments and Analysis

A. Model & Module

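Focal loss: adds a modulating factor \( (1 - p_t)^\gamma \) to cross-entropy (CE) so that well-classified (easy) samples contribute a smaller loss
- For a sample with label 1, the loss is \( -(1 - p_t)^\gamma \log(p_t) \); as \( p_t \) approaches 1, the factor with exponent \( \gamma \) drives the loss toward zero, so the loss of easy samples drops much more sharply than under CE
- When \( \gamma = 0 \), it is identical to CE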
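
As a small worked example of the focal loss properties, the sketch below evaluates the loss for single predictions using only the standard library. The function names `focal_loss` and `cross_entropy` are illustrative, and the alpha-balancing term used in some implementations is deliberately left out.

```python
import math

def focal_loss(p: float, label: int, gamma: float = 2.0) -> float:
    """Binary focal loss for one prediction (no alpha balancing).

    p is the predicted probability of the positive class;
    p_t is the probability assigned to the true class.
    """
    p_t = p if label == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def cross_entropy(p: float, label: int) -> float:
    """Plain binary cross-entropy for one prediction."""
    p_t = p if label == 1 else 1.0 - p
    return -math.log(p_t)

easy, hard = 0.9, 0.1  # predicted probabilities for two label-1 samples
for gamma in (0.0, 2.0):
    print(gamma, focal_loss(easy, 1, gamma), focal_loss(hard, 1, gamma))
```

With \( \gamma = 2 \), the easy sample (\( p_t = 0.9 \)) keeps only about 1% of its CE loss, while the hard sample (\( p_t = 0.1 \)) keeps about 81%; with \( \gamma = 0 \) both values equal CE.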

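CIoU loss: simultaneously accounts for three metrics, namely the overlap area (IoU), the distance between center points, and the aspect ratio (an extension of DIoU)
- \( \mathcal{L}_{CIoU} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v \)
- \( \rho \) is the Euclidean distance between the centers of the predicted box \( b \) and the ground-truth box \( b^{gt} \), \( c \) is the diagonal length of the smallest box enclosing both, \( v \) measures aspect-ratio consistency, and \( \alpha \) is its trade-off weight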
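
The CIoU terms can be sketched for a single pair of boxes as follows. This is a minimal illustration, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` with positive width and height; the function name `ciou_loss` and the `eps` stabilizer are illustrative choices. It uses the standard definitions \( v = \frac{4}{\pi^2} (\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h})^2 \) and \( \alpha = \frac{v}{(1 - IoU) + v} \).

```python
import math

def ciou_loss(box, gt, eps=1e-9):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Overlap term: IoU
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Center-distance term: rho^2 / c^2
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    enclose_w = max(x2, gx2) - min(x1, gx1)
    enclose_h = max(y2, gy2) - min(y1, gy1)
    c2 = enclose_w ** 2 + enclose_h ** 2 + eps

    # Aspect-ratio term: alpha * v
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v

print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes: loss close to 0
print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes: loss above 1
```

Unlike a plain IoU loss, the disjoint pair still yields a distance-dependent value, so the loss stays informative when the boxes do not overlap.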

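CutMix (ICCV 2019): instead of letting the model focus only on the most discriminative parts of an object, it trains the model to also use less discriminative parts and the overall image region, which improves generalization and localization performance.
- Also shows good robustness on OOD (out-of-distribution) samples, occluded samples, and adversarial samples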
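
For reference, the CutMix operation itself (pasting a random rectangle from one image onto another and mixing the labels by area) can be sketched in PyTorch as below. This is a minimal sketch, not the official implementation: the function name `cutmix`, the default `alpha=1.0`, and the sampling details are assumptions made for illustration.

```python
import math
import torch

def cutmix(images, labels, alpha=1.0):
    """Apply CutMix to a batch.

    images: (B, C, H, W) float tensor, labels: (B,) class indices.
    Returns the mixed images, both label sets, and the mixing weight lam.
    """
    batch, _, height, width = images.shape
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(batch)  # partner image for each sample

    # Box whose area is roughly (1 - lam) of the image, at a random center
    cut_ratio = math.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    cy = torch.randint(height, (1,)).item()
    cx = torch.randint(width, (1,)).item()
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)

    mixed = images.clone()
    mixed[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]

    # Recompute lam from the clipped box so it matches the pasted area
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (height * width)
    return mixed, labels, labels[perm], lam

x = torch.rand(4, 3, 32, 32)
y = torch.tensor([0, 1, 2, 3])
mixed, y_a, y_b, lam = cutmix(x, y)
print(mixed.shape, lam)
```

During training, the loss would then typically be combined as `lam * criterion(out, y_a) + (1 - lam) * criterion(out, y_b)`, where `criterion` is whichever classification loss is in use.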

PyTorch training optimization: here
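Caveat when slicing a PyTorch output for loss computation (the sliced tensors take part in the gradient)
- If the shape of output is (4 (B), 2 (C), 320 (H), 320 (W))
- Wrong → out1 = output[:, 0, :, :], out2 = output[:, 1, :, :] (integer indexing drops the channel dimension, giving shape (4, 320, 320))
- Correct → out1 = output[:, :1, :, :], out2 = output[:, 1:, :, :] (slicing keeps the channel dimension, giving shape (4, 1, 320, 320))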
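
The shape difference can be checked directly. The sketch below assumes a hypothetical target mask of shape (B, 1, H, W); the notes do not state the target shape, so this is one plausible scenario in which the integer-indexed version silently goes wrong through broadcasting.

```python
import torch

output = torch.randn(4, 2, 320, 320, requires_grad=True)  # (B, C, H, W)
target = torch.rand(4, 1, 320, 320)  # hypothetical mask with a channel axis

by_index = output[:, 0, :, :]   # integer index drops the channel axis
by_slice = output[:, :1, :, :]  # slice keeps the channel axis

print(by_index.shape)  # torch.Size([4, 320, 320])
print(by_slice.shape)  # torch.Size([4, 1, 320, 320])

# (4, 320, 320) against (4, 1, 320, 320) broadcasts to (4, 4, 320, 320):
# every prediction is compared with every mask in the batch, without an error.
diff_wrong = by_index - target
print(diff_wrong.shape)  # torch.Size([4, 4, 320, 320])

# With the slice, shapes match element by element.
diff_right = by_slice - target
print(diff_right.shape)  # torch.Size([4, 1, 320, 320])
```

Because the broadcast version raises no exception, the loss value and its gradient are simply wrong, which makes this mistake easy to miss.
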
In SOD (salient object detection), the low-level features of the encoder (e.g., ResNet, EfficientNet) carry too much detail, whereas the high-level features produce coarse results (Reference: Pyramid Feature Attention Network for Saliency Detection)
- Low-level features contain many details, and since SOD aims to find boundaries, spatial attention is applied to them
- High-level features contain mostly coarse regions, so channel attention is used to weight the channels with high responses (helpful for locating the salient object)
- TRACER (AAAI 2022) is a paper that aggregates low-level and high-level features and then applies channel attention and spatial attention
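
To make the two attention types concrete, here is a generic PyTorch sketch: an SE-style channel attention applied to a high-level feature map and a CBAM-style spatial attention applied to a low-level one. These are stand-ins chosen for illustration, not the exact modules from the Pyramid Feature Attention Network or TRACER papers; the class names, the `reduction` ratio, the kernel size, and the feature shapes are all assumptions.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Reweights channels using globally pooled statistics (SE-style)."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        weights = self.fc(x.mean(dim=(2, 3)))  # (B, C), one weight per channel
        return x * weights[:, :, None, None]

class SpatialAttention(nn.Module):
    """Reweights spatial positions using channel-pooled maps (CBAM-style)."""

    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        pooled = torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1
        )  # (B, 2, H, W)
        return x * torch.sigmoid(self.conv(pooled))  # one weight per position

low = torch.randn(2, 64, 80, 80)    # hypothetical low-level encoder feature
high = torch.randn(2, 256, 20, 20)  # hypothetical high-level encoder feature

low_out = SpatialAttention()(low)       # emphasizes positions such as boundaries
high_out = ChannelAttention(256)(high)  # emphasizes high-response channels
print(low_out.shape, high_out.shape)
```

Both modules preserve the input shape, so they can be dropped between an encoder stage and a decoder without changing the surrounding tensor sizes.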