GradCAM

2022. 3. 7. 15:35·AI paper review/Explainable AI

1. What is the goal of GradCAM?

The goal of GradCAM is to produce a coarse localization map highlighting the important regions in the image for predicting the concept (class).

Fig 1. Explanation of GradCAM

GradCAM uses the gradients of any target concept (such as "cat") flowing into the final convolutional layer.

Note: I (da2so) will only deal with the problem of image classification in the following contents.

Fig 2. The overall procedure of GradCAM

 

Property of the feature map \( A^k \) from the last convolutional layer: we expect the last convolutional layer to offer the best compromise between high-level semantics and detailed spatial information.

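Obtaining the neuron importance weights: \( w^{c}_k=\frac{1}{z}\sum_i\sum_j\frac{\partial y^c}{\partial A^{k}_V} \), where \( y^c \) is the score for class \(c\), \( V = (i, j) \) indexes the spatial positions of the feature map, and \( z \) is the number of such positions.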

This weight represents a partial linearization of the deep network downstream from \( A \), and captures the 'importance' of feature map \(k\) for a target class \(c\).
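As a rough illustration of the weight formula, the sketch below averages a gradient array over its spatial positions. It assumes the gradients \( \partial y^c / \partial A^k \) are already available as a NumPy array of shape (K, H, W); the function name `importance_weights` and the toy numbers are made up for this example.

```python
import numpy as np

def importance_weights(grads: np.ndarray) -> np.ndarray:
    """Average d y^c / d A^k over the spatial positions (i, j).

    grads has shape (K, H, W): one gradient map per feature map k.
    Returns shape (K,): one weight w^c_k per feature map.
    """
    k, h, w = grads.shape
    z = h * w  # number of spatial positions, the normaliser z
    return grads.reshape(k, z).sum(axis=1) / z

# Toy gradients for K=2 feature maps of size 2x2 (made-up numbers).
grads = np.array([
    [[1.0, 3.0], [5.0, 7.0]],    # spatial mean is 4.0
    [[-2.0, 0.0], [0.0, -2.0]],  # spatial mean is -1.0
])
weights = importance_weights(grads)
```

Here the first feature map gets a positive weight and the second a negative one, so the second map would count against the target class in the weighted sum.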

Then, we perform a weighted combination of forward activation maps and follow it with a ReLU.
\[\text{Grad-CAM} \quad L^c=\text{ReLU}\left(\sum_k w^c_k A^k\right) \quad \quad \cdots \text{Eq.}(1)\]

The ReLU is applied because we are only interested in features that have a positive influence on the target class \(c\); negative values are likely to belong to other classes in the image.
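Eq. (1) can be sketched in a few lines of NumPy. This is a minimal example that assumes the weights \( w^c_k \) (shape (K,)) and the feature maps \( A^k \) (shape (K, H, W)) are given as arrays; the name `grad_cam_map` and the numbers are hypothetical.

```python
import numpy as np

def grad_cam_map(weights: np.ndarray, activations: np.ndarray) -> np.ndarray:
    """Compute ReLU(sum_k w_k * A^k).

    weights: shape (K,), activations: shape (K, H, W). Returns shape (H, W).
    """
    # Contract the K axis of both arrays: the weighted sum over feature maps.
    combined = np.tensordot(weights, activations, axes=1)
    # ReLU keeps only the positions with a positive contribution.
    return np.maximum(combined, 0.0)

weights = np.array([1.0, -2.0])
activations = np.array([
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.0, 0.0], [2.0, 1.0]],
])
cam = grad_cam_map(weights, activations)
# Before the ReLU the sum is [[-1, 2], [-1, 2]]; the negative entries are clipped to 0.
```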

 

In summary, the GradCAM procedure is as follows.


  • Input
    • Image: \(x\)
    • Pre-trained model: \(f\)
      • Feature extractor (CNN): \(f_e\)
      • Classification layer (fc layer): \(f_l\)
    • Category (target class): \(c\)

  1. \(A \leftarrow f_e(x)\)
  2. \(y^c \leftarrow f_l(A)\)
  3. \(w^c_k \leftarrow \frac{1}{z}\sum_i\sum_j\frac{\partial y^c}{\partial A^{k}_V}\)
  4. \(L^c \leftarrow \text{ReLU}(\sum_k w^c_k A^k)\)

  • Output
    • Grad-CAM: \(L^c\)
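The four steps above can be sketched end to end on a toy model. Everything here is an assumption made for illustration: `f_e` is a hypothetical two-filter feature extractor, `f_l` is a hypothetical classifier (global average pooling followed by a linear layer with made-up weights `FC`), and the gradient is estimated with central differences in `numerical_grad` instead of backpropagation, only to keep the example free of a deep learning framework. A real implementation would obtain the gradients from the framework's automatic differentiation.

```python
import numpy as np

def f_e(x):
    """Hypothetical feature extractor: two 2x2 filters followed by ReLU."""
    kernels = np.array([
        [[1.0, 0.0], [0.0, 1.0]],  # responds to the main diagonal
        [[0.0, 1.0], [1.0, 0.0]],  # responds to the anti-diagonal
    ])
    h, w = x.shape[0] - 1, x.shape[1] - 1
    maps = np.zeros((len(kernels), h, w))
    for k, kernel in enumerate(kernels):
        for i in range(h):
            for j in range(w):
                maps[k, i, j] = np.sum(x[i:i + 2, j:j + 2] * kernel)
    return np.maximum(maps, 0.0)

# Hypothetical fc weights, shape (num_classes, K).
FC = np.array([[1.0, -1.0], [-1.0, 1.0]])

def f_l(a):
    """Hypothetical classifier: global average pooling, then a linear layer."""
    return FC @ a.mean(axis=(1, 2))

def numerical_grad(a, c, eps=1e-5):
    """Central-difference estimate of d y^c / d A (stand-in for backprop)."""
    grad = np.zeros_like(a)
    for idx in np.ndindex(*a.shape):
        up, down = a.copy(), a.copy()
        up[idx] += eps
        down[idx] -= eps
        grad[idx] = (f_l(up)[c] - f_l(down)[c]) / (2 * eps)
    return grad

def grad_cam(x, c):
    a = f_e(x)                        # step 1: A <- f_e(x)
    y_c = f_l(a)[c]                   # step 2: y^c <- f_l(A)
    grads = numerical_grad(a, c)
    w = grads.mean(axis=(1, 2))       # step 3: average gradients over (i, j)
    cam = np.maximum(np.tensordot(w, a, axes=1), 0.0)  # step 4: ReLU(sum_k w_k A^k)
    return cam, y_c, w

# Toy 3x3 "image" with a bright main diagonal.
x = np.eye(3)
cam, y_c, w = grad_cam(x, c=0)
```

For class 0 the diagonal filter receives a positive weight and the anti-diagonal filter a negative one, so the resulting map is non-zero only where the diagonal pattern appears.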

 

Evaluating Trust

Given two prediction explanations, the authors evaluate which seems more trustworthy, comparing Guided Backpropagation and Guided Grad-CAM visualizations. For the experiments, they use AlexNet and VGG-16, noting that VGG-16 is more accurate than AlexNet, with 79.09 mAP (vs. 69.20 mAP) on PASCAL classification. Trust scores are obtained from 54 human raters. With Guided Backpropagation, humans assign VGG-16 an average score of 1.00, meaning they consider it more trustworthy than AlexNet. With Guided Grad-CAM the score rises to 1.27, meaning the raters judge VGG-16 to be more reliable by a wider margin.

Reference

Selvaraju, Ramprasaath R., et al. "Grad-CAM: Visual explanations from deep networks via gradient-based localization." Proceedings of the IEEE International Conference on Computer Vision. 2017.

Github Code: Grad-CAM

Sin-Han Kang