1. Interpreting hidden representations
1.1 Invertible transformation of hidden representations

- Input image: $x$
- Sub-network of $f$ including the hidden layers: $E$
- Latent (original) representation: $z = E(x)$
- Sub-network after the hidden layer: $G$, so that $f(x) = G(E(x)) = G(z)$

In *A Disentangling Invertible Interpretation Network for Explaining Latent Representations*, the goal is to translate an original representation $z$ into an equivalent but interpretable representation $\tilde{z} = T(z)$. In order to leave the pretrained model untouched, $T$ has to be invertible, so that $z = T^{-1}(\tilde{z})$ and $f(x) = G(T^{-1}(T(E(x)))) = G(z)$.
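As a minimal sketch of why invertibility matters (a toy example, not the paper's architecture): below, a single affine coupling block stands in for $T$, and randomly initialized linear layers stand in for the pretrained sub-networks $E$ and $G$. Because the block is exactly invertible, inserting $T^{-1} \circ T$ between $E$ and $G$ leaves the model output unchanged.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """A toy invertible block: rescales and shifts the second half of z conditioned on the first half."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, dim), nn.Tanh())

    def forward(self, z):                 # z -> z_tilde
        z1, z2 = z.chunk(2, dim=1)
        log_s, t = self.net(z1).chunk(2, dim=1)
        return torch.cat([z1, z2 * log_s.exp() + t], dim=1)

    def inverse(self, z_tilde):           # z_tilde -> z (exact inverse)
        z1, y2 = z_tilde.chunk(2, dim=1)
        log_s, t = self.net(z1).chunk(2, dim=1)
        return torch.cat([z1, (y2 - t) * (-log_s).exp()], dim=1)

# Stand-ins for the fixed, pretrained sub-networks of f = G ∘ E.
E = nn.Linear(784, 64)   # x -> z (hidden representation)
G = nn.Linear(64, 10)    # z -> f(x)
T = AffineCoupling(64)   # invertible interpretation network

x = torch.randn(8, 784)
z = E(x)
z_tilde = T(z)                # interpretable representation
z_back = T.inverse(z_tilde)   # back to the original representation
assert torch.allclose(G(z_back), G(z), atol=1e-5)   # the model output is unchanged
```

In the paper, $T$ is a deeper invertible network, but the round-trip check in the last line is exactly the property that keeps the interpretation non-invasive.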

So, what can we obtain from interpretable representations?
(i) We can understand what a latent variable encodes by reading off the value of its interpretable representation, and we can obtain more natural images from a GAN by working with the interpretable representation.

(ii) Semantic image modifications and embeddings. By manipulating the semantic factor of the interpretable representation that encodes the digit class, we can, for example, turn a '9' image into a '3' image (see the sketch below).
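To make the digit example concrete, here is a small hypothetical sketch: a random orthogonal matrix stands in for the trained $T$ (orthogonal maps are trivially invertible), the first few dimensions of $\tilde{z}$ stand in for the digit factor, and the hidden codes are placeholders; in practice they would come from $E$, and the edited code would be decoded by the pretrained generator/decoder.

```python
import torch

torch.manual_seed(0)
dim, n_digit = 64, 10                           # n_digit: dims assumed to form the "digit" factor
Q = torch.linalg.qr(torch.randn(dim, dim)).Q    # toy invertible T: z_tilde = z @ Q, inverse via Q.T

z_nine  = torch.randn(1, dim)                   # placeholder hidden code of a '9' image
z_three = torch.randn(1, dim)                   # placeholder hidden code of a '3' image

zt_nine, zt_three = z_nine @ Q, z_three @ Q     # interpretable representations
zt_nine[:, :n_digit] = zt_three[:, :n_digit]    # overwrite only the digit factor of the '9'
z_edit = zt_nine @ Q.T                          # invert T to return to the original latent space
# In the real model, decoding z_edit would now show a '3' while the remaining factors
# of the original '9' (e.g., stroke style) are preserved.
```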

1.2 Disentangling interpretable concepts
What properties should the interpretable representation $\tilde{z} = (\tilde{z}_k)_{k=0,\dots,K}$ have?
A. Each factor $\tilde{z}_k$ must represent a specific interpretable concept (e.g., color, class, ...).
B. The factors must be independent of each other.
C. The distribution of each factor must be known, so that new values can be sampled for it.
D. Interpolations between two samples of a factor must be valid samples, to analyze changes along a path.
Choosing a factorized standard normal distribution,
$$q(\tilde{z}) = \prod_{k=0}^{K} \mathcal{N}(\tilde{z}_k \mid 0, \mathbf{1}), \tag{1}$$
satisfies C. and D. (and the factorization across factors gives B.).
To also satisfy A., i.e. to attach a specific semantic concept to each factor, additional constraints have to be supplied on top of Eq. (1).
Let there be training image pairs $(x^a, x^b)$ that share the value of one semantic concept but may differ in everything else.
Each semantic concept $F \in \{1, \dots, K\}$ corresponds to one factor $\tilde{z}_F$.
However, we cannot expect to have examples of image pairs for every semantic concept relevant in the data, so a residual factor $\tilde{z}_0$ captures the remaining variability.
For a given training pair $(x^a, x^b)$ sharing concept $F$, the shared factor should be strongly (positively) correlated across the pair, while all other factors of $x^b$ remain independent of $x^a$:
$$q(\tilde{z}^b_F \mid \tilde{z}^a) = \mathcal{N}(\tilde{z}^b_F \mid \sigma_{ab}\, \tilde{z}^a_F,\; (1-\sigma_{ab}^2)\, \mathbf{1}), \tag{2}$$
$$q(\tilde{z}^b_k \mid \tilde{z}^a) = \mathcal{N}(\tilde{z}^b_k \mid 0, \mathbf{1}) \quad \text{for } k \neq F, \tag{3}$$
with $\sigma_{ab}$ close to 1.

To fit this model to data, we utilize the invertibility of $T$: the likelihood in the original latent space is computed via the change-of-variables formula, using the absolute value of the Jacobian determinant of $T$:
$$p(z^a) = q(T(z^a)) \left|\det \tfrac{\partial T}{\partial z}(z^a)\right|, \tag{4}$$
$$p(z^b \mid z^a) = q(T(z^b) \mid T(z^a)) \left|\det \tfrac{\partial T}{\partial z}(z^b)\right|. \tag{5}$$

We build $T$ as an invertible neural network composed of invertible blocks (e.g., affine coupling layers), so that both $T^{-1}$ and the log-determinant of the Jacobian can be computed efficiently.
For training, we use the negative log-likelihood as our loss function.
Substituting Eq. (1) into Eq. (4), and Eqs. (2) and (3) into Eq. (5), leads to the per-example loss (up to additive constants)
$$\ell(z^a, z^b) = \sum_{k=0}^{K} \frac{\|\tilde{z}^a_k\|^2}{2} + \frac{\|\tilde{z}^b_F - \sigma_{ab}\, \tilde{z}^a_F\|^2}{2(1-\sigma_{ab}^2)} + \sum_{k \neq F} \frac{\|\tilde{z}^b_k\|^2}{2} - \log\left|\det \tfrac{\partial T}{\partial z}(z^a)\right| - \log\left|\det \tfrac{\partial T}{\partial z}(z^b)\right|,$$
where $\tilde{z}^a = T(z^a)$ and $\tilde{z}^b = T(z^b)$.
The loss is optimized over training pairs $(x^a, x^b)$ of the given semantic concepts, with $z^a = E(x^a)$ and $z^b = E(x^b)$; only $T$ is trained, while $E$ and $G$ stay fixed (see the sketch below).
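The following PyTorch sketch computes this per-pair loss. It is my own illustration, not the authors' implementation: the interpretable representations are passed in as lists of per-factor tensors, the log-determinants are assumed to be provided by the invertible network, and $\sigma_{ab} = 0.9$ is an assumed value for the correlation strength.

```python
import torch

def iin_pair_loss(zt_a, zt_b, logdet_a, logdet_b, F, sigma_ab=0.9):
    """Per-pair negative log-likelihood, up to additive constants.

    zt_a, zt_b:         interpretable representations T(z^a), T(z^b), given as lists of
                        per-factor tensors of shape (batch, N_k), k = 0, ..., K
    logdet_a, logdet_b: log|det dT/dz| evaluated at z^a and z^b, shape (batch,)
    F:                  index of the factor shared by the pair
    sigma_ab:           correlation strength of the shared factor (close to 1; assumed value)
    """
    # Prior term for z~^a (Eq. (1) substituted into Eq. (4)): all factors standard normal.
    nll = sum(0.5 * (zk ** 2).sum(dim=1) for zk in zt_a)
    # Conditional term for z~^b (Eqs. (2) and (3) substituted into Eq. (5)).
    for k, zk in enumerate(zt_b):
        if k == F:   # shared factor: Gaussian centered at sigma_ab * z~^a_F with small variance
            nll = nll + 0.5 * ((zk - sigma_ab * zt_a[F]) ** 2).sum(dim=1) / (1 - sigma_ab ** 2)
        else:        # all other factors of x^b stay standard normal
            nll = nll + 0.5 * (zk ** 2).sum(dim=1)
    # Change-of-variables terms from Eqs. (4) and (5).
    return (nll - logdet_a - logdet_b).mean()
```

A training step would evaluate this loss on $\tilde{z}^a = T(E(x^a))$ and $\tilde{z}^b = T(E(x^b))$ for a sampled concept $F$ and backpropagate through $T$ only.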
2. Obtaining semantic concepts
2.1 Estimating dimensionality of factors
Given image pairs $(x^a, x^b)$ for a semantic concept $F$, how many dimensions should the factor $\tilde{z}_F$ get?
Semantic concepts captured by the network lead to mutual information between the hidden codes $z^a = E(x^a)$ and $z^b = E(x^b)$.
So we approximate their mutual information with their correlation for each component, $\rho_i(F) = \operatorname{Corr}(z^a_i, z^b_i)$, computed over the pairs of concept $F$.
Summing over all components gives a score for concept $F$.
Since correlation lies in $[-1, 1]$, this score is bounded by the dimensionality of $z$ and indicates how many components carry information about $F$; the dimensionality of $\tilde{z}_F$ is chosen accordingly, with the residual factor $\tilde{z}_0$ receiving the remaining dimensions (see the sketch below).
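A sketch of the correlation score is below. The allocation rule at the end is a plausible reading on my part (read each score as a number of shared components, give the residual factor $\tilde{z}_0$ the rest), not necessarily the paper's exact formula.

```python
import torch

def concept_score(z_a, z_b):
    """Correlation score for one semantic concept.

    z_a, z_b: hidden codes E(x^a), E(x^b) for the image pairs of that concept, shape (num_pairs, N).
    Sums the per-component correlation between z^a_i and z^b_i over all components
    (clipping negative correlations to zero is an assumption on my part).
    """
    za = (z_a - z_a.mean(0)) / (z_a.std(0) + 1e-8)
    zb = (z_b - z_b.mean(0)) / (z_b.std(0) + 1e-8)
    return (za * zb).mean(0).clamp(min=0).sum()

def allocate_dims(scores, total_dim):
    """Hypothetical allocation rule: each concept gets roughly as many dimensions as its
    score suggests; the residual factor z~_0 receives whatever is left of total_dim."""
    dims = [max(1, int(round(float(s)))) for s in scores]
    dims.insert(0, max(1, total_dim - sum(dims)))   # residual factor z~_0 comes first
    return dims

# Example: two concepts measured on a 64-dimensional hidden code:
# allocate_dims([concept_score(za_1, zb_1), concept_score(za_2, zb_2)], total_dim=64)
```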

2.2 Sketch-based description of semantic concepts
**Problem:** Most often, a sufficiently large number of image pairs is not easy to obtain.
**Solution:** A user only has to provide two sketches, one before and one after a change of the concept; from these, image pairs exhibiting the sketched change can be obtained automatically and used as supervision for that concept.

2.3 Unsupervised interpretations
Even without examples of changes in semantic factors, the approach can still produce disentangled factors. In this case, we simply minimize the negative log-likelihood of the marginal distribution of the hidden representations, $p(z) = q(T(z)) \left|\det \tfrac{\partial T}{\partial z}(z)\right|$, i.e., $T$ is trained like a standard normalizing flow under the factorized Gaussian prior of Eq. (1) (see the sketch below).
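A minimal sketch of this objective, assuming $T(z)$ and the log-determinant are already computed by the invertible network:

```python
import torch

def unsupervised_nll(z_tilde, logdet):
    """Negative log-likelihood of z under p(z) = q(T(z)) |det dT/dz|, up to constants:
    0.5 * ||T(z)||^2 - log|det dT/dz|, with q a standard normal."""
    return (0.5 * (z_tilde ** 2).flatten(1).sum(dim=1) - logdet).mean()
```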

Reference
Esser, Patrick, Robin Rombach, and Bjorn Ommer. "A disentangling invertible interpretation network for explaining latent representations." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.