1. Introduction
Style transfer: Synthesizing an image with content similar to a given image and style similar to another.

1.1 Motivation
Existing style-transfer methods have two main weaknesses: (i) they generate only one stylization for a given content/style pair, and (ii) they are highly sensitive to hyper-parameters that must be fixed before training.
1.2 Goal
To solve these problems, the authors propose a novel mechanism that allows crucial hyper-parameters to be adjusted after training and in real time, through a set of manually adjustable parameters.
2. Background
2.1 Style transfer using deep networks
Style transfer can be formulated as generating a stylized image $\mathbf{p}$ whose content is similar to a given content image $\mathbf{c}$ and whose style is close to another given style image $\mathbf{s}$.
The similarity in style can be vaguely defined as sharing the same spatial statistics in low-level features of a network (VGG-16 in this paper), while similarity in content is roughly having a close Euclidean distance in high-level features.
The main idea is that the features extracted by the network contain information about the content of the input image, while the correlations between these features represent its style.
In order to increase the similarity between two images, we minimize the following distances between their extracted features:

$$\mathcal{L}_c^l(\mathbf{p}) = \big\|\phi^l(\mathbf{p}) - \phi^l(\mathbf{c})\big\|_2^2 \tag{1}$$

$$\mathcal{L}_s^l(\mathbf{p}) = \big\|G\big(\phi^l(\mathbf{p})\big) - G\big(\phi^l(\mathbf{s})\big)\big\|_F^2 \tag{2}$$

where

- $\phi^l(\mathbf{x})$: activation of the pre-trained network at layer $l$ given the input image $\mathbf{x}$.
- $\mathcal{L}_c^l$, $\mathcal{L}_s^l$: content and style loss at layer $l$, respectively.
- $G\big(\phi^l(\mathbf{x})\big)$: Gram matrix associated with $\phi^l(\mathbf{x})$, i.e., the matrix of correlations between feature channels, which captures the texture statistics of the image.
The total loss is calculated as a weighted sum of losses over a set of content layers $C$ and style layers $S$:

$$\mathcal{L}_c(\mathbf{p}) = \sum_{l \in C} w_c^l\, \mathcal{L}_c^l(\mathbf{p}), \qquad \mathcal{L}_s(\mathbf{p}) = \sum_{l \in S} w_s^l\, \mathcal{L}_s^l(\mathbf{p}) \tag{3}$$

where $w_c^l$ and $w_s^l$ are hyper-parameters that control the contribution of each layer to the loss.

Finally, the objective of style transfer can be defined as:

$$\min_{\mathbf{p}} \; \mathcal{L}_c(\mathbf{p}) + \mathcal{L}_s(\mathbf{p}) \tag{4}$$
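As a concrete illustration, here is a minimal PyTorch sketch of these losses (the function names and the per-element normalization are my own choices, not the authors' code):

```python
import torch

def gram_matrix(feat):
    # feat: activations of shape (B, C, H, W). The Gram matrix holds the
    # correlations between feature channels, i.e. the style statistics.
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)

def content_loss(feat_p, feat_c):
    # Squared Euclidean distance between activations (Eq. 1), averaged per element.
    return torch.mean((feat_p - feat_c) ** 2)

def style_loss(feat_p, feat_s):
    # Squared Frobenius distance between Gram matrices (Eq. 2), averaged per element.
    return torch.mean((gram_matrix(feat_p) - gram_matrix(feat_s)) ** 2)
```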
2.2 Real-time feed-forward style transfer
We can solve the objective in Eq. (4) with an iterative optimization method that updates the pixels of $\mathbf{p}$ directly, but this is very slow and has to be repeated from scratch for every new content/style pair.
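For concreteness, a rough sketch of this iterative approach, reusing the loss helpers above and assuming a hypothetical `extract_features` function that returns a dictionary of VGG-16 activations keyed by layer:

```python
import torch

def stylize_iterative(content_img, style_img, extract_features, w_c, w_s, steps=500):
    # Solve Eq. (4) by optimizing the pixels of p directly, starting from the content image.
    p = content_img.clone().requires_grad_(True)
    optimizer = torch.optim.Adam([p], lr=0.05)
    with torch.no_grad():
        feats_c = extract_features(content_img)   # fixed content targets
        feats_s = extract_features(style_img)     # fixed style targets
    for _ in range(steps):
        optimizer.zero_grad()
        feats_p = extract_features(p)
        loss = sum(w_c[l] * content_loss(feats_p[l], feats_c[l]) for l in w_c) \
             + sum(w_s[l] * style_loss(feats_p[l], feats_s[l]) for l in w_s)
        loss.backward()
        optimizer.step()   # hundreds of forward/backward passes per image -> slow
    return p.detach()
```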
A much faster method is to directly train a feed-forward network $T$ that maps a given content image to its stylized version in a single forward pass. However, the loss weights $w_c^l$ and $w_s^l$ are baked into the training objective, so changing them requires training a new network.
The second problem is that such a network generates only one stylization for a given pair of style and content images.
3. Proposed Method

To address the two issues mentioned above, the authors condition the generated stylized image on additional input parameters $\boldsymbol{\alpha} = [\boldsymbol{\alpha}_c, \boldsymbol{\alpha}_s]$, where each parameter controls the share of the loss from its corresponding layer.
They enable the users to adjust $\alpha_c^l$ and $\alpha_s^l$, which scale the fixed layer weights in the content and style losses:

$$\mathcal{L}_c(\mathbf{p}, \boldsymbol{\alpha}_c) = \sum_{l \in C} \alpha_c^l\, w_c^l\, \mathcal{L}_c^l(\mathbf{p}) \tag{5}$$

$$\mathcal{L}_s(\mathbf{p}, \boldsymbol{\alpha}_s) = \sum_{l \in S} \alpha_s^l\, w_s^l\, \mathcal{L}_s^l(\mathbf{p}) \tag{6}$$
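In code, the modified loss simply scales each layer's term by its $\alpha$ (a sketch reusing the helpers above; the dictionary-based interface is my assumption):

```python
def adjustable_loss(feats_p, feats_c, feats_s, w_c, w_s, alpha_c, alpha_s):
    # Each alpha scales the share of the loss coming from its corresponding layer (Eqs. 5-6).
    l_c = sum(alpha_c[l] * w_c[l] * content_loss(feats_p[l], feats_c[l]) for l in w_c)
    l_s = sum(alpha_s[l] * w_s[l] * style_loss(feats_p[l], feats_s[l]) for l in w_s)
    return l_c + l_s
```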
To learn the effect of $\boldsymbol{\alpha}$ on the objective, the stylizing network $T$ is conditioned on these parameters through a conditional normalization scheme.
This method transforms the activations $x$ of a layer in the feed-forward network into normalized activations $z$ conditioned on the additional inputs $\boldsymbol{\alpha}$:

$$z = \gamma_{\alpha}\left(\frac{x - \mu}{\sigma}\right) + \beta_{\alpha} \tag{7}$$

where $\mu$ and $\sigma$ are the mean and standard deviation of the activations across the spatial axes, and the scale $\gamma_{\alpha}$ and shift $\beta_{\alpha}$ are generated from $\boldsymbol{\alpha}$ by a small fully connected network $\Lambda$, i.e., $(\gamma_{\alpha}, \beta_{\alpha}) = \Lambda(\boldsymbol{\alpha})$.

Since $\Lambda$ is differentiable, it can be trained jointly with $T$ in an end-to-end fashion: $\boldsymbol{\alpha}$ is sampled randomly at each training step, and the gradients of the corresponding weighted loss

$$\mathcal{L}(\mathbf{p}, \boldsymbol{\alpha}) = \mathcal{L}_c(\mathbf{p}, \boldsymbol{\alpha}_c) + \mathcal{L}_s(\mathbf{p}, \boldsymbol{\alpha}_s)$$

are back-propagated through both networks, where $\mathbf{p}$ is the image generated by $T$.
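A minimal PyTorch sketch of this conditional normalization follows; the two-layer structure and hidden size of $\Lambda$ are my assumptions, not details from the paper:

```python
import torch
import torch.nn as nn

class ConditionalInstanceNorm(nn.Module):
    """Normalize activations, then re-scale and shift them with (gamma, beta) = Lambda(alpha)."""

    def __init__(self, num_channels, alpha_dim, hidden_dim=64):
        super().__init__()
        # Lambda: a small fully connected network mapping alpha to per-channel gamma and beta.
        self.Lambda = nn.Sequential(
            nn.Linear(alpha_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 2 * num_channels),
        )
        self.num_channels = num_channels

    def forward(self, x, alpha):
        # x: (B, C, H, W) activations, alpha: (B, alpha_dim) adjustment parameters.
        mu = x.mean(dim=(2, 3), keepdim=True)
        sigma = x.std(dim=(2, 3), keepdim=True) + 1e-5
        gamma, beta = self.Lambda(alpha).chunk(2, dim=1)
        gamma = gamma.view(-1, self.num_channels, 1, 1)
        beta = beta.view(-1, self.num_channels, 1, 1)
        return gamma * (x - mu) / sigma + beta   # Eq. (7)
```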
4. Experiment setting
They trained the stylizing network $T$ and the conditioning network $\Lambda$ jointly, sampling the adjustment parameters $\boldsymbol{\alpha}$ randomly at every training step so that the model learns to respond to any value a user may choose at test time.
Similar to previous approaches, they used the last feature set of conv3 of VGG-16 as the content layer $C$ and the last feature sets of conv1 through conv4 as the style layers $S$.
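Putting the pieces together, one training step could look roughly like this (the uniform sampling range, the optimizer interface, and the helper names are my assumptions):

```python
import torch

def train_step(T, extract_features, optimizer, content_img, style_img, w_c, w_s):
    # Sample the adjustment parameters for this step so that T (and the Lambda
    # networks inside it) learn to respond to any alpha chosen at test time.
    alpha_c = {l: torch.rand(()).item() for l in w_c}
    alpha_s = {l: torch.rand(()).item() for l in w_s}
    alpha = torch.tensor([*alpha_c.values(), *alpha_s.values()]).unsqueeze(0)

    p = T(content_img, alpha)                        # one feed-forward pass
    loss = adjustable_loss(extract_features(p),
                           extract_features(content_img),
                           extract_features(style_img),
                           w_c, w_s, alpha_c, alpha_s)
    optimizer.zero_grad()
    loss.backward()                                  # gradients flow through T and Lambda
    optimizer.step()
    return loss.item()
```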
5. Experiment

Reference
Babaeizadeh, Mohammad, and Golnaz Ghiasi. "Adjustable real-time style transfer." arXiv preprint arXiv:1811.08560 (2018).