Pixel-wise Dense Detector for Image Inpainting
Ruisong Zhang
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190 China
School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049 China
Weize Quan
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190 China
School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049 China
Baoyuan Wu
School of Data Science, the Chinese University of Hong Kong, Shenzhen, China
Secure Computing Lab of Big Data, Shenzhen Research Institute of Big Data, China
Dong-Ming Yan
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190 China
School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049 China
Abstract
Recent GAN-based image inpainting approaches adopt an averaging strategy to discriminate the generated image and output a single scalar, which inevitably loses the position information of visual artifacts. Moreover, the adversarial loss and the reconstruction loss (e.g., the ℓ1 loss) are combined with trade-off weights that are difficult to tune. In this paper, we propose a novel detection-based generative framework for image inpainting, which adopts the min-max strategy in an adversarial process. The generator follows an encoder-decoder architecture to fill the missing regions, while the detector, trained with weakly supervised learning, localizes artifacts in a pixel-wise manner. This position information makes the generator pay attention to the artifact regions and further refine them. More importantly, we explicitly insert the output of the detector into the reconstruction loss with a weighting criterion, which balances the adversarial loss and the reconstruction loss automatically rather than by manual tuning. Experiments on multiple public datasets show the superior performance of the proposed framework. The source code is available at https://github.com/Evergrow/GDN_Inpainting.
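The detector-weighted reconstruction loss described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the function name `weighted_l1_loss` and the specific weighting criterion `1 + artifact_map` are assumptions; the paper's actual criterion should be taken from the released source code.

```python
import numpy as np

def weighted_l1_loss(generated, target, artifact_map, eps=1e-8):
    """Pixel-wise weighted reconstruction loss (illustrative sketch).

    `artifact_map` is the detector's per-pixel artifact probability in
    [0, 1]. Pixels the detector judges more artifact-like receive a
    larger reconstruction weight, so the balance between reconstruction
    and adversarial signals is set by the detector output rather than a
    hand-tuned scalar trade-off weight.
    """
    weights = 1.0 + artifact_map              # emphasize detected artifacts
    l1 = np.abs(generated - target)           # per-pixel ell_1 error
    return float(np.sum(weights * l1) / (np.sum(weights) + eps))
```

With a zero artifact map this reduces to the plain mean ℓ1 error; as the detector flags more pixels, their reconstruction errors are up-weighted accordingly.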