Customized Summarizations of Visual Data Collections
Mengke Yuan
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, #95 East Zhongguancun Road, Beijing, 100190 P. R. China
School of Artificial Intelligence, University of Chinese Academy of Sciences, No.19(A) Yuquan Road, Shijingshan District, Beijing, 100149 P. R. China
Bernard Ghanem
CEMSE, King Abdullah University of Science and Technology, Thuwal, 23955-6900 Kingdom of Saudi Arabia
Dong-Ming Yan (Corresponding Author)
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, #95 East Zhongguancun Road, Beijing, 100190 P. R. China
School of Artificial Intelligence, University of Chinese Academy of Sciences, No.19(A) Yuquan Road, Shijingshan District, Beijing, 100149 P. R. China
Baoyuan Wu
School of Data Science, Chinese University of Hong Kong, 2001 Longxiang Road, Longgang District, Shenzhen, P. R. China
Secure Computing Lab of Big Data, Shenzhen Research Institute of Big Data, 2001 Longxiang Road, Longgang District, Shenzhen, P. R. China
Tencent AI Lab, Shenzhen, P. R. China
Xiaopeng Zhang
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, #95 East Zhongguancun Road, Beijing, 100190 P. R. China
School of Artificial Intelligence, University of Chinese Academy of Sciences, No.19(A) Yuquan Road, Shijingshan District, Beijing, 100149 P. R. China
Peter Wonka
CEMSE, King Abdullah University of Science and Technology, Thuwal, 23955-6900 Kingdom of Saudi Arabia
Abstract
We propose a framework to generate customized summarizations of visual data collections, such as collections of images, materials, 3D shapes, and 3D scenes. We assume that the elements in a visual data collection can be mapped to a set of vectors in a feature space, in which a fitness score for each element can be defined, and we pose the problem of customized summarization as selecting a subset of these elements. We first describe the design choices a user should be able to specify for modeling customized summarizations and propose a corresponding user interface. We then formulate the task as a constrained optimization problem with binary variables and propose a practical and fast algorithm based on the alternating direction method of multipliers (ADMM). Our results show that our problem formulation enables a wide variety of customized summarizations, and that our solver is significantly faster than state-of-the-art commercial integer programming solvers and produces better solutions than fast relaxation-based solvers.
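As a rough illustration of the selection problem described above, the following Python sketch picks a fixed-size subset of elements from their feature vectors and fitness scores. The objective (total fitness plus a weighted sum of pairwise distances), the function name `select_summary`, and the parameter `diversity_weight` are assumptions made for illustration and are not taken from the paper. The sketch solves the binary program by exhaustive enumeration rather than by the ADMM-based solver, so it is only practical for very small collections.

```python
from itertools import combinations
import math


def select_summary(features, fitness, k, diversity_weight=1.0):
    """Pick k elements maximizing total fitness plus weighted pairwise distance.

    This corresponds to a binary program with x_i in {0, 1} and sum(x) == k.
    ASSUMPTION: the objective below is a hypothetical stand-in, not the
    paper's formulation. It is solved by brute force, not by ADMM.
    """
    n = len(features)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of elements")

    best_score, best_subset = -math.inf, None
    for subset in combinations(range(n), k):
        # Fitness term: how good each selected element is on its own.
        fit = sum(fitness[i] for i in subset)
        # Diversity term: how spread out the selected elements are.
        div = sum(
            math.dist(features[i], features[j])
            for i, j in combinations(subset, 2)
        )
        score = fit + diversity_weight * div
        if score > best_score:
            best_score, best_subset = score, subset
    return list(best_subset), best_score


# Toy collection: four elements in a 2D feature space (made-up values).
features = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.0, 1.0)]
fitness = [0.9, 0.8, 0.5, 0.4]

chosen, score = select_summary(features, fitness, k=2)
print(chosen, round(score, 3))
```

Setting `diversity_weight=0.0` reduces the selection to the two highest-fitness elements, while larger values favor elements that lie far apart in feature space. The number of candidate subsets grows combinatorially with the collection size, which is the practical reason a dedicated solver is needed for the binary formulation.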
Supporting Information
Filename | Description |
---|---|
cgf14336-sup-0001-videoS1.mp4 (13.5 MB) | Video S1 |