
WORKING WITH VIDEOS
Azizbek Ruzmetov, Yetmishboyev Shakhzodbek, Kimyo International University in Tashkent

Abstract
This article presents several approaches to working with video images, such as choosing a suitable title, subtitle, and object. Within the framework of this topic, the work of many researchers is reviewed, and suggestions and recommendations are offered to users.
Keywords
Video image, artificial intelligence, subtitle, object, graphic.
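As a concrete illustration of the subtitle workflow mentioned in the abstract, the following is a minimal sketch in Python using OpenCV, which overlays a fixed subtitle on every frame of a video. The file names "input.mp4" and "subtitled.mp4" and the subtitle text are hypothetical placeholders, not taken from the article; in practice the caption text could come from a captioning model or a subtitle file.

```python
# Minimal sketch: overlay a fixed subtitle on every frame of a video.
# Assumes OpenCV is installed (pip install opencv-python).
# "input.mp4", "subtitled.mp4", and the subtitle text are placeholders.
import cv2

cap = cv2.VideoCapture("input.mp4")                 # open the source video
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

fourcc = cv2.VideoWriter_fourcc(*"mp4v")            # output codec
out = cv2.VideoWriter("subtitled.mp4", fourcc, fps, (width, height))

subtitle = "Example subtitle"                       # hypothetical caption text
while True:
    ok, frame = cap.read()
    if not ok:                                      # end of stream
        break
    # Draw the subtitle near the bottom edge of the frame.
    cv2.putText(frame, subtitle, (20, height - 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1.0, (255, 255, 255), 2,
                cv2.LINE_AA)
    out.write(frame)

cap.release()
out.release()
```

A per-shot variant of this sketch would replace the fixed string with text generated for each segment, which is closer to the captioning setups the article surveys.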
Copyright License

This work is licensed under a Creative Commons Attribution 4.0 International License.