How far is massive video retrieval from security surveillance?
On June 16, 2011, a young woman riding an electric vehicle in Nanjing was struck by a lorry. After 22 days of analyzing surveillance video from schools, shopping malls, and Internet cafes near the accident site, retrieving footage from nearby neighborhoods, and comparing more than one thousand surveillance photos, the police identified the vehicle responsible. On July 13, 2011, a Mr. Ho in Fuzhou left his LV travel bag, containing 21,000 yuan in cash, in a taxi. After receiving his report, the police station located the taxi Mr. Ho had taken in surveillance video and helped him recover the designer travel bag on the 15th. On the morning of July 27, 2011, a driver in Zunyi fled after hitting a pedestrian. Police collected surveillance video along the route and, by searching, analyzing, and assessing 144 hours of footage, identified the fleeing vehicle. In just over a month, many cases were solved with the help of surveillance video. Clearly, searching surveillance video has become an indispensable means for the police to solve cases.
With the advancement of projects such as the safe city, surveillance cameras have spread across the streets, leaving image data for most cases and bringing great convenience to the police. However, having the relevant video does not mean having found the target information. Locating and analyzing video often consumes a great deal of police time and manpower. Can relevant information be found more quickly and more easily in massive volumes of video? That depends on the further development of video retrieval technology.
The current development of video retrieval technology
Retrieval technology grew out of the needs of the developing Internet. Search based on text indexing is the most mature information retrieval technology today, and search engines such as Baidu, Google, Bing, and Yahoo are all built on it. As network bandwidth keeps increasing, people can share and interact with the multimedia information they have collected more quickly and easily, and more and more information is presented on the Internet as video and other multimedia forms. This places ever higher demands on multimedia information retrieval techniques, represented by image and video retrieval. Research on video retrieval began internationally in the early 1990s. Unlike text retrieval, image and video retrieval is based on analysis of the image and video content itself, so it is often referred to as content-based image and video retrieval. In 1992, the term "content-based video retrieval" came into use. In the more than a decade since, major theoretical breakthroughs and technological advances have been made in the acquisition, storage, processing, and transmission of video data.
Content-based video retrieval technology targets unstructured data such as audio and video. It uses techniques such as video segmentation, automatic digitization, speech recognition, shot detection, key frame extraction, automatic content association, and video structuring. Building on knowledge from image processing, pattern recognition, computer vision, and image understanding, it introduces new media data representations and data models from cognitive science, artificial intelligence, database management systems, human-computer interaction, and information retrieval, in order to design reliable and effective retrieval algorithms, system architectures, and friendly human-machine interfaces.
In content-based video retrieval, video data is divided into four hierarchical levels, from coarse to fine: Video, Scene, Shot, and Frame. Because adjacent frames within a shot change little, the difference between their features stays below a certain threshold. At an abrupt shot change, the two adjacent frames on either side of the cut differ greatly in content. If the feature difference exceeds a given threshold, a segmentation boundary is declared at that point. The key frame of a shot is the frame that reflects the main information content of the shot. Once the shots have been detected, key frames can be extracted from each one, so that each shot is expressed succinctly by its key frames. Deciding how many key frames to extract is an important issue. One method is to compute the variance of the frame differences within the shot and use it as a measure of the complexity of the shot's visual content: the larger the variance, the more key frames are extracted from the shot.
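The thresholding idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of any particular surveillance product: frames are assumed to be flat lists of grayscale pixel values, the feature is assumed to be a coarse intensity histogram, and the names `detect_shots`, `keyframe_count`, `cut_threshold`, `variance_step`, and `max_keyframes` are hypothetical, as are their default values.

```python
from statistics import pvariance


def histogram(frame, bins=8, max_value=256):
    """Normalized intensity histogram of a frame (flat list of 0-255 values)."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return [c / len(frame) for c in counts]


def frame_difference(frame_a, frame_b, bins=8):
    """L1 distance between two frame histograms (0 = same, 2 = disjoint)."""
    hist_a, hist_b = histogram(frame_a, bins), histogram(frame_b, bins)
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


def detect_shots(frames, cut_threshold=0.5):
    """Split frames into shots as (start, end) index pairs, end exclusive.

    A boundary is declared wherever the difference between two adjacent
    frames exceeds cut_threshold (an assumed, illustrative value).
    """
    if not frames:
        return []
    boundaries = [0]
    for i in range(1, len(frames)):
        if frame_difference(frames[i - 1], frames[i]) > cut_threshold:
            boundaries.append(i)
    boundaries.append(len(frames))
    return list(zip(boundaries[:-1], boundaries[1:]))


def keyframe_count(frames, start, end, variance_step=0.001, max_keyframes=5):
    """Choose how many key frames to extract from the shot frames[start:end].

    The variance of adjacent-frame differences stands in for visual
    complexity: a larger variance yields more key frames, up to a cap.
    """
    diffs = [frame_difference(frames[i - 1], frames[i])
             for i in range(start + 1, end)]
    if not diffs:
        return 1  # a single-frame shot is its own key frame
    return min(max_keyframes, 1 + int(pvariance(diffs) / variance_step))


# Three dark frames followed by two bright frames: one abrupt cut.
frames = [[10] * 16] * 3 + [[200] * 16] * 2
shots = detect_shots(frames)
print(shots)
print([keyframe_count(frames, start, end) for start, end in shots])
```

In practice the threshold would have to be tuned to the footage, and gradual transitions such as fades would need more than a single adjacent-frame comparison.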
Content-based video search has the following characteristics: first, information clues are extracted directly from the media content; second, content-based search is an approximate match, which is clearly different from the exact matching used in conventional database search; third, feature extraction and indexing can be carried out automatically by computer, which avoids the subjectivity of manual description and greatly reduces the workload. The media features used for similarity matching in content-based retrieval include color, texture, contour, shape, spatial constraints, motion, concepts, structural descriptions, and other image information.
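The contrast between approximate and exact matching can be shown with a small Python sketch that uses color, the first feature listed above. It rests on assumptions that are not from the article: key frames are represented as lists of (r, g, b) tuples, the feature is a coarse joint RGB histogram, similarity is measured by histogram intersection, and the names `color_histogram`, `histogram_intersection`, and `search` are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of a list of (r, g, b) tuples (0-255)."""
    if not pixels:
        raise ValueError("cannot build a histogram from an empty image")
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        r_bin = r * bins_per_channel // 256
        g_bin = g * bins_per_channel // 256
        b_bin = b * bins_per_channel // 256
        hist[(r_bin * bins_per_channel + g_bin) * bins_per_channel + b_bin] += 1
    return [h / len(pixels) for h in hist]


def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1]; 1 means the two color distributions coincide."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def search(query_pixels, index, top_k=3):
    """Rank indexed key frames by similarity to the query, best first.

    index maps a key frame name to its precomputed histogram. Every entry
    receives a score, so the result is a ranking rather than a yes/no
    answer as in an exact-match database lookup.
    """
    query_hist = color_histogram(query_pixels)
    scored = [(name, histogram_intersection(query_hist, hist))
              for name, hist in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical index of two key frames, built once and reused per query.
index = {
    "red_car": color_histogram([(250, 10, 10)] * 10),
    "blue_sky": color_histogram([(10, 10, 250)] * 10),
}
query = [(240, 20, 20)] * 8 + [(10, 10, 240)] * 2  # mostly red
print(search(query, index))
```

Because the histograms are computed automatically when footage is indexed, no manual description of each key frame is needed. A real system would likely combine color with texture, shape, or motion features instead of relying on color alone.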