In broadcasters’ archives and multimedia databases around the world lies a wealth of footage that remains hidden because nobody knows exactly what it contains. Indexing and searching video efficiently is difficult because the material spans many content dimensions: sound, speech, places, persons, specific imagery, camera perspective, written language, environment, time of day or year, position within the video’s running time, and so on.
Accordingly, there is a need for diligent annotation (or tagging) of video content. This task is usually performed by specialised archivists and librarians, or crowdsourced. But human annotation has its drawbacks: it takes a lot of time and expertise when done well, which in turn translates into either high costs or questionable reliability.
CASAM's answer to this problem is to combine semantic computing technology with selective requests for human input. First, the system automatically analyses the video and provides detailed, systematic descriptions across all content dimensions. Whenever the computer cannot decide, it asks a human operator for specific feedback. In this way, the system continually learns, expanding and upgrading its capabilities while considerably speeding up and simplifying the annotation process.
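The workflow described above can be sketched as a confidence-thresholded loop: annotate automatically when the analyser is sure, defer to a human operator otherwise. The following is a minimal illustrative sketch, not CASAM's actual implementation; the threshold value, the tag names, and the `auto_analyse` / `ask_operator` stand-ins are all assumptions made for the example.

```python
# Hypothetical sketch of a human-in-the-loop annotation loop in the
# spirit of CASAM. All names and values here are illustrative, not the
# project's real interfaces.

CONFIDENCE_THRESHOLD = 0.8  # below this, the system asks a human


def auto_analyse(segment):
    """Stand-in for automatic content analysis: returns (tag, confidence)."""
    return segment["predicted_tag"], segment["confidence"]


def ask_operator(segment):
    """Stand-in for a specific feedback request to a human operator."""
    return segment["human_tag"]


def annotate(segments):
    """Tag each segment automatically when confident, else ask a human."""
    annotations = []
    for seg in segments:
        tag, conf = auto_analyse(seg)
        if conf >= CONFIDENCE_THRESHOLD:
            annotations.append((seg["id"], tag, "auto"))
        else:
            # Uncertain case: the human answer could also be fed back to
            # retrain the analyser, which is how the system "learns".
            annotations.append((seg["id"], ask_operator(seg), "human"))
    return annotations


segments = [
    {"id": 1, "predicted_tag": "outdoor", "confidence": 0.95,
     "human_tag": "outdoor"},
    {"id": 2, "predicted_tag": "speech", "confidence": 0.55,
     "human_tag": "interview"},
]
print(annotate(segments))
```

With these sample segments, the first is tagged automatically and the second is routed to the operator, who corrects the tag to "interview".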
CASAM R&D was co-funded by the European Commission under the Seventh Framework Programme for Research and organised by Intrasoft International, in cooperation with the Athens Technology Centre, the National Centre for Scientific Research “Demokritos”, Deutsche Welle, the Technical University of Hamburg-Harburg, and the University of Birmingham.