PhD Thesis: "Content-based Video Indexing for Sports Applications using Multi-modal approach"

Tjondronegoro, Dian W. (2005) PhD Thesis: "Content-based Video Indexing for Sports Applications using Multi-modal approach". Deakin University.



Triggered by technological innovation, the use of video has increased enormously; its content richness makes it one of the most preferred media types for many significant applications. To sustain the ongoing rapid growth of video information, there is an emerging demand for sophisticated content-based video indexing systems. However, current video indexing solutions are still immature and lack any standard. One solution, annotation-based indexing, allows video retrieval using textual annotations. Its major limitations are the restriction to pre-defined keywords and the expensive manual work of annotating video. Another solution, feature-based indexing, allows video search by comparing low-level features, such as query by a sample image. Even though this approach can use automatically extracted features, users cannot retrieve video intuitively, based on high-level concepts. This predicament is caused by the so-called semantic gap: users recall video contents at a high level of abstraction, while video is generally stored as an arbitrary sequence of audio-visual tracks.

To bridge the semantic gap, this thesis demonstrates the use of a domain-specific approach, which utilizes domain knowledge to facilitate the extraction of high-level concepts directly from the audio-visual features. The main idea behind the domain-specific approach is the use of domain knowledge to guide the integration of features from multi-modal tracks. For example, to extract goal segments from soccer and basketball video, slow-motion replay scenes (visual) and excitement (audio) should be detected, as they are played during most goal segments. Domain-specific indexing also exploits specific browsing and querying methods, which are driven by the requirements of specific users and applications. Sports video is selected as the primary domain due to its content richness and popularity. Moreover, broadcast sports videos generally span hours with many redundant activities, and the key segments may make up only 30% to 60% of the entire data, depending on the progress of the match.
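The goal-segment example can be illustrated with a small sketch. It is not the thesis's actual algorithm: it assumes that slow-motion replay scenes and audio excitement have already been detected and are available as time intervals in seconds, and it simply flags the places where the two cues coincide or nearly coincide. The names `Interval`, `overlaps`, `candidate_goal_segments` and the `tolerance` parameter are hypothetical, introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Interval:
    """A detected cue, expressed as a time span in seconds (hypothetical type)."""
    start: float
    end: float


def overlaps(a: Interval, b: Interval, tolerance: float = 0.0) -> bool:
    """True if the two intervals overlap or lie within `tolerance` seconds."""
    return a.start <= b.end + tolerance and b.start <= a.end + tolerance


def candidate_goal_segments(
    replays: List[Interval],
    excitement: List[Interval],
    tolerance: float = 5.0,  # assumed value, not taken from the thesis
) -> List[Interval]:
    """Fuse a visual cue (replay) with an audio cue (excitement).

    A replay becomes a candidate only if some excitement interval
    overlaps it or lies within `tolerance` seconds of it. The candidate
    spans both cues.
    """
    candidates = []
    for replay in replays:
        for burst in excitement:
            if overlaps(replay, burst, tolerance):
                candidates.append(
                    Interval(min(replay.start, burst.start),
                             max(replay.end, burst.end))
                )
                break  # one supporting audio cue is enough for this replay
    return candidates


# Illustrative, made-up detections
replays = [Interval(100.0, 110.0), Interval(300.0, 310.0)]
excitement = [Interval(92.0, 99.0), Interval(500.0, 510.0)]
segments = candidate_goal_segments(replays, excitement)
```

In this made-up input only the first replay has nearby crowd or commentator excitement, so only it is kept as a candidate. Requiring agreement between modalities is what lets domain knowledge filter out replays of non-goal events.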

This thesis presents research based on an integrated multi-modal approach to sports video indexing and retrieval. By combining specific features extractable from multiple (audio-visual) modalities, generic structure and specific events can be detected and classified. During browsing and retrieval, users benefit from the integration of high-level semantics and some descriptive mid-level features, such as a whistle or a close-up view of player(s). The main objective is to contribute to the three major components of sports video indexing systems. The first component is a set of powerful techniques to extract audio-visual features and semantic contents automatically. Their main purposes are to reduce manual annotation and to summarize lengthy contents into a compact, meaningful and more enjoyable presentation. The second component is an expressive and flexible indexing technique that supports gradual index construction. An indexing scheme is essential because it determines the methods by which users can access a video database. The third and last component is a query language that can generate dynamic video summaries for smart browsing and support user-oriented retrieval.


Full-text downloads:

2,762 since deposited on 10 Nov 2005
43 in the past twelve months


ID Code: 2199
Item Type: Thesis
Refereed: No
Additional Information: This is a Deakin University thesis. However, the author is currently a staff member at QUT and wishes to make his thesis available via this eprint archive until the thesis appears in the Australian Digital Thesis collection (ADT). For more information about ADT, see link above.
Keywords: video database, multimedia, information retrieval, XML, MPEG-7, XQuery, multi-modal analysis, indexing, sports applications
Subjects: Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > LIBRARY AND INFORMATION STUDIES (080700) > Information Retrieval and Web Search (080704)
Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING (080100) > Computer Vision (080104)
Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > COMPUTER SOFTWARE (080300) > Multimedia Programming (080305)
Divisions: Past > QUT Faculties & Divisions > Faculty of Science and Technology
Institution: Deakin University
Copyright Owner: Copyright 2005 Dian Tjondronegoro
Deposited On: 10 Nov 2005 00:00
Last Modified: 09 Jun 2010 12:27
