Robust image hash functions using higher order spectra

Chen, Brenden Chong (2012) Robust image hash functions using higher order spectra. PhD thesis, Queensland University of Technology.


Robust hashing is an emerging field that can be used to hash certain data types in applications unsuitable for traditional cryptographic hashing methods. Traditional hashing functions have been used extensively for data/message integrity, data/message authentication, efficient file identification and password verification.

These applications are possible because the hashing process is compressive, allowing for efficient comparisons in the hash domain but non-invertible meaning hashes can be used without revealing the original data. These techniques were developed with deterministic (non-changing) inputs such as files and passwords.

For such data types a 1-bit or one character change can be significant, as a result the hashing process is sensitive to any change in the input. Unfortunately, there are certain applications where input data are not perfectly deterministic and minor changes cannot be avoided. Digital images and biometric features are two types of data where such changes exist but do not alter the meaning or appearance of the input. For such data types cryptographic hash functions cannot be usefully applied.

In light of this, robust hashing has been developed as an alternative to cryptographic hashing and is designed to be robust to minor changes in the input.

Although similar in name, robust hashing is fundamentally different from cryptographic hashing. Current robust hashing techniques are not based on cryptographic methods, but instead on pattern recognition techniques. Modern robust hashing algorithms consist of feature extraction followed by a randomization stage that introduces non-invertibility and compression, followed by quantization and binary encoding to produce a binary hash output. In order to preserve robustness of the extracted features, most randomization methods are linear and this is detrimental to the security aspects required of hash functions. Furthermore, the quantization and encoding stages used to binarize real-valued features requires the learning of appropriate quantization thresholds. How these thresholds are learnt has an important effect on hashing accuracy and the mere presence of such thresholds are a source of information leakage that can reduce hashing security.

This dissertation outlines a systematic investigation of the quantization and encoding stages of robust hash functions. While existing literature has focused on the importance of quantization scheme, this research is the first to emphasise the importance of the quantizer training on both hashing accuracy and hashing security. The quantizer training process is presented in a statistical framework which allows a theoretical analysis of the effects of quantizer training on hashing performance. This is experimentally verified using a number of baseline robust image hashing algorithms over a large database of real world images.

This dissertation also proposes a new randomization method for robust image hashing based on Higher Order Spectra (HOS) and Radon projections. The method is non-linear and this is an essential requirement for non-invertibility.

The method is also designed to produce features more suited for quantization and encoding. The system can operate without the need for quantizer training, is more easily encoded and displays improved hashing performance when compared to existing robust image hashing algorithms. The dissertation also shows how the HOS method can be adapted to work with biometric features obtained from 2D and 3D face images.

Impact and interest:

Search Google Scholar™

Citation counts are sourced monthly from Scopus and Web of Science® citation databases.

These databases contain citations from different subsets of available publications and different time periods and thus the citation count from each is usually different. Some works are not in either database and no count is displayed. Scopus includes citations from articles published in 1996 onwards, and Web of Science® generally from 1980 onwards.

Citations counts from the Google Scholar™ indexing service can be viewed at the linked Google Scholar™ search.

Full-text downloads:

243 since deposited on 02 Jul 2013
25 in the past twelve months

Full-text downloads displays the total number of times this work’s files (e.g., a PDF) have been downloaded from QUT ePrints as well as the number of downloads in the previous 365 days. The count includes downloads for all files if a work has more than one.

ID Code: 61087
Item Type: QUT Thesis (PhD)
Supervisor: Chandran, Vinod
Keywords: robust hashing, image hashing, higher order spectra, bispectrum, adaptive deterministic quantization, random projection, biometric template security, nonnegative matrix factorization, Fourier-Mellin, Gray code, distance distortion
Divisions: Current > QUT Faculties and Divisions > Science & Engineering Faculty
Institution: Queensland University of Technology
Deposited On: 02 Jul 2013 22:47
Last Modified: 09 Sep 2015 04:53

Export: EndNote | Dublin Core | BibTeX

Repository Staff Only: item control page