Publications
CONFERENCE (INTERNATIONAL) Audio Fingerprinting with Holographic Reduced Representations
Yusuke Fujita, Tatsuya Komatsu
The 25th Annual Conference of the International Speech Communication Association (INTERSPEECH 2024)
September 01, 2024
This paper proposes an audio fingerprinting model based on holographic reduced representations (HRR). Conventional neural audio fingerprinting requires many fingerprints per audio track to achieve high accuracy and time resolution; the proposed method reduces the number of fingerprints that must be stored. We use HRR to aggregate multiple fingerprints into a composite fingerprint via circular convolution and summation, yielding fewer fingerprints with the same dimensionality as the originals. Our search method efficiently finds the composite fingerprint that contains a query fingerprint. Using HRR's inverse operation, it can then recover the query's relative position within that composite fingerprint, retaining the original time resolution. Experiments show that our method reduces the number of fingerprints with modest accuracy degradation while maintaining time resolution, outperforming simple decimation and summation-based aggregation.
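The aggregation and position-recovery idea in the abstract can be illustrated with a minimal sketch of standard HRR operations. This is not the authors' implementation: the random "position keys", the random stand-in fingerprints, the group size of 4, the dimension of 512, and all function names are assumptions made for illustration. A real system would use fingerprints from a trained neural model and would likely compute the convolutions with an FFT.

```python
import math
import random


def circular_convolve(a, b):
    """HRR binding: out[k] = sum_j a[j] * b[(k - j) mod n]."""
    n = len(a)
    return [sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)]


def circular_correlate(a, c):
    """Approximate HRR unbinding: out[k] = sum_j a[j] * c[(j + k) mod n]."""
    n = len(a)
    return [sum(a[j] * c[(j + k) % n] for j in range(n)) for k in range(n)]


def random_vector(n, rng):
    """Vector with i.i.d. N(0, 1/n) elements (expected squared norm of 1)."""
    return [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]


def aggregate(fingerprints, position_keys):
    """Bind each fingerprint to its (hypothetical) position key and sum."""
    composite = [0.0] * len(fingerprints[0])
    for key, fp in zip(position_keys, fingerprints):
        bound = circular_convolve(key, fp)
        composite = [c + b for c, b in zip(composite, bound)]
    return composite


def recover_position(query, composite, position_keys):
    """Unbind the composite with the query; return the best-matching key index."""
    noisy_key = circular_correlate(query, composite)
    scores = [sum(x * y for x, y in zip(noisy_key, key)) for key in position_keys]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    dim, group_size = 512, 4  # illustrative values, not taken from the paper
    keys = [random_vector(dim, rng) for _ in range(group_size)]
    fingerprints = [random_vector(dim, rng) for _ in range(group_size)]
    composite = aggregate(fingerprints, keys)  # one vector stands in for four
    for true_pos, fp in enumerate(fingerprints):
        print(true_pos, recover_position(fp, composite, keys))
```

In this sketch the composite has the same dimensionality as each input fingerprint, and unbinding with a query gives a noisy copy of the key it was bound to, which is then matched against the known keys. Recovery is approximate: the noise grows as more fingerprints are summed into one composite and shrinks as the dimension increases.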
Paper: Audio Fingerprinting with Holographic Reduced Representations