Publications

Conference (International) Audio Fingerprinting with Holographic Reduced Representations

Yusuke Fujita, Tatsuya Komatsu

The 25th Annual Conference of the International Speech Communication Association (INTERSPEECH 2024)

2024.9.1

This paper proposes an audio fingerprinting model based on holographic reduced representations (HRR). Conventional neural audio fingerprinting requires many fingerprints per audio track to achieve high accuracy and time resolution; the proposed method reduces the number of stored fingerprints. We use HRR to aggregate multiple fingerprints into a composite fingerprint via circular convolution and summation, yielding fewer fingerprints with the same dimensionality as the originals. Our search method efficiently finds the composite fingerprint that contains a query fingerprint. Using HRR's inverse operation, it recovers the query's relative position within the composite fingerprint, retaining the original time resolution. Experiments show that our method reduces the number of fingerprints with modest accuracy degradation while maintaining the time resolution, outperforming simple decimation and summation-based aggregation methods.
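
The abstract describes binding fingerprints into a composite via circular convolution and recovering their relative positions with HRR's inverse operation. The following is a minimal sketch of that HRR mechanism in NumPy, not the paper's actual model: the dimensionality, the number of aggregated fingerprints, and the random position keys are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def circ_conv(a, b):
    # Circular convolution (HRR binding) computed via FFT.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def circ_corr(a, b):
    # Circular correlation (approximate HRR unbinding / inverse operation).
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

def unit(v):
    return v / np.linalg.norm(v)

D = 128  # fingerprint dimensionality (assumed for illustration)
K = 4    # number of fingerprints aggregated into one composite (assumed)

# Hypothetical per-position key vectors; each key encodes a relative
# position inside the composite fingerprint.
keys = [unit(rng.standard_normal(D)) for _ in range(K)]

# Stand-ins for K consecutive neural fingerprints of one audio track.
fingerprints = [unit(rng.standard_normal(D)) for _ in range(K)]

# Aggregate: bind each fingerprint to its position key, then sum.
# The composite has the same dimensionality D as each original fingerprint.
composite = np.sum([circ_conv(keys[k], fingerprints[k]) for k in range(K)], axis=0)

# Query with a noisy copy of one of the original fingerprints.
query = unit(fingerprints[2] + 0.1 * rng.standard_normal(D))

# Recover the relative position: unbind the composite with each key and
# compare the result to the query.
scores = [np.dot(unit(circ_corr(keys[k], composite)), query) for k in range(K)]
print("similarity per position:", np.round(scores, 2))
print("recovered position:", int(np.argmax(scores)))  # expected: 2
```

Because the keys are quasi-orthogonal random vectors, unbinding with the correct key returns a noisy copy of the bound fingerprint, so the matching position yields the largest similarity; this is the property that lets a composite fingerprint retain the original time resolution.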

Paper: Audio Fingerprinting with Holographic Reduced Representations