Videntifier vs. TMK, DML, and CNNL
The world is rife with video content - the digital world, that is. And with this massive saturation of visual communication inundating both our screens and the servers which feed them, it's helpful to have a method of keeping the bad stuff at bay.
Big companies like Facebook (Meta) have invested in building technologies that can quickly identify videos and other visual media circulating the web. This technology often takes the form of complex algorithms designed to detect whether two images are alike (and if not, how different they are).
However, video detection technologies are seldom created equal, and that's why we put together a test to see how other visual detection solutions measure up against our own. Here we hope to shed some light on the state of these algorithms as they stand today, and make improvements to our own technology where needed.
Video Recall Tests
To test Videntifier's video identification capabilities, we ran comparisons on two publicly available datasets: VCDB and SVD.
VCDB is a Large-Scale Database for Partial Copy Detection in Videos. The dataset consists of 528 videos (approximately 27 hours), collected using 28 carefully selected queries from YouTube and MetaCafe. Major transformations between the copies include insertion of patterns, camcording, scale change, picture in picture, etc. For further information about VCDB, see http://www.yugangjiang.info/research/VCDB/index.html
SVD is a large-scale short video dataset, which contains over 500,000 short videos collected from http://www.douyin.com and over 30,000 labeled pairs of near-duplicate videos. The purpose of this dataset is to analyze the performance of various video identification technologies for short videos. For further information about SVD, see https://svdbase.github.io.
We compared Videntifier against TMK (Temporal Match Kernel), an open-source solution currently available on the market. TMK is an algorithm released by Facebook that builds on pHash-style frame hashing. For the SVD test we also include published results for the DML and CNNL methods, which are both based on deep learning.
For evaluation metrics we use mean average precision (MAP).
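For readers unfamiliar with the metric, MAP averages the per-query average precision over all queries. The sketch below is a minimal, generic implementation; the query and result names are made-up toy data, not drawn from either dataset.

```python
def average_precision(ranked, relevant):
    """Average precision for one query: `ranked` is the result list in
    ranked order, `relevant` is the set of true matches for that query."""
    hits, score = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at each relevant result
    return score / len(relevant) if relevant else 0.0

def mean_average_precision(queries):
    """`queries` is a list of (ranked_results, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)

# Toy example with two queries and known ground truth:
q1 = (["a", "x", "b"], {"a", "b"})  # AP = (1/1 + 2/3) / 2
q2 = (["y", "c"], {"c"})            # AP = (1/2) / 1
print(round(mean_average_precision([q1, q2]), 4))  # prints 0.6667
```

The MAP numbers in the tables below are these per-query averages, expressed as percentages.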
VCDB Test
Method | MAP |
Videntifier(1) | 83.24 |
TMK 1(2) | 20.50 |
TMK 2(3) | 15.20 |
SVD Test
The SVD test is performed using an unmodified query video as well as transformed versions of the query video. The transformations applied are speeding, cropping, black border insertion and rotation.
Method | Original | Cropping | Black Border | Rotation | Speeding |
Videntifier | 90.64 | 86.22 | 88.60 | 89.81 | 89.62 |
TMK 1(2) | 47.72 | 0.34 | 0.05 | 0.02 | 47.48 |
TMK 2(3) | 27.71 | 0.09 | 0.01 | 0.00 | 27.81 |
DML | 78.47 | 54.07 | 68.17 | 15.59 | 76.70 |
CNNL | 55.55 | 15.61 | 18.63 | 0.15 | 51.80 |
(1) Using high-quality recall settings.
(2) Using a comparison where each query video is compared to the entire match dataset. With this method, query time increases linearly as the match dataset grows, making it impractical for large datasets.
(3) Using the FAISS similarity search library, which provides consistent query time regardless of the match dataset size. This method is recommended for TMK.
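The scaling problem in footnote (2) is easy to see in code. The sketch below is a generic brute-force matcher over feature vectors; the vectors, video names, and similarity threshold are all hypothetical stand-ins (TMK produces its own descriptors), but the O(N)-per-query cost is the point: every query touches every entry, which is what an index such as FAISS avoids.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def linear_scan(query, dataset, threshold=0.9):
    """Brute-force matching: compares the query against every entry in
    the match dataset, so cost per query grows linearly with its size."""
    return [vid for vid, vec in dataset.items()
            if cosine(query, vec) >= threshold]

# Hypothetical toy dataset of 2-D "fingerprints":
dataset = {
    "video_a": [1.0, 0.0],
    "video_b": [0.0, 1.0],
    "video_c": [0.9, 0.1],
}
print(linear_scan([1.0, 0.0], dataset))  # prints ['video_a', 'video_c']
```

An index-based library precomputes a structure over the dataset vectors so that each query inspects only a small fraction of them, which is why its query time stays roughly constant as the dataset grows.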
Comments