Fast Probabilistic File Fingerprinting for Big Data
Konstantin Tretyakov*1,
Sven Laur1,
Geert Smant2,
Jaak Vilo1 and
Pjotr Prins*2,3
1 Institute of Computer Science, University of Tartu, Estonia
2 Laboratory of Nematology, Wageningen Agricultural University, The Netherlands
3 Groningen Bioinformatics Center, University of Groningen, The Netherlands
* Corresponding author
Download article
The article is publicly available at the BMC Genomics website:
Presentation slides
Slides from the presentation given at ISCB-Asia 2012.
The PFFF tool
PFFF is an open-source command-line tool for file fingerprinting. Note that it is not always applicable, nor is it always faster than plain MD5 (see the paper), but in particular contexts it may provide significant gains.
- Binaries
- Source code
- The source code is hosted on GitHub. Feel free to fork.
Supplementary text and files
Page last updated: 01.12.2012