As a contribution to the scientific community working in the field of entity annotation, we developed a framework for comparing text annotators: systems that, given a text document, aim to find the entities the text is about, identified as Wikipedia pages. The BAT-Framework, written in Java, is accompanied by a formal framework that defines a set of problems, the way systems can be compared with each other, and a set of measures that, extending classic IR measures, fairly and fully compare the features of entity annotators.
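
To give a feel for the classic IR measures mentioned above, here is a minimal sketch of precision, recall and F1 computed over sets of annotated entities. It is written in Python for brevity and is not the BAT-Framework API: the function name, the exact-match criterion and the sample Wikipedia titles are all assumptions made for illustration.

```python
def precision_recall_f1(predicted, gold):
    """Score a set of predicted entities against a gold set (exact match only)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical annotations for one document, identified by Wikipedia page title.
gold = {"Barack_Obama", "White_House", "Washington,_D.C."}
predicted = {"Barack_Obama", "White_House", "Washington_(state)"}

p, r, f1 = precision_recall_f1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Exact match is the strictest possible criterion; a fuller comparison would also need to decide how to treat mention positions and near-miss entities, which this sketch deliberately leaves out.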

Bicriteria Data Compression (BcZip)

Bicriteria Data Compression is a novel compression paradigm that allows the user to trade decompression time against compressed size in a principled way. In short, the tool lets you specify a bound on the decompression time (say, 800 ms) and compresses the file so that the decompression time stays below that bound while the compressed size is minimized (or vice versa: a bound on the compressed size, with the decompression time minimized).

Code is available here.
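
The bicriteria trade-off described above can be illustrated with a toy sketch: among several candidate encodings of the same file, keep those whose decompression time respects the bound and pick the smallest. This is only a brute-force illustration in Python, not how BcZip works internally; the `Candidate` type, the function name and all the numbers are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    """A hypothetical encoding of one file, with its two costs."""
    name: str
    size_bytes: int
    decompress_ms: float


def pick_smallest_within_time(
    candidates: Sequence[Candidate], time_bound_ms: float
) -> Optional[Candidate]:
    """Return the smallest candidate whose decompression time meets the bound."""
    feasible = [c for c in candidates if c.decompress_ms <= time_bound_ms]
    if not feasible:
        return None  # no encoding satisfies the time bound
    return min(feasible, key=lambda c: c.size_bytes)


# Made-up measurements: smaller output tends to decompress more slowly.
candidates = [
    Candidate("dense", size_bytes=1000, decompress_ms=1200.0),
    Candidate("balanced", size_bytes=1200, decompress_ms=750.0),
    Candidate("fast", size_bytes=1500, decompress_ms=400.0),
]

best = pick_smallest_within_time(candidates, time_bound_ms=800.0)
print(best)
```

With an 800 ms bound the densest encoding is ruled out, so the selection falls on the smallest of the remaining ones. Swapping the roles of the two fields gives the "vice versa" variant: bound the size and minimize the decompression time.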