Publications
Papers to cite

Listed below are publications on the algorithms in the LMW-tree library and on their application and evaluation. Please cite these papers when referring to the use of any of the software made available by the K-tree project. If you would like to view presentations, including video recordings, please see the presentations page.

Algorithms and Data Structures
Publications about algorithms and data structures based upon the m-way nearest neighbor search tree

[PDF] De Vries, C.M., De Vine, L., Geva, S.: The EM-tree Algorithm. In: PhD thesis "Document Clustering Algorithms, Representations and Evaluation for Information Retrieval" (2014) page 168

[PDF] De Vries, C.M., De Vine, L., Geva, S.: Random Indexing K-tree. In: ADCS09: Australian Document Computing Symposium 2009, Sydney, Australia, December 4 (2009) 43-50

[PDF] Geva, S.: K-tree: a height balanced tree structured vector quantizer. In: Proceedings of the 2000 IEEE Signal Processing Society Workshop Neural Networks for Signal Processing X, vol. 1 (2000) 271-280

Document Representations
Publications about document representations for learning and search

[PDF] De Vries, C.M. and Geva, S.: Pairwise similarity of TopSig document signatures. In Australasian Document Computing Symposium 2012, 5-6 December 2012, University of Otago, Dunedin.

[PDF] Geva, S. and De Vries, C.M.: TopSig: Topology Preserving Document Signatures. In: Conference on Information and Knowledge Management 2011, 24-28 October 2011, Glasgow, Scotland.

Information Retrieval
Publications about cluster based search engines

[PDF] De Vries, C.M., Geva, S., and Trotman, A.: Distributed Information Retrieval: Collection Distribution, Selection and the Cluster Hypothesis for Evaluation of Document Clustering. In: PhD thesis "Document Clustering Algorithms, Representations and Evaluation for Information Retrieval" (2014) page 129

Clustering Evaluation
Publications on metrics, measures and evaluation tasks to compare different approaches to clustering

[PDF] Reuter, T., Papadopoulos, S., Petkos, G., Mezaris, V., Kompatsiaris, V., Cimiano, P., De Vries, C.M., Geva, S. (2013) Social event detection at MediaEval 2013: challenges, datasets and evaluation. In Eslevich, M. and van Laere, O. (Eds.) Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, MediaEval Multimedia Benchmark, Barcelona, Spain, pp. 1-2.

[PDF] De Vries, C.M., Geva, S., and Trotman, A.: Document clustering evaluation: Divergence from a random baseline. In Workshop "Information Retrieval 2012" (IR-2012), 12-14 September, 2012, Technical University of Dortmund, Dortmund, Germany.

[PDF] De Vries, C.M., Nayak, R., Kutty, S., Geva, S., Tagarelli, A.: Overview of the INEX 2010 XML mining track: clustering and classification of XML documents. In Lecture Notes in Computer Science, INEX 2010, Amsterdam (2011) In Press

[PDF] Nayak, R., De Vries, C.M., Kutty, S., Geva, S., Denoyer, L., Gallinari, P.: Overview of the INEX 2009 XML mining track: clustering and classification of XML documents. In Focused Retrieval and Evaluation: Proceedings of 8th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2009, Brisbane, Queensland (2010) 366-378

Document Clustering and Classification
Publications applying this family of algorithms to document clustering and classification

[PDF] De Vries, C.M., De Vine, L., Geva, S.: Clustering with Random Indexing K-tree and XML structure. In: Focused Retrieval and Evaluation: Proceedings of 8th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2009, Brisbane, Queensland (2010) 366-378

[PDF] De Vries, C.M., Geva, S.: K-tree: large scale document clustering. In: SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, Boston, MA, USA, ACM (2009) 718-719

[PDF] De Vries, C.M., Geva, S.: Document Clustering with K-tree. In: Advances in Focused Retrieval: 7th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2008, Schloss Dagstuhl, Germany, December 15-18 (2009) 420-431

Software
Publications on implementations of software related to clustering

[PDF] De Vries, C.M., Geva, S.: ClusterEval 1.0: Cluster quality Evaluation software. (2013)

[PDF] Großekathöfer, U., De Vries, C.M., Geva, S.: pyktree: a K-tree implementation in Python (2011)

Theses
Combinations of the papers above presented as Masters and PhD theses.

[PDF] De Vries, C.M.: Document Clustering Algorithms, Representations and Evaluation for Information Retrieval. PhD by Publication Thesis, QUT. (2014)

[PDF] De Vries, C.M.: Application of K-tree to Document Clustering. Masters by Research Thesis, QUT. (2010)