.lemma files from TreeTagger output, which means that the user does not have to prepare every text (s)he wants to analyse according to the 4-column TRACER format.
.tagged as a suffix, e.g.:
author-title.tagged
The preferred file-naming convention is
00-author-title_of_work.tagged. The naming convention helps TRACER generate the
.txt file to be used for the analysis (see below). If there are multiple works to analyse, the numbering of the files should be sequential (e.g.,
02-, etc.); digits, author and title should be separated by hyphens, and any whitespace in the work title must be represented with an underscore.
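The naming convention above can be sketched as a small Python helper. This is only an illustration, not part of TRACER: the function name `tagged_filename` is hypothetical, and replacing whitespace in the author name with underscores is an assumption (the convention only states this for the work title).

```python
import re


def tagged_filename(index: int, author: str, title: str) -> str:
    """Build a file name of the form 00-author-title_of_work.tagged.

    Hypothetical helper for illustration; TRACER does not ship it.
    """
    # Whitespace in the work title must become an underscore.
    title_part = re.sub(r"\s+", "_", title.strip())
    # Assumption: whitespace in the author name is treated the same way.
    author_part = re.sub(r"\s+", "_", author.strip())
    # Two-digit sequential number, then hyphen-separated author and title.
    return f"{index:02d}-{author_part}-{title_part}.tagged"


if __name__ == "__main__":
    works = [("author", "title of work"), ("author", "another work")]
    for i, (author, title) in enumerate(works):
        print(tagged_filename(i, author, title))
```

Numbering here starts at 00 because that is the number shown in the preferred file name; adjust the starting index if your corpus is numbered differently.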
.tagged files must be deposited in a tagged folder under TRACER's
tracer_config.xml file, users must change the
strDataDirectory path to the corresponding directory:
.txt file containing all of the texts for TRACER to study (already tokenised by sentence and sequentially ID'd), a
.lemma-list file containing all of the unique lemmas of all the texts under study, and a
.lemma file containing all of the lemmas and PoS tags of all the texts under study. TreeTagger uses a pipe to indicate lemma ambiguity, e.g.,
wordForm PoS lemma1 | lemma2. In these cases, the TRACER heuristic picks the most likely option based on the frequency in the corpus of
lemma2 and on the number of incoming links from the inflected word form (
tracer_config.xml file, users add the paths to the
.lemma files generated by TRACER.
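To make the pipe-separated lemma ambiguity described earlier more concrete, here is a simplified Python sketch. It is not TRACER's implementation: it assumes tab-separated TreeTagger lines (word form, PoS, lemma), and it resolves an ambiguous lemma field using only corpus frequency counted from unambiguous tokens. The incoming-links criterion mentioned above is left out, and all function names are hypothetical.

```python
from collections import Counter


def split_lemmas(lemma_field):
    """Split a TreeTagger lemma field such as 'lemma1|lemma2' into candidates."""
    return [c.strip() for c in lemma_field.split("|") if c.strip()]


def pick_lemma(lemma_field, lemma_freq):
    """Pick the candidate lemma with the highest corpus frequency.

    Simplified stand-in for the heuristic: frequency only.
    On a tie, the first candidate listed is kept.
    """
    candidates = split_lemmas(lemma_field)
    if not candidates:
        return lemma_field
    return max(candidates, key=lambda c: lemma_freq[c])


def resolve(lines):
    """Resolve ambiguous lemmas in tab-separated 'word<TAB>PoS<TAB>lemma' lines."""
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    rows = [r for r in rows if len(r) == 3]  # skip malformed lines
    # Count frequencies from unambiguous tokens only.
    freq = Counter(lemma for _, _, lemma in rows if "|" not in lemma)
    return [(word, pos, pick_lemma(lemma, freq)) for word, pos, lemma in rows]


if __name__ == "__main__":
    sample = ["saw\tVVD\tsaw|see", "see\tVV\tsee", "sees\tVVZ\tsee"]
    for row in resolve(sample):
        print("\t".join(row))
```

Counting frequencies only from unambiguous tokens is a design choice of this sketch; it avoids having the ambiguous entries vote for themselves.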