Author: 
TTC project (http://www.ttc-project.eu/)
Description: 
Part-of-speech tagged and lemmatized corpora on wind power in 7 languages: English, French, German, Spanish, Russian, Latvian and Chinese.
Resource type: 
corpus
Resource availability: 
available for commercial use
available for research purposes
free
Can the resource be directly downloaded?: 
Yes
Modality: 
text
Format: 
Production date: 
2012
Domain: 
Format explanation: 
*.txt files containing clean text
*.xml files containing the Dublin Core metadata of the *.txt files
*.xmi files containing the tagged and lemmatized corpus in XMI format, compliant with the UIMA Type System for TTC TermSuite
*.tsv files containing the tagged and lemmatized corpus in TSV (tab-separated values) format, i.e. one word per line with 3 columns per word (the word itself, its category and its lemma)
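
As an illustration of the TSV layout described above (one word per line; columns for the word, its category and its lemma), here is a minimal Python sketch for reading such a file. The names `Token` and `read_tokens`, the sample words and the category labels are assumptions made for this example; they are not taken from the corpus or from TTC TermSuite, and the actual tagset and file encoding may differ.

```python
import csv
import io
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    """One corpus line: surface word, part-of-speech category, lemma."""
    word: str
    category: str
    lemma: str


def read_tokens(stream) -> Iterator[Token]:
    """Yield a Token for each well-formed 3-column line of a TSV stream."""
    # QUOTE_NONE keeps quote characters in the text as literal characters.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            # Skip blank or malformed lines instead of guessing their meaning.
            continue
        yield Token(*row)


# Hypothetical sample data; the category labels are placeholders only.
sample = "turbines\tnoun\tturbine\nrotate\tverb\trotate\n"
tokens = list(read_tokens(io.StringIO(sample)))
```

To read a real file, one could pass an open file handle instead of the in-memory sample, for example `open(path, encoding="utf-8", newline="")`, assuming the files are UTF-8 encoded.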