MIT
Active
TS Corpus Word List
slug: ts-corpus-word-list
Size
7.40 MB
Downloads
36
Created
2025-09-26
Checksum (SHA256)
d8750a3ba106f484f72c4bc90b77da20d007b1bbfa4eb0d6ba128ca5c1fa3952
Description
The TS Corpus Word List is a lexical dataset of 3,386,314 Turkish word forms, including roots, derived forms, and inflected variants. It was compiled from multiple corpora within the TS Corpus project and is intended to reflect vocabulary in active use in contemporary Turkish. It can serve as a resource for linguistic research, natural language processing, and lexicographic studies, particularly work on word formation and usage patterns in Turkish.
Annotations
Task
lexicon
Language
tr
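
Verifying the download
A downloaded copy can be checked against the SHA256 checksum listed above. The following is a minimal sketch, not an official tool of the TS Corpus project: the local file name is a hypothetical placeholder (this page does not state the archive's name or format), and the helper names `sha256_of_file` and `matches_published_checksum` are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 value published on this dataset page.
EXPECTED_SHA256 = "d8750a3ba106f484f72c4bc90b77da20d007b1bbfa4eb0d6ba128ca5c1fa3952"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the whole file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the published checksum (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical file name: replace with the path of the file you downloaded.
    local_file = Path("ts-corpus-word-list.download")
    if local_file.exists():
        ok = matches_published_checksum(local_file)
        print("checksum OK" if ok else "checksum MISMATCH")
    else:
        print(f"{local_file} not found")
```

A mismatch usually indicates an incomplete or corrupted download, or that the file on the server has changed since the checksum was published.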