tdd TDD
MIT Active

TS Corpus Word List

slug: ts-corpus-word-list

Size
7.40 MB
Downloads
36
Created
2025-09-26
Checksum (SHA256)
d8750a3ba106f484f72c4bc90b77da20d007b1bbfa4eb0d6ba128ca5c1fa3952
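
A downloaded copy can be checked against the checksum above. The following is a minimal sketch, not an official tool of the dataset: the function names `sha256_of` and `verify` are illustrative, and the local file path is whatever name you saved the download under.

```python
import hashlib

# Checksum as published on this page.
EXPECTED_SHA256 = "d8750a3ba106f484f72c4bc90b77da20d007b1bbfa4eb0d6ba128ca5c1fa3952"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest matches the expected hex string (case-insensitive)."""
    return sha256_of(path) == expected.lower()
```

Calling `verify("path/to/your/download")` returns `True` only if the file is byte-for-byte identical to the published one; a mismatch usually indicates an incomplete or altered download.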

Description

The TS Corpus Word List is a lexical dataset of 3,386,314 Turkish word forms, including roots, derived forms, and inflected variants. It was compiled systematically from multiple corpora within the TS Corpus project, so that it represents vocabulary in active use in contemporary Turkish. The dataset is intended as a resource for linguistic research, natural language processing, and lexicographic studies, with broad coverage of word formation processes and usage patterns in Turkish.
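
As a rough illustration of how such a list might be used for lookups, the sketch below loads word forms into a set and normalizes case with Turkish rules. It rests on assumptions this page does not confirm: that the file is UTF-8 plain text with one word form per line, and that lowercase lookup is wanted. The names `turkish_lower` and `load_word_forms` are hypothetical.

```python
def turkish_lower(text):
    """Lowercase with Turkish casing: 'I' maps to dotless 'ı' and 'İ' maps to 'i'.

    Python's default str.lower() maps 'I' to 'i', which is wrong for Turkish,
    so both capital forms are replaced before the generic lowercasing step.
    """
    return text.replace("İ", "i").replace("I", "ı").lower()


def load_word_forms(path, encoding="utf-8"):
    """Read one word form per line (assumed layout) into a set of lowercased forms."""
    forms = set()
    with open(path, encoding=encoding) as fh:
        for line in fh:
            word = line.strip()
            if word:  # skip blank lines
                forms.add(turkish_lower(word))
    return forms
```

With the set in memory, a membership test such as `turkish_lower(token) in forms` checks whether a token is an attested word form. If the actual file uses a different layout (for example, extra columns), the parsing step would need to change accordingly.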

Annotations

Task
lexicon
Language
tr