Text Tools

From Digital Sinology
Revision as of 08:03, 22 June 2018 by Dsturgeon (talk | contribs)

Text Tools is an online platform that provides textual analysis and visualization functions for arbitrary textual materials. Its analysis and manipulation tools include computation of n-gram statistics, regular expression search and replace, text reuse identification, document similarity calculation, and text comparison by edit distance. Textual transformations such as tokenization are performed by external, user-configurable services through an open API. Visualizations available directly in the tool include network graphs, charts, word clouds, and a variety of heat-maps for visualizing text reuse and document similarity, supporting both close and distant reading.
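To make two of the analyses named above concrete, the following is a minimal Python sketch of n-gram counting and edit distance. It is not the code Text Tools itself runs; the function names `char_ngrams` and `edit_distance` are hypothetical, and the choice of character-level n-grams with whitespace ignored is an assumption made for illustration.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count overlapping character n-grams, ignoring whitespace.

    Hypothetical helper for illustration; not part of Text Tools.
    """
    chars = [c for c in text if not c.isspace()]
    return Counter("".join(chars[i:i + n]) for i in range(len(chars) - n + 1))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]
```

For example, `char_ngrams("abab", 2)` counts the bigram `"ab"` twice and `"ba"` once, and `edit_distance("kitten", "sitting")` is 3.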

Designed as a browser-based platform, Text Tools works with a corpus of texts loaded into the tool by the user. Texts can be loaded directly from the Chinese Text Project via API, or imported by dragging and dropping text files (or zipped text files) from the user's computer. Because all processing is performed on the client computer, the corpus does not need to be uploaded to a central server. Although originally designed for use with classical Chinese materials, most processing is language independent, and works written in many languages can be analyzed by applying appropriate textual transformations.
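The role of a textual transformation in language independence can be sketched as follows. This is an illustrative Python example under stated assumptions, not the platform's own code: the names `char_tokens`, `word_tokens`, and `ngram_counts` are hypothetical, and the regular-expression word tokenizer is a deliberately crude stand-in for whatever tokenization service a user might configure.

```python
import re
from collections import Counter
from typing import Callable, List


def char_tokens(text: str) -> List[str]:
    """Treat every non-whitespace character as a token
    (one plausible choice for classical Chinese)."""
    return [c for c in text if not c.isspace()]


def word_tokens(text: str) -> List[str]:
    """Split on runs of word characters, lowercased
    (a crude choice for space-delimited languages)."""
    return re.findall(r"\w+", text.lower())


def ngram_counts(text: str, n: int,
                 tokenize: Callable[[str], List[str]]) -> Counter:
    """Count token n-grams; only the tokenizer depends on the language."""
    tokens = tokenize(text)
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
```

With this split, `ngram_counts("兼愛 兼愛", 2, char_tokens)` and `ngram_counts("the cat saw the cat", 2, word_tokens)` run the same counting step, and only the transformation applied beforehand differs.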

Text reuse in the Mozi, Xunzi, and Zhuangzi visualized as a network graph.

External links