MARKUS

From Digital Sinology

MARKUS is an online annotation and text analysis platform. Its functionality includes:

1) automated tagging and identification of personal and place names, official titles, and time references in classical Chinese;
2) manual tagging of user-supplied keyword lists in all languages and creation of custom tags;
3) generation of keywords based on text analysis (keyword clipper);
4) flexible filtering of tagged content;
5) linking to a range of online reference tools, including geographical and biographical databases and language and domain-specific dictionaries for online reading;
6) export to a wide range of formats, including HTML and TEI, to ensure interoperability;
7) automatic export of tagged content and linked data from the China Biographical Database, TGAZ, and TWGIS to visualization platforms, for exploration and analysis of tagged content through the associated VISUS visualization interface (maps, network graphs, tables, timelines, pie charts, tag clouds);
8) importing texts from textual databases such as Donald Sturgeon's ctext.org through plugins such as the Chinese Text Project plugin, for easy import of a broad range of texts;
9) machine learning to improve accuracy and recall for large corpora;
10) free accounts and flexible file management, facilitating batch tagging with keyword lists or regular expressions as well as export to other text analysis and visualization platforms including PALLADIO, PLATIN, DOCUSKY, and COMPARATIVUS.

Instructional videos and materials in English and Chinese are also available, as is a forum for sharing use cases and discussing user experience. Documentation is provided through GitHub.
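To give a rough idea of what tagging a text with a user-supplied keyword list involves, here is a minimal Python sketch. It is not MARKUS's own code and does not reproduce its markup format: the function name `tag_keywords`, the `<span class="...">` output, and the tag type names are all assumptions made for illustration. The sketch builds one regular expression from the keyword list, preferring longer keywords so that a full name is not cut short by a shorter entry, and wraps each match in a span.

```python
import re
from html import escape


def tag_keywords(text, keywords, tag_type):
    """Wrap every keyword occurrence in a span (hypothetical markup, not MARKUS's format)."""
    # Drop empty entries and duplicates; try longer keywords first so that
    # a longer name wins over a shorter keyword that is its prefix.
    ordered = sorted({k for k in keywords if k}, key=len, reverse=True)
    if not ordered:
        return escape(text)

    # Escape each keyword so it is matched literally, then join into one alternation.
    pattern = re.compile("|".join(re.escape(k) for k in ordered))

    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Untagged text between matches is HTML-escaped and copied through.
        parts.append(escape(text[last:match.start()]))
        parts.append(
            '<span class="{}">{}</span>'.format(escape(tag_type), escape(match.group(0)))
        )
        last = match.end()
    parts.append(escape(text[last:]))
    return "".join(parts)


if __name__ == "__main__":
    # Illustrative input only; "fullName" is an assumed tag type name.
    print(tag_keywords("蘇軾與蘇轍同遊杭州", ["蘇軾", "蘇轍"], "fullName"))
```

Because the pattern is an ordinary regular expression, the same loop could be driven by a hand-written expression instead of a keyword list, which is the other batch-tagging mode mentioned above.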

MARKUS was developed by Brent Ho and Hilde De Weerdt with funding from the European Research Council and Digging into Data. The keyword clipper was contributed by Hsiang Jieh, Tu Hsieh-chang et al. (National Taiwan University). The machine learning module was developed by Miao Shengfa.

External links