CHinese ANcient Texts (CHANT) update

The Chinese University of Hong Kong’s subscription-based CHinese ANcient Texts (CHANT) website has recently undergone some renovation. The new interface appears to largely follow the layout and functionality of the previous version, although there seem to be a few puzzling changes, such as the omission of source text explanations and lists of emendations, both of which were previously available at least for most texts in the Pre-Qin and Han section. A new feature is the option to display a text with or without the corrections applied to it, by clicking the new “校改” and “原文” buttons.

The site now makes extensive use of JavaScript to fetch pages, which can be frustrating as there is no indication that a page is loading. A huge benefit of the new site, however, is that the annotations finally display correctly in browsers other than Internet Explorer.

Coinciding with the change, it has become possible (at least at my institution) to access more areas of the database, though it’s unclear whether this is due to an accidental change, a policy decision, or the library paying a higher subscription rate.

Posted in Digital humanities | Comments Off on CHinese ANcient Texts (CHANT) update

Reviews of Digital Resources for Sinologists

Titled “Digital Resources for Sinologists 1.0”, this post by Holger Schneider and Jeffrey Tharsen gives a useful overview of digital dictionaries and other online tools for Chinese, many of which will be of interest to Sinologists, as well as offering some background on general aspects of digital dictionaries of Chinese.

Separated into two parts, titled “An Introduction to Chinese Electronic Dictionaries and Criteria for Their Evaluation” and “An Annotated List of Common Digital Dictionaries / Lexical Tools / Learning and Translation Tools / Encyclopædia”, it provides a valuable guide to many of the most important tools and databases in the field, as well as introducing a few less well-known resources that may also be of interest.


Classical Chinese Wordles

Xunzi

The ever-popular Wordle, like many tools designed to work with digital corpora, can be used on Chinese text with minor tweaking. Wordle takes a text and ranks the words in it in order of frequency, then produces a tag cloud that gives a visual summary with more frequently occurring words in larger letters. Though many tools do this, Wordle’s output is often particularly attractive.

To use Wordle with Chinese, the text first has to be split into words using spaces or other punctuation; if not, Wordle will treat each phrase as if it were a word. So instead of “孟子見梁惠王。”, we really want “孟子 見 梁惠王。”. Adding a space between each character is a reasonable approximation for classical Chinese, but obviously means that proper names like “孟子” don’t get treated correctly. Once the text is ready, it can be pasted straight into the Wordle tool (this requires that Java is installed and enabled in your browser). With Chinese text, there are a couple of extra steps. Firstly, on my system at least, the default font doesn’t work for Chinese, so initially I get empty boxes instead of Chinese words. To fix this, go to the Wordle font menu and choose a different font (e.g. “Chrysanthi Unicode”, which seems to work). Secondly, Chinese seems to be detected by Wordle as Arabic, and this results in random words being omitted; to avoid this, click on the “Language” menu in Wordle and change the setting to “Do not remove common words”.
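The character-splitting step can be scripted rather than done by hand. The following is only a minimal Python sketch, not part of Wordle itself: the function names `is_han` and `space_out` and the optional `keep` list of names are made up for this illustration, and the Unicode ranges used to recognise Chinese characters are a rough approximation that covers the common blocks only.

```python
def is_han(ch):
    """Rough test for a Chinese character (assumption: common CJK blocks only)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF        # CJK Unified Ideographs
        or 0x3400 <= cp <= 0x4DBF     # Extension A
        or 0x20000 <= cp <= 0x2A6DF   # Extension B
        or 0xF900 <= cp <= 0xFAFF     # Compatibility Ideographs
    )


def space_out(text, keep=()):
    """Return the text with one space between tokens.

    Each Chinese character becomes its own token, except for any strings
    listed in `keep` (e.g. proper names), which are kept together.
    Punctuation and whitespace are dropped.
    """
    # Try longer names first so that a longer name wins over a shorter one.
    terms = sorted(keep, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(text):
        match = next((t for t in terms if text.startswith(t, i)), None)
        if match:
            tokens.append(match)
            i += len(match)
        elif is_han(text[i]):
            tokens.append(text[i])
            i += 1
        else:
            i += 1  # skip punctuation, spaces, line breaks
    return " ".join(tokens)


print(space_out("孟子見梁惠王。"))
print(space_out("孟子見梁惠王。", keep=("孟子", "梁惠王")))
```

Supplying a hand-made list of names through `keep` is one simple way of working around the proper-name problem mentioned above; without it, every character is treated as a separate word.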

Hanfeizi

The tag clouds here are of the full texts of the Mozi, Mengzi, Hanfeizi, Xunzi, and Daodejing from the Chinese Text Project – can you work out which is which?

Mozi

Wordle has the option to automatically remove some of the most common words in a language from the list – so that uninteresting words such as “a”, “the”, “of” and so on don’t appear as giant words overwhelming the tag cloud. Since Wordle doesn’t have a list for classical Chinese, I excluded a fairly arbitrary set of words from the input to produce these images: 也 之 以 則 而 其 曰 者 於 與 于 不. Other particles such as 矣 should probably also be added to this list.
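Excluding these words before pasting the text into Wordle is also easy to script. This is again just a sketch under stated assumptions: it expects input that already has a space between each word, the names `STOP_WORDS`, `filter_tokens` and `top_words` are invented for this example, and the counting function is only a way of previewing the frequencies, not a reproduction of how Wordle ranks words internally.

```python
from collections import Counter

# The same fairly arbitrary exclusion list given above.
STOP_WORDS = set("也 之 以 則 而 其 曰 者 於 與 于 不".split())


def filter_tokens(spaced_text, stop_words=STOP_WORDS):
    """Drop excluded words from space-separated text."""
    return " ".join(tok for tok in spaced_text.split() if tok not in stop_words)


def top_words(spaced_text, n=10):
    """Return the n most frequent words with their counts."""
    return Counter(spaced_text.split()).most_common(n)


sample = "王 曰 利 王 之 利 王"
cleaned = filter_tokens(sample)
print(cleaned)
print(top_words(cleaned, 2))
```

Adding further particles such as 矣 is just a matter of extending `STOP_WORDS`.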

This highlights an important difficulty with word clouds in classical Chinese, however. Words like “無”, “為”, and “有” are very common in classical Chinese texts, but they are also philosophically interesting – in certain contexts and usages. Similarly “故” is a very common and not terribly interesting sentence connective meaning something like “thus” or “therefore”, but is also used to mean “cause”; “是” often simply means “this”, but can also mean “right”, “approve”, or “correct”.

Daodejing

As a result, a highly prominent appearance of 無 and 為 as in some of these Wordles isn’t necessarily an indication that the source was a Daoist text like the Daodejing – in fact if you look closely, you’ll see that in all of these texts 無 and 為 appear fairly often.

Mengzi

Even with these caveats, however, this is a much more interesting and aesthetically pleasing way to look at the data than browsing a table of word frequencies.

 


Classical Chinese internet resources

A huge though largely unsorted list of Chinese language web sites and resources related to the study of early China has been assembled here:
http://ctext.org/discuss.pl?if=en&thread=3223065


Yīntōng: Chinese Phonological Database

Yīntōng is an online database, created by David Prager Branner, of characters in the Guǎngyùn 廣韻, a dictionary dating from 1008 C.E.

The database has the following main functions:

  • Lookup by character, returning information about the fǎnqiè associated with the character, the phonological values represented by those fǎnqiè, and the page number of the Guǎngyùn where that reading appears.
  • Lookup by medieval Chinese reading, returning a list of the other characters in the same xiǎoyùn.
  • Lookup by two medieval Chinese readings, returning a list of any characters appearing in both xiǎoyùn.
  • Lookup by multiple characters, returning a transcription of each character based on the Guǎngyùn’s readings.

Further details: http://yintong.americanorientalsociety.org/html/about.htm.
