IT Panel Abstracts

Presentation of Online Projects 


Tibetan Information Technology Panel
E. Gene Smith Memorial Session

Chairs: Paul G. Hackett
(Columbia University)
Lauran Hartley
(Columbia University)
Susan Meinheit
(U.S. Library of Congress)
Tashi Tsering
(China Tibetology Research Centre)

Abstracts (in order of presentation)

Jeff Wallman
Tibetan Buddhist Resource Center (TBRC)
“Preservation & Access of Tibetan Texts in a Global Setting”

Digital libraries hold extraordinary promise, yet the challenges of developing Tibetan digital libraries are significant. Behind the work of E. Gene Smith is a mission for the Tibetan Buddhist Resource Center that addresses some of the problems of preservation and access for Tibetan digital libraries and offers a vision for the future. This presentation identifies the challenges of preservation and access in general and, in particular, as they relate to Tibetan digital libraries. The challenges of preservation include technological barriers to entry, data protection and disaster recovery, digital obsolescence, the size and structure of Tibetan collections, differentiating data and metadata, the cost of building and maintaining digital libraries, and the issues of ownership and secrecy. The challenges of access include multilanguage user interface design, the competing needs of a wide variety of users, the insufficiency of core bibliographic metadata, the structure and size of Tibetan collections, the lack of comprehensive classificatory schemas, the digital divide and the responsibility of digital libraries beyond the digital form, and the true costs of developing and maintaining multilanguage resources. Drawing on methodologies from library and information science and enterprise software engineering, the presentation concludes with the suite of integrated solutions implemented in the architecture of the TBRC Library.

Paul G. Hackett
Columbia University
“Digital Resources for Research and Translation of the Tibetan Buddhist Canon”

Building on previous research, this paper presents an overview of the bibliographic and lexical resources currently being deployed online for students and scholars interested in translating classical Tibetan literature, in particular the Kangyur and Tengyur and related literature. The database being demonstrated has been designed as a comprehensive central index for a wide range of resources: scanned page images and e-texts of both primary and secondary texts. These resources have been linked both to one another and to additional external resources, together with bibliographic references and hypertext links. This presentation will provide examples of the use of this database, highlighting the depth of annotation and linking available on the site, as well as the different forms of research enabled by it.

Charles Manson
Bodleian Library, Oxford University
“Development of a digital catalogue of the Bodleian Tibetan manuscripts: Bod Karchak@the Bod”

This presentation reports on the ongoing development of the digital catalogue of the Tibetan manuscripts held at the Bodleian Libraries, Oxford. The collection contains purchases and donations from (chronologically) the Schlagintweit brothers, Csoma de Koros, Capt. L.A. Waddell, the Indian Government, W.Y. Evans-Wentz, Michael Aris, Hugh Richardson, and T. Skorupski. The presentation will first describe the contents of the Bodleian Tibetan manuscripts collection and the status of earlier catalogues, and then give details of the process of developing a digital catalogue of the manuscripts, named Bod Karchak, along the lines of the Fihrist digital catalogue for Arabic manuscripts. Bod Karchak is intended to be part of the process of developing a union catalogue for the Tibetan manuscripts held in libraries in the UK, and will be readable in Tibetan script as well as in several transliteration systems, thus demonstrating some of the special advantages of digital technology for manuscript cataloguing.

Kelsang Tahuwa
Digital Preservation Society (DPS), Tokyo
“DPS electronic editions of the Them spangs ma manuscript bka’ ’gyur (with comparative catalogue) and the Peking bka’ ’gyur housed in the National Library of Mongolia”

This presentation discusses two DPS electronic bka’ ’gyur collections: the rationale for their creation and a brief history of DPS, technical challenges and solutions, and the format for distribution. It will be followed by a demonstration and questions from the audience. Time allowing, it will also include a presentation and brief discussion of the newly compiled comparative catalogue for the Them spangs ma bka’ ’gyur.

Robert R. Chilton
Orient Foundation for Arts and Culture
“The Classical Tibetan Knowledge Archive and Multimedia Study Resource – a new research tool for the study of the oral commentarial and ritual arts traditions of Tibet”

This paper presents the Classical Tibetan Knowledge Archive and Multimedia Study Resource, a new research tool for the study of the oral commentarial and ritual arts traditions of Tibet, created under the auspices of the Orient Foundation for Arts and Culture (OFAC). The Archive and Multimedia Study Resource is a unique and extensive archive of documentary materials, including over 14,500 hours of detailed line-by-line oral commentaries on classical Buddhist and Bon texts given by leading scholars and lineage holders, and more than 600 hours of detailed video documentation of the classical dance, music, and ritual arts traditions. The presentation includes a demonstration of the multimedia archive, which is now accessible online.

Susan Meinheit
(U.S.) Library of Congress
“Digital projects at the Library of Congress”

The Asian Division of the Library of Congress currently holds about 16,115 volumes of Tibetan texts. Of these, some 3,600 volumes are considered “rare” in that they were not acquired via the New Delhi Office through the well-known PL-480 program or its successor, the SFC program, or via standard current acquisitions. These collections can be divided into five general categories: 1) Kangyur/Tengyur; 2) early 20th-century collections, primarily acquired by Joseph Rock, William Rockhill, and Berthold Laufer between 1899 and 1941; 3) modern xylograph/reprint collections in limited editions, such as recent prints from Dege Parkhang; 4) fragments; and 5) artifacts and thangkas. The first two are the current focus of efforts to digitize our Tibetan collections. This presentation offers brief updates on several projects that were first presented at the IATS-XII (Vancouver) conference.

Alexander Gardiner and Asha Kaufman
Rubin Foundation
“Collaborative On-line Database Publication: The Treasury of Lives”

This paper will describe the Treasury of Lives as a collaborative on-line database and provide a tour of its significant features and contributions to the field of Tibetan and Himalayan Studies. The website, a biographical encyclopedia of Himalayan religion, is a collaborative project created and operated by the Shelley & Donald Rubin Foundation in New York. Over 40 authors have contributed its 700+ biographies to date; as of March 2012, all new content is peer reviewed by Editorial and Advisor Committees composed of scholars in North America and Europe. The website is creating comprehensive biographical portraits of teaching traditions, monasteries, and incarnation lines. With its targeted facet feature, users can browse via specific combinations of institution, place, tradition, and time period. Also to be discussed are the phonetic standard that was developed with the Tibetan Buddhist Resource Center and the Rubin Museum of Art, and the dynamic search capacity that lets users enter a wide range of possible phonetic renderings and still arrive at the proper result. The expanding collaborative features of the site include pedagogical components that will establish the site as a valuable teaching tool.

Klu tshang rDo rje rin chen
[“On the Necessity of Digitizing Documents Stored in the Library of Bla brang Monastery”]

To be added.



Presentation of Research and Resources

Kiyonori Nagasaki and Toru Tomabechi
[“International Institute for Digital Humanities”]
“Indo-Tibetan Lexical Resource (ITLR), a Collaborative Project of the Khyentse Centre for Tibetan Buddhist Textual Scholarship (KC-TBTS) at the University of Hamburg and International Institute for Digital Humanities (DHII), Tokyo, and the SAT Daizokyo Text Database at the University of Tokyo”

This presentation provides a brief introduction to the Indo-Tibetan Lexical Resource (ITLR), a digital (online) reservoir of Sanskrit (including Buddhist Hybrid Sanskrit or Middle Indic) words/phrases and names of persons, places, and scriptures/treatises with corresponding attested Tibetan translation(s), including a host of other information pertinent to the main entries, all of which are backed up by references to primary and secondary sources. Our introduction will cover the concept of the ITLR project in general, its inception, the current state of affairs, and our short-term as well as long-term objectives, focusing on how we are proceeding towards the creation of a research tool that is reliable, continually improvable, infinitely extendable, and easily accessible not only to those who pursue textual studies by employing historical-philological tools and techniques but also to those who are involved in translation projects on a personal or institutional level. The presentation first introduces the structure of the database in general, then highlights the possibilities it provides for both contributors and potential users, and concludes by presenting some sample entries of varying kinds.

Pavel Grokhovskiy
Saint-Petersburg State University
“The Basic Corpus of the Classical Tibetan Language with a Russian Translation and a Lexical Database”

This paper presents the ongoing work of the working group of the project “The Basic Corpus of the Classical Tibetan Language with a Russian Translation and a Lexical Database.” The foundation for the construction of the corpus of the Classical Tibetan language is a set of texts representing various genres of Classical Tibetan literature (in form, both prose and verse; in content, historical, biographical, epic, folklore, philosophical, didactic, and doctrinal texts, as well as texts dedicated to the medieval sciences). The corpus aims at: 1) linguistic research on the Tibetan language, and 2) lexicographical description of the corpus language materials in the format of a lexical database.

Lobsang Monlam
Monlam Tibetan Information Technology Research Center
“On the Framework of the Complete Monlam Tibetan Dictionary”

སྨོན་ལམ་བོད་ཀྱི་ཀུན་བཏུས་ཚིག་མཛོད་ཆེན་མོ་རོམ་རྩོམ་སྒྲིག་དང་བཟོ་སྐྲུན་བྱེད་པའི་དགོས་དམིགས་ནི། བོད་དང་ཧི་མ་ལ་ཡའི་མི་རིགས་ཁག་སོགས་སྐད་ཡིག་རིག་གནས་གཅིག་གྱུར་ཡིན་པའི་སློབ་གཉེར་པ་དང་། ཞིབ་འཇུག་པ། དེ་བཞིན་ཤེས་ཡོན་གྱི་ལས་ཀ་སྒྲུབ་མཁན་ཚོར་བོད་ཡིག་མཉེན་ཆས་ཚིག་མཛོད་ཅུང་ཚད་ལྡན་ཞིག་མཁོ་སྒྲུབ་བྱ་རྒྱུ་དང་། བོད་རིག་པར་ཞིབ་འཇུག་དང་སློབ་གཉེར་བྱེད་པའི་ཕི་རྒྱལ་མཁས་པ་དང་། སློབ་ཕྲུག་སོགས་ལ་དེང་རབས་ཀྱི་ཆུ་ཚད་ཡོད་པའི་བོད་ཡིག་མཉེན་ཆས་ཚིག་མཛོད་ཅིག་མཁོ་སྒྲུབ་བྱེད་རྒྱུ་དེ་ཡིན།

The Complete Monlam Tibetan Dictionary aims to compile an authoritative Tibetan dictionary covering Tibetan terms and words in their entirety, both secular and spiritual. The methodologies adopted towards that end are vast and varied, and involve the collaboration of Tibetan scholars everywhere, including inside Tibet. The nature of our endeavour demands editorial expertise. As of now, we have approached the concerned authorities of the four major schools of Tibetan Buddhism, as well as Jonang and Bon, to set up separate editorial sections, and the response has been overwhelming. We saw the need to collaborate on a work of this scale, most of all for reasons of authenticity. However, we will be overseeing the logistics and action plans of this monumental project. It aims to develop digital dictionary software and applications that run on multiple devices, for the convenience of a growing body of Tibetan scholars. It also aims to give overseas scholars and Tibetologists access to reliable sources a click away.

Edward Garrett
School of Oriental and African Studies (SOAS), University of London
“Natural Language Processing (NLP) Pipelines for Tibetan Corpora”

This presentation explores the use of Apache’s Unstructured Information Management Architecture (UIMA) as a bridge between the user’s search experience (exemplified by Solr), and back-end NLP and machine learning technologies. It shows how UIMA components such as ClearTK can be adapted for Tibetan and plugged into existing sites with considerable benefit and minimal disruption.

Tashi Tsering
China Tibetology Research Center
“Introducing Two Tibetan Unicode Fonts and One Tibetan Input Method for Windows System”

At the 12th International Seminar for Tibetan Studies in 2010, on behalf of our project members from the China Tibetology Research Center (CTRC), I formally introduced ten Unicode Tibetan fonts and one Tibetan Wylie keyboard, which were quite unique at that time. The fonts are named Qomolangma Tibetan Fonts and the keyboard is called Qomolangma Wylie Keyboard; they have been used by Tibetan computer users around the world. Three years later, continuing the work of the same project group, we have developed two more Unicode Tibetan fonts and one Tibetan input method for Windows. The two new fonts are Ume fonts in different styles. The input method differs considerably from other Tibetan keyboards, although its keyboard layouts are familiar to most users: it is based on Tibetan words and phrases, and it supports two keyboard layouts, the Wylie keyboard layout and the China national standard Tibetan keyboard layout. The input method is called Qomolangma Tibetan Phrase-Based Keyboard.

Nachum Dershowitz and Lior Wolf
The School of Computer Science, Tel Aviv University
“Automatic Scribal Analysis of Tibetan Writings”

With the increasing access to old Tibetan manuscripts and xylographs in recent decades, scholars of Tibetan textual studies are faced with new challenges and opportunities. Whereas, until recently, the content of this material has garnered the bulk of researchers’ attention, we are seeing increasing interest in codicological, paleographical, and material aspects of these documents. Following a successful interdisciplinary workshop held in Hamburg, we have been collaborating with Orna Almogi and Dorji Wangchuk (University of Hamburg) in analysing Tibetan manuscripts. We apply the same methods to these Tibetan manuscripts as have been successful in our recent work with the Cairo Genizah. The Genizah is a collection of handwritten documents containing some 350,000 fragments discovered in Cairo in the late 19th century. Most fragments were written between the 10th and the 14th centuries, almost all of them in Hebrew characters, but in a variety of languages (Hebrew, Judeo-Arabic and Aramaic). Today, the fragments are spread out in more than seventy collections worldwide. Using computer-vision and machine-learning algorithms, we have been able to automatically classify Genizah manuscripts by script style and to identify hundreds of new “joins”, that is, matches between leaves in the same hand and originally part of the same manuscript, but now catalogued separately. Initial experiments were conducted with the first 30 volumes of the bKa’ gdams gsung ’bum collection. These volumes contain 123 different manuscripts written in a variety of scripts, ranging from dBu can to different kinds of dBu med, and in various hands. Our results show that the same software is able to accurately match manuscripts that were written in the same script and subtypes of scripts. Possibly, pending further verification, scribal matching (identification of the same hand) can also be achieved. In addition, our software provides codicological metadata about each page, including the number of lines, the size of the characters, the density of writing, and other structural information.
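
As a rough illustration of the kind of per-page metadata mentioned above, the following sketch estimates line count, mean line height, and ink density from a binarized page using a simple horizontal projection profile. This is a swapped-in textbook technique chosen for brevity, not the authors' software; the function name `page_metadata` and the input format (rows of 0/1 pixels, 1 = ink) are assumptions made for the example.

```python
from typing import Dict, List


def page_metadata(page: List[List[int]]) -> Dict[str, float]:
    """Estimate simple layout statistics for a binarized page (1 = ink, 0 = blank).

    Hypothetical helper: a text line is taken to be a maximal run of
    consecutive pixel rows that contain at least one ink pixel.
    """
    # Horizontal projection profile: amount of ink in each pixel row.
    row_ink = [sum(row) for row in page]

    # Collect the height (in rows) of every run of inked rows.
    run_heights = []
    start = None
    for y, ink in enumerate(row_ink):
        if ink > 0 and start is None:
            start = y                      # a text line begins
        elif ink == 0 and start is not None:
            run_heights.append(y - start)  # a text line ends
            start = None
    if start is not None:                  # line touching the bottom edge
        run_heights.append(len(row_ink) - start)

    total_pixels = sum(len(row) for row in page)
    return {
        "lines": len(run_heights),
        "mean_line_height": sum(run_heights) / len(run_heights) if run_heights else 0.0,
        "ink_density": sum(row_ink) / total_pixels if total_pixels else 0.0,
    }
```

Real manuscript images would need binarization, deskewing, and tolerance for touching or overlapping lines before a profile like this becomes meaningful.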

Michael Sheehy
Tibetan Buddhist Resource Center (TBRC)
“Charting Par khang Culture: Towards an Analytics of Early Xylographic Literary Production in Tibet”

Analyzing the production of printed texts up to the establishment of the great eighteenth-century publishing houses, we will present a preliminary survey of par zhing (woodblock) prints and their related par khang (printeries) in Tibet. Based on TBRC Library digital assets and interpretive data models, this research project deciphers the provenance of each woodblock from its publication information, correlating each xylographic work with a printery. Once the woodblocks are correlated with their place of production, each printery is identified descriptively or by a GIS code located on a map of the Tibetan plateau. Dates for when a printery was established and when a work was printed enable us to chart a timeline of early Tibetan print culture. Such data sets are well suited to visualizing the spatial and temporal production of this literature. The import of such a digital research project is that it gives a fuller understanding of patterns and trends in the cultural history and geography of book printing in Tibet. This paper will systematically walk through the steps in this process, discussing both the technical and the scholarly methodologies employed in the digital research library. We will conclude with remarks on the possibilities, assumptions, and limitations of Tibetological research in the Digital Humanities.
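
The correlation of prints with printeries and dates described above could be tabulated along the following lines. This is only a minimal sketch: the record type `PrintRecord`, its fields, and the function `prints_per_decade` are hypothetical and do not reflect the TBRC data model.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class PrintRecord:
    """Hypothetical record linking one xylographic work to its printery."""
    title: str
    printery: str   # name or GIS code of the place of production
    year: int       # year the work was printed


def prints_per_decade(records: Iterable[PrintRecord]) -> Counter:
    """Count prints per (printery, decade), the raw material for a timeline."""
    counts: Counter = Counter()
    for rec in records:
        decade = rec.year // 10 * 10   # e.g. 1538 -> 1530
        key: Tuple[str, int] = (rec.printery, decade)
        counts[key] += 1
    return counts
```

Sorting the resulting keys by decade gives a simple timeline per printery, which could then be joined to map coordinates for spatial visualization.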

Lauran Hartley
Columbia University
“Re-examining the Role of University Libraries in the Service of Tibetan Studies”

Drawing on data from academic research library databases and with reference to current trends in university library and information science, this paper aims to identify challenges in the current model and practice of Tibetan Studies librarianship at North American universities. Taking as a premise that it is the role of academic research libraries not just to collect, but also to organize, preserve, and make knowledge accessible, this paper then explores models for how academic research libraries can better leverage their institutional strengths in a new environment through partnerships with private initiatives, cooperative projects across academic institutions, and more robust support of classroom instruction and research.



Roundtable Discussions

Burkhard Quessel (moderator)
British Library
“Tibetan digital humanities: the wider perspective”

A discussion of broader issues in the field of digital humanities specifically pertaining to Tibetan.

Jeff Wallman (moderator)
Tibetan Buddhist Resource Center (TBRC)
“Technical issues and future directions”

A discussion of technical issues pertaining to research and deployment of digital tools and projects.