Computerizing the Scoring Process

One of the first hurdles in developing an automatic system of speech content analysis has been that a person, rather than a machine, has had to label each word in a speech transcript with the appropriate syntactic tag indicating how the word is used in a sentence. Even without an automated parser, several early and interesting attempts were made to apply computer techniques to content analysis. Philip Stone and his colleagues pioneered a large group of these studies and developed computer programs capable of classifying content (the General Inquirer System) and of ordering these content categories relative to one another in interesting ways. Benjamin Colby also successfully used a computer to perform content analysis of primitive folk tales from Eskimo, Japanese, and Ixil Maya cultures.

In the field of psychiatry and psychoanalysis, attempts to use computerized methods to analyze content have been limited mostly to the analysis of classes of words that manifestly denote certain psychological categories, such as love, anxiety, hostility, intellectual processes, and so forth. Most automated content analysis projects have been based on single-word or single-phrase tag schemes. The main shortcoming of these systems is that they discard too much highly pertinent information. They fail to identify who did or felt what about whom. They cannot meaningfully classify referring words such as "it," "that," "which," "those," "these," and so forth. They cannot score emotionally charged words that are impossible to classify properly out of context, such as "get" in "I'll get you" or "bucket" in "He kicked the bucket." And they entirely miss the meaning of idiomatic or colloquial expressions, as in these two examples.
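The limitation described above can be illustrated with a minimal Python sketch. It is not code from any of the systems mentioned; the dictionaries `WORD_TAGS` and `IDIOM_TAGS`, their entries, and the function names are hypothetical and chosen only for illustration. Tagging words one at a time finds nothing scorable in "He kicked the bucket," whereas a lookup over word sequences can recognize the idiom.

```python
# Hypothetical single-word tag dictionary. The words and categories are
# illustrative only and are not taken from any published scale.
WORD_TAGS = {
    "hate": "hostility",
    "afraid": "anxiety",
    "love": "love",
}

# Hypothetical idiom table, also illustrative only.
IDIOM_TAGS = {
    ("kicked", "the", "bucket"): "death",
}


def tokenize(sentence):
    """Lowercase the sentence and strip surrounding punctuation from each word."""
    return [w.strip(".,!?\"'").lower() for w in sentence.split()]


def tag_single_words(sentence):
    """Tag each word in isolation, as a single-word tag scheme would."""
    return [(w, WORD_TAGS[w]) for w in tokenize(sentence) if w in WORD_TAGS]


def tag_idioms(sentence):
    """Tag multi-word idioms by scanning for known word sequences."""
    words = tokenize(sentence)
    hits = []
    for idiom, tag in IDIOM_TAGS.items():
        n = len(idiom)
        for i in range(len(words) - n + 1):
            if tuple(words[i:i + n]) == idiom:
                hits.append((" ".join(idiom), tag))
    return hits


if __name__ == "__main__":
    sentence = "He kicked the bucket."
    print(tag_single_words(sentence))  # no single word carries a tag
    print(tag_idioms(sentence))        # the idiom as a whole is recognized
```

Even this idiom lookup still says nothing about who did what to whom, which is why the work described next relied on a parser.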

The goal we set was to develop computer software that could understand grammar and syntax, that could parse natural language, and that could be taught to understand idioms and slang. Collaborating with two computer scientists, Hausmann and Brown, and using a PDP-10 computer, Gottschalk demonstrated that the Gottschalk-Gleser Hostility Outward scale could be successfully machine-scored from typescripts of speech. They used Woods's Augmented Transition Network parser, translated into UCI LISP, and modified this software to run on the PDP-10. They changed its grammar to cover certain linguistic constructions that frequently occur in spoken discourse. In addition, they created a small dictionary of several hundred entries that could be held in the computer's core memory. Since the Gottschalk-Gleser content analysis method derives a score from the action verb in a clause in conjunction with the noun phrases that function as actors and recipients of that action, a technique was developed for assigning meaning to each of these constituents. Verbs were assigned semantic features called "verb-types," based on the thematic categories and their weights on the Gottschalk-Gleser Hostility Outward scale. In initial testing of this automated method on 100 sentences taken at random from the Manual of Instructions for Using the Gottschalk-Gleser Content Analysis Scale, 60% were correctly recognized, parsed, and scored. The typescripts of six five-minute speech samples were also scored for hostility outward by expert human content analysis technicians, and these scores correlated 0.80 (by a Spearman rank difference method) with the scores obtained by the computerized method. This correlation was considered equal to the lowest acceptable level of human intercoder reliability in scoring the Gottschalk-Gleser Content Analysis scales. However, the computer scoring missed many codable categories that human scorers readily recognized.
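The idea of scoring a clause from its action verb together with its actor and recipient can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the original UCI LISP implementation: the clauses are built by hand rather than produced by a parser, the `VERB_TYPES` entries and their weights are hypothetical rather than taken from the published scale, and the rule that a hostile act directed at the speaker is not counted is a simplification made for this example.

```python
from dataclasses import dataclass

# Hypothetical verb-type table mapping a verb to a weight. The verbs and
# weights are illustrative only, not the published scale values.
VERB_TYPES = {
    "kill": 3,
    "criticize": 2,
    "dislike": 1,
}

SELF = "self"    # the speaker
OTHER = "other"  # anyone else


@dataclass
class Clause:
    """A clause already reduced (by hand here) to actor, verb, and recipient."""
    actor: str
    verb: str
    recipient: str


def score_clause(clause):
    """Return the hostility-outward weight of one clause, or 0 if unscorable."""
    weight = VERB_TYPES.get(clause.verb)
    if weight is None:
        return 0  # the verb carries no hostile verb-type in this table
    if clause.recipient == SELF:
        return 0  # simplifying assumption: hostility aimed at the speaker is not outward
    return weight


def score_sample(clauses):
    """Sum clause weights; a raw total, not the scale's final score formula."""
    return sum(score_clause(c) for c in clauses)


if __name__ == "__main__":
    sample = [
        Clause(SELF, "kill", OTHER),        # scorable
        Clause(SELF, "walk", OTHER),        # verb has no verb-type
        Clause(OTHER, "criticize", SELF),   # directed at the speaker
    ]
    print(score_sample(sample))
```

The sketch shows why a parse is needed: the same verb contributes a different score depending on which noun phrases fill the actor and recipient roles.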

In 1982 Gottschalk and Bechtel reported research in which they developed a computerized method of scoring the Gottschalk-Gleser Anxiety scale. The software was again written in UCI LISP and ran on a mainframe-class computer. Although the average computer-derived anxiety score from 25 five-minute speech samples was significantly lower than the average anxiety score obtained by human scoring, the correlation between the two sets of total anxiety scores was highly significant (r=0.85, p<.0001). The correlations for the six anxiety subscale scores ranged from 0.58 (for shame anxiety) to 0.92 (for mutilation anxiety).

A few years later, Gottschalk and Bechtel developed a PC-based (DOS) program for the problem, demonstrating much improved results in the computer's ability to recognize scorable clauses applicable to both the Gottschalk-Gleser Anxiety scale and three Hostility scales. Interscorer reliability between automated and human scoring was 0.80 or above for total scores and most subscale scores.

Since that time, the system has been converted to run under Microsoft Windows (all versions since Windows 3.1), and support has been added for many additional scales. Other improvements have focused on research and clinical use, including multiple sample support, flexible text output options, and spreadsheet-compatible score file generation.

