From: "Charles Jannuzi" <jannuzi@edu00.f-edu.fukui-u.ac.jp>
To: Multiple recipients of list <nifl-esl@literacy.nifl.gov>
Date: Thu, 24 Jan 2002 06:42:11 -0500 (EST)
Subject: [NIFL-ESL:7013] Phonemes vs. gestural routines

Jannuzi vs. Nissen
Nissen vs. Jannuzi

Decide for yourselves. Actually, realize this is a good-natured exchange between John N. and me. My point is, there are theories and models which support other ways to account for speech production, transmission and perception.

Phonemic accounts:

http://www.arts.uwa.edu.au/LingWWW/LIN101-2001/NOTES-101/phonologyI.html
http://www.ling.umd.edu/pablos/Phony_h2.htm
http://www.ling.udel.edu/kabak/conf2001/abstracts/sung.html

Gestural:

http://www.haskins.yale.edu/haskins/MISC/RESEARCH/GesturalModel.html
http://www.sign-lang.uni-hamburg.de/intersign/Workshop2/CrashbornHulstKooij/crasbor_hulst_Kooij.html#A4_2
(this might support John, but it might support me; interesting anyway)

And this from
http://www.indiana.edu/~srlweb/publication/manuscript211.pdf

The process of speech perception may be limited to the auditory channel alone as in the case of a telephone conversation. However, in everyday spoken language the visual channel is also involved as well and the study of multi-modal speech perception and spoken language processing is one of the central areas of current research.
While stimulus variability, perceptual constancy, and neural representation are core problems in all areas of perception research, speech perception is unlike other perceptual processes because the perceiver also produces spoken language and therefore has intimate knowledge of the signal source. This relationship, combined with the high communicative load of speech, constrains the signal significantly and affects both perception and production strategies (Lieberman 1963; Fowler & Housman, 1987; Lindblom, 1990).

Speech perception is also unique in its remarkable robustness in the face of a wide range of environmental and communicative conditions. The listener's [percept] remains remarkably constant in the face of a significant amount of production related variation in the signal. Furthermore, even in the worst of environmental conditions in which large portions of the signal are distorted or masked, the spoken message is recovered with little or no error. As we shall see, part of this perceptual robustness derives from the richness and redundancy of information in the signal, part of it lies in the highly structured nature of language, and part comes from the context dependent nature of spoken language.

Extracting meaning from the acoustic signal may at first glance seem like a relatively straightforward task. It would seem to be simply a matter of identifying the acoustically invariant characteristics in the frequency and time domains of the signal that correspond to the appropriate serially ordered linguistic units (i.e. reversing the encoding of those mental units by the production process). From those units the hearer can then retrieve the appropriate lexical entries from memory. Although stated rather simply here, this approach is based on an assumption about the process of speech perception that has been at the core of most symbolic processing approaches (Studdert-Kennedy, 1976).
That is, the process involves the segmentation of the signal into discrete and abstract linguistic units such as features, phonemes, or syllables. Before or during segmentation the extra-linguistic information is segregated from the intended message and is processed separately or discarded. For this process to succeed, the spoken signal must meet two conditions. The first, known as the invariance condition, is that there is invariant information in the signal that is present in all instances that correspond to the perceived linguistic unit. The second, known as the linearity condition, is that the information in the signal is serially ordered so that information about the first linguistic unit precedes and does not completely overlap or follow information about the next linguistic unit and so forth.

It has become apparent to speech researchers over the last 40 years that the invariance and linearity conditions are almost never met in the actual speech signal (Liberman, 1957; Chomsky & Miller, 1963; Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967). This has led to several innovations that have achieved varying degrees of success in accommodating some of the variability and much of the nonlinearity inherent in the speech signal (Liberman, Cooper, Harris, & MacNeilage, 1963; Liberman & Mattingly, 1985; Blumstein & Stevens, 1980; Stevens & Blumstein, 1981). However, inter- and intra-talker variability remains an intractable problem within these conceptual/theoretical frameworks.

Recent approaches that treat the signal holistically have proven promising alternatives. Much of the variability that researchers sought to strip away in traditional approaches contains important information about the talker and about the intended message. Recent approaches, while differing significantly in their view of perception, treat the signal as information rich.
The information in the speech signal is both 'linguistic', the traditional message of the signal, and 'non-linguistic' or 'indexical' (Abercrombie, 1967; Ladefoged & Broadbent, 1957), information about the talker's immediate physical and emotional state, about the talker's relationship to the environment, the social context, etc. (Pisoni, 1996). Much of the variability and redundancy in the signal can be used to enhance the perceptual process rather than being discarded as noise (Klatt, 1976, 1989; Fowler, 1986; Goldinger, 1990; Johnson, 1997).

This whole paper is well worth reading, though it is a slog, I admit.

Charles Jannuzi