Date: Sun, 13 Jul 2008 07:27:01 +0100
From: "Humanist Discussion Group (by way of Willard McCarty <willard.mccarty-AT-mccarty.org.uk>)" <willard-AT-LISTS.VILLAGE.VIRGINIA.EDU>
Subject: 22.116 understanding language
To: <humanist-AT-Princeton.EDU>

               Humanist Discussion Group, Vol. 22, No. 116.
      Centre for Computing in the Humanities, King's College London
 www.kcl.ac.uk/schools/humanities/cch/research/publications/humanist.html
                      www.princeton.edu/humanist/
                 Submit to: humanist-AT-princeton.edu

        From: Ms Mary Dee Harris <marydeeh-AT-yahoo.com> (22)
        Subject: Re: 22.114 understanding language (in the right way)?

        From: amsler-AT-cs.utexas.edu (44)
        Subject: 22.114 understanding language (in the right way)?

--------------------------------------------------------------------
        Date: Sun, 13 Jul 2008 07:21:14 +0100
        From: Ms Mary Dee Harris <marydeeh-AT-yahoo.com>
        Subject: Re: 22.114 understanding language (in the right way)?

I initially missed the date of the quotation when I read the post, but when I saw Willard's comment asking about accomplishments "in the last 46 years", I wasn't surprised. While I believe the statement that we don't understand the underlying mechanisms of language is largely still true, there is considerably more understanding of the usage of language, thanks to statistical studies over the last decade.

I recently attended the ACL conference in Columbus, Ohio, and found that most papers reported on the use of statistical analysis of corpora in one way or another. With the availability of large-scale corpora in a number of different languages, more and more work is being presented to show how language is used. While I was not able to hear all the papers, given the multiple simultaneous tracks, it was apparent to me that computational linguists still have a ways to go in finding the underlying mechanisms and understanding language as a self-organizing system.
So I would say that we have taken Bar-Hillel's first step, "Any work in this field must start from an analysis and understanding of language use", but we haven't moved very far past that.

Mary Dee Harris
Chief Language Officer
Catalis, Inc.
Austin, TX

--------------------------------------------------------------------
        Date: Sun, 13 Jul 2008 07:24:33 +0100
        From: amsler-AT-cs.utexas.edu
        Subject: 22.114 understanding language (in the right way)?

Ah... pivotal moments in the history of computational linguistics.

The thing to remember is that the entire field of computational linguistics came into existence as a result of this conclusion. The field didn't exist until the efforts to use computers to do translation failed so dramatically back in the 1960s. In the USA the result was the now-infamous ALPAC report, which effectively said that machine translation was an area of research, not implementation. It cut the funding by non-research government agencies for building machine translation systems and started the research field of computational linguistics.

Yes, we're closer to modeling language as a self-organizing system (which is pretty much what using neural networks to model language phenomena has done), but neural networks aren't responsible for most of the advances in computational linguistics over these last 46 years. Those came from hard reasoning about language and efforts to construct and test programs using those reasoning principles, from the collection of data about words and language use, and from exponential improvements in computer hardware.

The advances have been significant. Grammar was given a formal basis by Chomsky, which broke its exclusive ownership by linguists and gave mathematicians and the new people who worked with computers the ability to experiment with it.
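[To make the point about formalization concrete for readers outside the field: once grammar is treated as a mathematical object, a machine can test sentences against it. The toy grammar, lexicon, and sentences below are invented for illustration; this is a minimal sketch of a CYK recognizer over a context-free grammar in Chomsky normal form, not anything from the original discussion.]

```python
# Toy context-free grammar in Chomsky normal form: A -> B C, plus a lexicon.
# The rules and words here are invented purely for illustration.
from itertools import product

RULES = {
    ("NP", "VP"): "S",
    ("Det", "N"): "NP",
    ("V", "NP"): "VP",
}
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "saw": {"V"},
}

def recognizes(words):
    """CYK: table[i][j] holds the nonterminals deriving words[i : i+j+1]."""
    n = len(words)
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][0] = set(LEXICON.get(w, ()))
    for span in range(2, n + 1):           # substring length
        for i in range(n - span + 1):      # start position
            for k in range(1, span):       # split point
                for b, c in product(table[i][k - 1],
                                    table[i + k][span - k - 1]):
                    if (b, c) in RULES:
                        table[i][span - 1].add(RULES[(b, c)])
    return "S" in table[0][n - 1]

print(recognizes("the dog saw the cat".split()))  # True
print(recognizes("dog the saw".split()))          # False
```

[The same table-filling idea underlies the parsing algorithms mentioned below, with richer grammars and much larger lexicons.]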
The new fields of speech synthesis and speech recognition (now an everyday experience over telephones and through interactions with electronic gadgetry in cars and computers) came into existence and became commercial successes. Machine translation led to the creation of terminology banks that substituted data for expertise in performing translations. Parsing algorithms pushed the development of computational mechanisms for using grammar and lexicon to fuel the deconstruction (parsing) and reconstruction (generation) of language.

Progress was continually pushed onward by the relentless pressure of computer hardware developments, which made each order-of-magnitude increase in computer speed seem as if it ought to solve the problems of the last generation. Text corpora were created of millions, billions, and now trillions of words. So much text was captured that statistical techniques for extracting lexicon and understanding language phenomena became possible, if not essential, as hand-analysis of all the data became impractical.

We still don't have the theoretical understanding of language expected to be possible back in the 1960s, but it is hard to be that critical of the accomplishments. Not knowing how hard the problem was can only make those early criticisms about the lack of progress sound like complaints born of hubris. In the end, I tend to think tinkering has been pretty successful, and that theoretical developments come as much from seeing the imperfect constructs that others have cobbled together as they do from theoretical preconception. Sometimes you just have to throw a rock in the water to see what happens rather than think through what theoretically might happen.
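[For readers curious what "statistical techniques" means at its simplest: much corpus work begins with counting. The sketch below estimates a conditional word probability from bigram counts; the miniature "corpus" is invented for illustration and stands in for the million- and billion-word collections the post describes.]

```python
# Minimal corpus statistics: maximum-likelihood bigram probabilities.
# The tiny corpus here is made up; real work uses vastly larger text.
from collections import Counter

corpus = ("the dog saw the cat . the cat saw the dog . "
          "the dog ran .").split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def p(next_word, given):
    """Estimate P(next_word | given) as count(given, next_word) / count(given)."""
    return bigrams[(given, next_word)] / unigrams[given]

print(p("dog", "the"))  # 0.6 -- "the" is followed by "dog" in 3 of its 5 uses
```

[Scaled up, counts like these are what let lexicon and usage patterns be extracted automatically once hand-analysis became impractical.]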