Corpus Studies of Constituent Ordering
Tom Wasow

An example, from Steven Pinker's The Language Instinct, p. 131:

    In my laboratory we use it as an easily studied instance of mental grammar, allowing us to document in great detail the psychology of linguistic rules from infancy to old age in both normal and neurologically impaired people, in much the same way that biologists focus on the fruit fly Drosophila to study the machinery of the genes.

One of the other 119 possible orders:

    In my laboratory we use it as an easily studied instance of mental grammar, allowing us to document the psychology of linguistic rules in great detail in both normal and neurologically impaired people, from infancy to old age in much the same way that biologists focus on the fruit fly Drosophila to study the machinery of the genes.

And another order:

    ?? In my laboratory we use it as an easily studied instance of mental grammar, allowing us to document in much the same way that biologists focus on the fruit fly Drosophila to study the machinery of the genes in both normal and neurologically impaired people, in great detail the psychology of linguistic rules from infancy to old age

What makes some orders sound more natural than others?

The answer might shed light on the psychological processes underlying language use.

It might also have practical applications:
    for on-line style checkers
    for machine translation
    for other applications requiring robust generation

The Alternations I Studied

Heavy Noun Phrase Shift:
    We take too many dubious idealizations for granted.
    We take for granted too many dubious idealizations.
The Verb-Particle Construction:
    We figured out the problem.
    We figured the problem out.
Dative Alternation:
    Kim handed a toy to the baby.
    Kim handed the baby a toy.

Factors I Looked At

    Structural complexity (or weight)
    Discourse status (or newness)
    Semantic connectedness of verb and following constituents
    Lexical biases of verbs
    Ambiguity avoidance

Grammatical Weight

Behaghel's Gesetz der Wachsenden Glieder:
    Von zwei Gliedern von verschiedenem Umfang steht das umfangreichere nach.
Translation (Law of Growing Constituents):
    Of two constituents of different size, the larger one follows the smaller one.
In other words: Simple phrases precede complex ones.

Many Proposals to Make Behaghel's Generalization Precise

    Some absolute, others relative
    Some categorical, others graded
    Corpus data support a relative, graded definition
    Various proposed measures are so highly correlated that they can't be distinguished

Categorical Weight Definitions

    An NP is heavy if it "dominates S" [Ross (1967, rule 3.26)]
    "the condition on complex NP shift is that the NP dominate an S or a PP" [Emonds (1976; 112)]
    "Counting a nominal group as heavy means either that two or more nominal groups...are coordinated...., or that the head noun of a nominal group is postmodified by a phrase or clause" [Erdmann (1988; 328), emphasis in original]
    "the dislocated NP [in HNPS] is licensed when it contains at least two phonological phrases" [Zec and Inkelas (1990; 377)]
    "it is possible to formalize the intuition of 'heaviness' in terms of an aspect of the meaning of the constituents involved, namely their givenness in the discourse" [Niv (1992; 3)]

Graded Weight Definitions

    Number of words dominated [Hawkins (1990)]
    Number of nodes dominated [Hawkins (1994)]
    Number of phrasal nodes (i.e. maximal projections) dominated [Rickford et al. (1995; 111)]

Numbers of Examples

                HNPS       DA    V-Prt
    V DO X    10,592      426      496
    V X DO       694      615    1,205
    TOTAL     11,286    1,041    1,701
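
The three graded weight measures, and the relative (difference-based) use of them, can be made concrete with a small sketch. Everything specific here is an assumption made for illustration: the nested-tuple tree encoding, the set of labels treated as phrasal, the choice to count the constituent's own root among the nodes, and the two example bracketings. None of it is taken from the corpus annotation used in the study.

```python
# A tree is a tuple (label, child, ...); a leaf is a plain string (one word).
# Assumption: these labels are the ones treated as phrasal (maximal projections).
PHRASAL_LABELS = {"S", "NP", "VP", "PP", "ADJP", "ADVP"}


def count_words(tree):
    """Weight as the number of words in the constituent."""
    if isinstance(tree, str):
        return 1
    return sum(count_words(child) for child in tree[1:])


def count_nodes(tree):
    """Weight as the number of labeled nodes, root included (an assumption)."""
    if isinstance(tree, str):
        return 0
    return 1 + sum(count_nodes(child) for child in tree[1:])


def count_phrasal_nodes(tree):
    """Weight as the number of nodes whose label is in PHRASAL_LABELS."""
    if isinstance(tree, str):
        return 0
    own = 1 if tree[0] in PHRASAL_LABELS else 0
    return own + sum(count_phrasal_nodes(child) for child in tree[1:])


def relative_weight(first, second, measure):
    """Relative weight: weight of one constituent minus weight of the other."""
    return measure(first) - measure(second)


# Hypothetical bracketings of the two constituents in
# "We take for granted too many dubious idealizations."
heavy_np = ("NP",
            ("ADJP", ("ADV", "too"), ("ADJ", "many")),
            ("ADJ", "dubious"),
            ("N", "idealizations"))
light_pp = ("PP", ("P", "for"), ("V", "granted"))

for measure in (count_words, count_nodes, count_phrasal_nodes):
    print(measure.__name__,
          measure(heavy_np),
          measure(light_pp),
          relative_weight(heavy_np, light_pp, measure))
```

A positive difference means the NP is heavier than the PP on that measure. With these toy bracketings all three measures agree in sign, which is the kind of agreement the high correlations among the measures would lead one to expect.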
Testing Adequacy of Categorical Definitions using HNPS

[Figure: bar chart by weight definition (Ross, Emonds, Erdmann). Series: % of heavy NPs that are shifted; % of shifted NPs that are heavy. Vertical axis: 0 to 90.]

Testing Categorical Definitions as Relative Criteria using HNPS

[Figure: bar chart by weight definition (Ross, Emonds, Erdmann). Series: % of cases with heavy NPs and light PPs exhibiting shifting; % of shifted NPs that are heavy and follow light PPs. Vertical axis: 0 to 100.]

Weight Effects Increase Smoothly

[Figure: HNPS by NP Weight. Horizontal axis: weight in phrasal nodes (1 to 7, over 7). Vertical axis: 0 to 100.]

Weights of Both Constituents Matter in HNPS

[Figure: Mean Weights in HNPS (in phrasal nodes), for NP and PP. Series: V NP PP order; V PP NP order. Vertical axis: 0 to 10.]

Weights of Both Constituents Matter in DA

[Figure: Mean Weights in DA (in phrasal nodes), for DO and IO. Series: V DO to IO construction; V IO DO construction. Vertical axis: 0 to 5.]

The Overlap of the Weight Measures

[Figure: bar chart by construction (HNPS, DA). Series: Phrasal Nodes; Nodes; Words. Vertical axis: 0 to 100. Bar labels: 91.4, 89.9, 88.4, 90.6, 86.8, 91.]

More on Overlap of Weight Measures: Heavy NP Shift

[Figure: Series: Words; Nodes; Phrasal Nodes. Horizontal axis: weight of NP minus weight of PP (<0, 0 to 7).]

More on Overlap of Weight Measures: Dative Alternation

[Figure: Series: Words; Nodes; Phrasal Nodes. Horizontal axis: weight of direct object minus weight of indirect object (<0, 0 to 7).]

Still More on Overlap of Weight Measures: Verb-Particle Construction

[Figure: Series: Words; Nodes; Phrasal Nodes. Horizontal axis: weight of NP (1 to 6). Vertical axis: 0 to 100.]

Correlation Coefficients for 3 Weight Measures

                             HNPS     DA    V-Prt
    Words & Nodes             .94    .96      .99
    Words & Phrasal Nodes     .96    .97      .95
    Nodes & Phrasal Nodes     .94    .96      .98

HNPS and Collocations
[Figure: bar chart comparing Transparent Collocations and Non-collocations. Vertical axis: 0 to 50.]

Two Verb Classes and HNPS

Vt (for "transitive verbs") require NP objects in all their subcategorizations:
    bring, carry, make, place, put, set, take.
Vp (for "prepositional verbs") can occur with NP objects but also have uses with an immediately following PP and no NP object:
    add, build, call, draw, give, hold, leave, see, show, write.

Predictions

          SPEAKER'S PERSPECTIVE     LISTENER'S PERSPECTIVE
    Vt    HNPS rare                 HNPS relatively common
    Vp    HNPS relatively common    HNPS very rare
Results from Brown Corpus

[Figure: bar chart for Vt and Vp. Vertical axis: 0 to 10. Bar labels: 9.3, 5.6.]
Results from Switchboard Corpus

[Figure: bar chart for Vt and Vp. Vertical axis: 0 to 5. Bar labels: 3.82, 1.45.]

Two Verb Classes and DA

Vs (for "sentential verbs") may be followed by an NP and that-clause or infinitival VP:
    offer, show, teach, tell, write
Vn (for "non-sentential verbs") may not be followed by an NP and that-clause or infinitival VP:
    assign, bring, give, hand, pay, send, take

Predictions

          SPEAKER'S PERSPECTIVE              LISTENER'S PERSPECTIVE
    Vs    double object relatively common    double object relatively rare
    Vn    double object relatively rare      double object relatively common

Corpus Results for DA Verb Classes

[Figure: DA Rates by Verb Class. Series: Vn; Vs. Groups: Brown Corpus; Switchboard Corpus. Vertical axis: up to 100.]

End of material on weight

Newness

The Given-Before-New Principle, as formulated by Clark & Clark:
    Given information should appear before new information.
Many variants in the literature.
Are weight and newness distinct effects?

    New information requires more words to convey than old information (e.g., descriptions vs. pronouns)
    Is one of these factors just a side-effect of the other?
    Surprisingly, nobody asked this question until a few years ago.
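
One way to ask whether newness is just a side-effect of weight is to hold relative weight fixed and compare shifting rates for new and given direct objects within each weight bin. The sketch below illustrates that idea only. The bin labels, the tuple layout, the function name `shift_rates`, and all of the observations are invented for illustration; they are not the study's data or its actual procedure.

```python
from collections import defaultdict


def shift_rates(observations):
    """Return the proportion of shifted cases for each (weight_bin, is_new) cell.

    observations: iterable of (weight_bin, is_new, is_shifted) tuples.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [shifted, total]
    for weight_bin, is_new, is_shifted in observations:
        cell = counts[(weight_bin, is_new)]
        cell[0] += int(is_shifted)
        cell[1] += 1
    return {key: shifted / total for key, (shifted, total) in counts.items()}


# Invented toy observations: (relative-weight bin, DO is new?, DO is shifted?)
toy = [
    ("DO<PP", True, False), ("DO<PP", True, True),
    ("DO<PP", False, False), ("DO<PP", False, False),
    ("DO>>PP", True, True), ("DO>>PP", True, True),
    ("DO>>PP", False, True), ("DO>>PP", False, False),
]

rates = shift_rates(toy)
for (weight_bin, is_new), rate in sorted(rates.items()):
    status = "new" if is_new else "given"
    print(f"{weight_bin:7s} {status:5s} {rate:.2f}")
```

If newness were reducible to weight, the new and given rates inside a single weight bin should be roughly equal; a consistent gap within bins would suggest that newness contributes something of its own.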
Weight and newness are distinct.

    With my students, I conducted corpus analyses and a production experiment to tease weight and newness apart.
    Both methods showed the two factors were not reducible to one.

Weight vs. Newness in Heavy NP Shift Corpus Study

[Figure: Series: NEW DO; GIVEN DO. Horizontal axis: from DO<PP to DO>>PP. Vertical axis: 0% to 100%.]

Weight & Newness Aren't the Whole Story

    "On this side of the Atlantic, the Lancaster-Oslo/Bergen corpus was designed to replicate as closely as possible the Brown corpus, the only difference being that this corpus contains British rather than American English texts."
    Judith Klavans, "Computational Linguistics", in W. O'Grady, M. Dobrovolsky, & F. Katamba, Contemporary Linguistics: An Introduction

Another Factor: Semantic Connectedness

Behaghel again:
    das geistig eng Zusammengehörige auch eng zusammengestellt wird
Translation:
    What belongs together mentally is also placed close together

Collocations and Idioms

    Idioms (semantically opaque collocations): bring pressure to bear
    Semantically transparent collocations: bring the meeting to an end
    Non-collocations: ...bring a pencil to the meeting

Heavy NP Shift and Semantic Connectedness

[Figure: bar chart for Non-collocation, Transparent Collocation, and Idiom. Vertical axis: 0% to 70%.]

Dependent vs. Independent Particles

Dependent: They ate the cookies up.
    The meaning of up is dependent on the meaning of ate, since the cookies don't go up.
Independent: They picked the cookies up.
    The meaning of up is independent of the meaning of picked, since the cookies go up.

Particle Position and Semantic Connectedness

[Figure: Series: V - OBJ - PRT; V - PRT - OBJ. Groups: Independent Prt; Dependent Prt. Vertical axis: 0.0% to 75.0%.]

Another Factor: Verb Bias

[Figure: bar chart by verb (give, hand, bring, send, sell). Vertical axis: 0% to 100%.]

Possible Explanations for Factors Influencing Order Variation

    Short before long is easier to process, because hard tasks are postponed.
    Given before new facilitates efficient communication by establishing common ground.

Possible Explanations for Factors Influencing Order Variation (continued)

    Long phrases and new information are hard to produce and thus get postponed.
    Choices in word order allow speakers flexibility in production.
    Our memory for words includes information about what constructions they occur in and how frequently.

Another Possible Factor: Ambiguity Avoidance

Global ambiguity:
    I saw a man wearing an odd hat with a telescope.
    I saw with a telescope a man wearing an odd hat.
Local ambiguity:
    They gave Grant's letters to Lincoln to a museum.
    They gave a museum Grant's letters to Lincoln.

Corpus Study of Global Ambiguity and Heavy NP Shifting

[Figure: Series: Unambiguous; Only syntactically ambiguous; Fully ambiguous. Horizontal axis: NP length minus PP length (-7 or less, -4 to -6, -1 to -3, 0, 1-3, 4-6, 7 or more). Vertical axis: 0% to 100%.]

Corpus Search for Local Ambiguity

    Few ambiguities of the relevant form (3)
        The company gave the U.S. rights to the drug to the Population Council
    More unambiguous word orders (56)
        Giuliani gave the commissioner the ceremonial key to the city
    But all unambiguous cases are also cases of short-before-long.

Experimental Method

1. Speaker silently reads a sentence:
    A museum received Grant's letters to Lincoln from the foundation.
2. Sentence disappears from screen. Listener reads question from list:
    What did the foundation do?
3. Speaker answers the listener's question:
    The foundation gave .... the museum, um, Grant's letters to Lincoln.
   Listener chooses the correct response on list (from two choices).

Experimental Results on Local Ambiguity

[Figure: Series: V-NP-PP bias; V-NP-NP bias. Groups: No potential local ambiguity; Potential local ambiguity. Vertical axis: V-NP-NP, 0% to 100%.]

Implications of Experiment
    Phrase ordering is not driven by an ambiguity avoidance strategy
    Instead, it's influenced by lexical preferences and structural complexity
    There is a reverse ambiguity effect that needs to be explained.

Why the reverse ambiguity effect? A Conjecture

    Lexical/Structural Priming: the ambiguous stimuli have the same PP in the NP as is needed in the PP response:
        They received Grant's letters to Lincoln from a foundation.
        They received Grant's letters about Lincoln from a foundation.

Why don't speakers use phrase ordering to avoid ambiguity?

    Listeners have a very high tolerance for ambiguity
    Pragmatics rules out many ambiguities
    Prosody rules out others
    Still others just don't matter

Conclusions

    A variety of syntactic, semantic, and discourse factors influence ordering.
    Some things you might think would influence ordering don't.
Questions for Future Research

    How strong is each of the various factors influencing ordering?
    Do ordering preferences in speech and writing differ?
    Why are people so tolerant of ambiguity?