Acquisition of Lexical Knowledge from Ngram
JHU Summer Workshop 2009 Tutorial
Satoshi Sekine, New York University

Why Acquisition of Lexical Knowledge?

  Russia: Top Official Is Fatally Shot in North Caucasus
  "The highest-ranking law enforcement official in the Russian Republic of Dagestan and one of his deputies were fatally shot when a gunman strafed their car with automatic weapon fire as they left a restaurant on Friday, Russia's chief prosecutor announced. Interior Minister Adilgerei Magmed Tagirov, left, and his chief of logistics died in the hospital. The attack underlines continuing violence in the North Caucasus. Though separatism has been suppressed in Chechnya, clashes between armed militants and authorities are reported on a weekly basis in the neighboring republics of Dagestan and Ingushetia."
  (From the New York Times, Saturday, June 6, International section, page A6)

Tutorial at JHU Summer Workshop (NYU), Knowledge Discovery for NLP, Satoshi Sekine

Why Acquisition of Lexical Knowledge?
Lexical knowledge is essential for semantic analysis.

- We need knowledge that is not explicit in the sentences or words, so we have to create it in advance.
- Semantic knowledge is huge, but we now have huge corpora.
- We have to discover it (semi-)automatically; otherwise we would have to create it by hand.

Outline
1. Background: Why acquisition of lexical knowledge?
2. Kinds of lexical knowledge
3. General technologies
   1) Distributional similarity
   2) Lexico-syntactic patterns
   3) Rewrite & verify

   4) Bootstrapping
(This slide and more are available at http://nlp.cs.nyu.edu/sekine/JHU2009)
4. Tool: Ngram Search Engine
5. Task of the afternoon project
6. Bibliography

Machine Learning
In the past decade, we have had successful experiences using machine learning:
- FOR: POS taggers, NE taggers, parsers
- BY: HMM, DT, DL, MaxEnt, SVM, CRFs
These tasks can be translated into labeling problems with a handful of classes.

Supervised ML does not work on

Lexical Knowledge!
Supervised ML has limitations in semantic problems because of:
- the enormous number of classes (if we could even enumerate them)
- the limitation of training data
- lexical dependence (sparseness)
Hence, it is reasonable to believe that it is inherent, rather than accidental, that these semantic tasks are irreconcilable with supervised ML methods. So we want to acquire lexical knowledge from a large corpus in a semi- or unsupervised manner.

2. Kinds of Lexical Knowledge

Kinds of Knowledge (1/3)
Named entity recognition:
- Large sets of named entities (e.g. 200 kinds)
  - Product names (I Can't Believe It's Not Butter)
  - Event names (the Cardiff Singer of the World competition)
- Ambiguity between different classes
  (Toyota: person, company, city, product, award)
- Coreference / name aliases
  - Noun coreference: "highest-ranking law enforcement official" = "Interior Minister"
  - Name aliases: Japan / Tokyo; Hafez Al-Assad / Hafez Assad / Hafez al-Assad / Hafez el-Assad
- Attributes of names
  - Sports team: players, league, team color, mascot, ...

Kinds of Knowledge (2/3)
- Patterns and meanings: "was ambushed" (for an attack event)
- Paraphrases: "gave his life to" = "was killed at"
- WordNet

Kinds of Knowledge (3/3)
- WSD: "PERSON1 attacked ..."
  - PERSON1 is a politician => verbal attack
  - PERSON1 is a robber => physical attack
- WSD by domain:
  - "He hits a victim" (attack)
  - "He hits a ball" (baseball)

3. General Technologies

1. Distributional Similarity
Harris's distributional hypothesis:
  Harris, Z. 1985. Distributional structure. In: Katz, J. J. (ed.) The Philosophy of Linguistics. New York: Oxford University Press. pp. 26-47.
"Words that occur in the same contexts tend to be similar."

NSE examples:

  24  President Clinton said yesterday
  12  President Bush said yesterday
   7  President Mandela said yesterday
   6  President Ramos said yesterday
   4  President Moi said yesterday
   3  President Chen said yesterday
   2  President , said yesterday
   2  President Reagan said yesterday
   2  President Arroyo said yesterday
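Harris's hypothesis is usually operationalized by representing each word as a vector of context counts and comparing the vectors. A minimal sketch with toy counts (the numbers echo the slide's "President * said yesterday" examples but are illustrative, not real corpus frequencies):

```python
from collections import Counter
from math import sqrt

# (context, filler, count) triples; toy data, not real corpus frequencies.
observations = [
    ("President _ said yesterday", "Clinton", 24),
    ("President _ said yesterday", "Bush", 12),
    ("of President _ in", "Clinton", 10),
    ("of President _ in", "Bush", 8),
    ("the dog _ loudly", "barked", 15),
]

def context_vector(word):
    # A word is represented by the counts of the contexts it appears in.
    return Counter({ctx: n for ctx, w, n in observations if w == word})

def cosine(v1, v2):
    # Cosine similarity between two sparse count vectors.
    dot = sum(c * v2[k] for k, c in v1.items())
    norm = lambda v: sqrt(sum(c * c for c in v.values()))
    return dot / (norm(v1) * norm(v2)) if v1 and v2 else 0.0

print(cosine(context_vector("Clinton"), context_vector("Bush")))    # ~0.98
print(cosine(context_vector("Clinton"), context_vector("barked")))  # 0.0
```

Words sharing contexts (Clinton, Bush) come out highly similar; words with disjoint contexts get similarity 0.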

1. Distributional Similarity (Sekine WWW06)
- Query logs contain a lot of knowledge: discover knowledge about ENE (Extended Named Entity) categories.
- Query log (6 months, 1.5B queries); ENE dictionary (245K entries), 700 awards.
Sample queries:
  bathroom+showers+pictures
  miss+mundo
  owen+mulligan+tyrone
  news+press
  free+wedding+planner
  marc+arther+glen
  preadolescente+follando
  gokmenler+agricultural+machinery+co.+erotika+com
  academy+award+winner
Sample ENE dictionary entries:
  100+greatest+britons  AWARD
  100+worst+britons  AWARD
  aaass/orbis+books+prize+for+polish+studies  AWARD
  abel+prize  AWARD
  academy+award  AWARD
  academy+awards  AWARD
  acm+turing+award  AWARD
  agatha+award  AWARD
  agatha+awards  AWARD
  air+medal  AWARD

Matrix: names & their contexts
- Problem 2 (noise): some entries in the ENE dictionary are noise; I want to get rid of them. Populate the list.

- Problem 1 (important contexts): some contexts are very general; I want to see the important contexts for this category.
- Problem 3 (coverage): the list may not have enough coverage; I want to populate the ENE dictionary.

Find Important Contexts (Problem 1)
- Important contexts for a category must occur more within the category than on average.
- Score = number of co-occurring names, normalized by the frequency of the context in the entire query log.
Results:
  ACADEMIC: #+syllabus, degrees+in+#, phd+in+#, masters+in+#, diploma+in+#, #+lecture+notes, masters+degree+in+#, courses+in+, ...
  AWARD: #+winners, #+nominees, #+nominations, #+winner, #+award, who+won+#, winners+of+#, list+of+#+winners, winners+of+the+#

Delete Noise & Find New Entities
- Delete noise (Problem 2): entities that do not co-occur with important contexts are noise.
  BOOK: it, porno, space, night, we, working, candy, wheels, foundation, jazz, ghost, couples, the bible, giant, daddy, creation
- Find new entities (Problem 3): words that co-occur with important contexts are entities.
  AWARD: golden+globes, grammys, golden+globe, kentucky+derby, daytime+emmy, sag, sag+awards, american+idol, daytime+emmys

1. Distributional Similarity (Lin and Pantel KDD2001)
DIRT - Discovery of Inference Rules from Text
Try to find that the following two mean the same:
  "X finds a solution to Y"  ~  "X solves Y"
Observation (slot fillers):

  X finds a solution to Y          X solves Y
  Slot X          Slot Y           Slot X       Slot Y
  communication   strike           committee    problem
  government      problem          he           mystery
  government      crisis           government   problem
  committee       crisis           clout        crisis

1. Distributional Similarity (Lin and Pantel KDD2001)
DIRT - Discovery of Inference Rules from Text

- A large corpus (SUSANNE corpus / 1G of news text) is analyzed by a dependency parser (Minipar).
- Extract a certain form of paths (links) and the two words at their ends:
    N:subj:V <- find -> V:obj:N -> solution -> N:to:N   ("X finds solution to Y")
- Find similarities between the links: count frequencies of the SlotX and SlotY fillers, and use mutual information to compute similarity:
    S(p1, p2) = sqrt( sim(SlotX1, SlotX2) x sim(SlotY1, SlotY2) )
- Top 20 most similar paths to "X solves Y" include:
    Y is solved by X; X finds a solution to Y; X resolves Y; X tries to solve Y; X deals with Y; Y is resolved by X; ...
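The path-similarity formula above can be sketched in a few lines. The mutual-information tables below are invented for illustration (Lin & Pantel estimate them from parsed-corpus slot-filler counts), and the Lin-style slot similarity used here is one common instantiation, not the only possible one:

```python
from math import sqrt

# Invented MI values of slot fillers for two paths; illustrative only.
mi = {
    ("solves", "X"): {"committee": 2.1, "government": 1.8, "he": 0.5},
    ("finds_solution", "X"): {"committee": 1.9, "government": 1.7},
    ("solves", "Y"): {"problem": 3.0, "crisis": 2.2},
    ("finds_solution", "Y"): {"problem": 2.8, "mystery": 1.1},
}

def slot_sim(s1, s2):
    # Lin-style similarity: MI mass of shared fillers over total MI mass.
    shared = set(mi[s1]) & set(mi[s2])
    num = sum(mi[s1][w] + mi[s2][w] for w in shared)
    den = sum(mi[s1].values()) + sum(mi[s2].values())
    return num / den

def path_sim(p1, p2):
    # Geometric mean over the X and Y slots, as on the slide:
    # S(p1, p2) = sqrt( sim(SlotX1, SlotX2) x sim(SlotY1, SlotY2) )
    return sqrt(slot_sim((p1, "X"), (p2, "X")) * slot_sim((p1, "Y"), (p2, "Y")))

print(round(path_sim("solves", "finds_solution"), 3))  # 0.773
```

Paths whose slots are filled by the same words (with high MI) score close to 1; unrelated paths score near 0.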

Pointwise Mutual Information
A measure of association:
    PMI(x, y) = log( p(x, y) / ( p(x) p(y) ) )
where p(x, y) is the joint probability of x and y, and p(x), p(y) are their marginal probabilities.
Examples (N = 1.9B; the SI values below are consistent with a base-10 logarithm):
  x = cute,   y = and:   f(x) = 6549, f(y) = 34925113, f(x,y) = 587  ->  SI(cute, and) = 0.585
  x = cute,   y = baby:  f(x) = 6549, f(y) = 79491,    f(x,y) = 35   ->  SI(cute, baby) = 2.106
  x = Barack, y = Obama: f(x) = 85,   f(y) = 184,      f(x,y) = 60   ->  SI(Barack, Obama) = 6.84

1. Distributional Similarity (Lin and Pantel KDD2001)
Evaluation on TREC questions:
- Top 40 candidates: 35-93% accuracy
- Comparison to manually created paraphrases:

    Path              Manual  DIRT  Int.  Accuracy
    X is author of Y       7    21     2     52.5%
    X manufactures Y      13    37     4     92.5%
    X spend Y              7    16     2     40.0%
    X asks Y               2    23     0     57.5%

2. Lexico-Syntactic Patterns (NSE)
- Use unambiguous contexts to identify semantic relations.
- Hypernym-hyponym (Hearst 92): "X, Y and other C"; "C such as X, Y and Z"

- Relations between verbs (VerbOcean)
  - Happens-before: "Xed and eventually Yed"
  - Causal relation: "X caused Y"; "Y, because of X"
- Ideas for improvements: use syntactic structures; use multiple pieces of evidence.

2. Lexico-Syntactic Patterns (Hearst COLING92)
- Use unambiguous contexts to identify semantic relations.
- Such NP as {NP ,}* {(or|and)} NP
    "... works by such authors as Herrick, Goldsmith, and Shakespeare."
    => Hyponym(author, Herrick), Hyponym(author, Goldsmith), Hyponym(author, Shakespeare)

2. Lexico-Syntactic Patterns (Hearst COLING92)
Patterns are created by hand:
1. NP such as {NP ,}* {(and|or)} NP
2. Such NP as {NP ,}* {(or|and)} NP
3. NP {, NP}* or other NP
4. NP {, NP}* {,} and other NP
5. NP {,} including {NP ,}* {or|and} NP
6. NP {,} especially {NP ,}* {or|and} NP
Evaluation: from an 8.6M-word encyclopedia corpus, 152 matches to the patterns; 61 out of 106 feasible relations were found in WordNet.
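Pattern 2 can be sketched as a surface regular expression. This is only a rough token-level approximation; a real implementation (including Hearst's) matches noun phrases identified by a parser or chunker, not raw words:

```python
import re

# Surface-regex sketch of "Such NP as {NP ,}* {(or|and)} NP".
text = "... works by such authors as Herrick, Goldsmith, and Shakespeare."

m = re.search(r"such (\w+) as ((?:\w+, )*(?:and |or )?\w+)", text)
hypernym = m.group(1)                 # the class noun ("authors")
# Split the coordinated list on commas and "and"/"or".
hyponyms = [w for w in re.split(r",\s*(?:and\s+|or\s+)?", m.group(2)) if w]

print(hypernym, hyponyms)  # authors ['Herrick', 'Goldsmith', 'Shakespeare']
```

Each extracted pair becomes a Hyponym(hypernym, hyponym) relation candidate.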

2. Lexico-Syntactic Patterns: Errors (NSE)
Query: * and other cities
  156  , and other cities
  139  Moscow and other cities
  120  Baghdad and other cities
  115  York and other cities
  112  Jakarta and other cities
   90  Paris and other cities
   83  capital and other cities
   65  Beijing and other cities
    3  China and other cities

Error examples:
  "Osaka is competing with Beijing, China and other cities for the right to host the games."
  "Hong Kong is also working closely with the tourism promotion authorities in South China and other cities to enhance Hong Kong's position as the gateway to the many attractions in the inland areas."
Fix: use other patterns; e.g. "cities such as China" has 0 frequency.

2. Lexico-Syntactic Patterns (Chklovski and Pantel EMNLP04)
Relations between verbs (VerbOcean):
1. Similarity (produce::create): "X i.e. Y", "Xed and Yed"
2. Strength (wound::kill): "X even Y", "X or at least Y", "not only Xed but Yed"
3. Antonymy (open::close): "either X or Y", "whether to X or Y", "to X * but Y"
4. Enablement (fight::win): "Xed * by Ying the", "to X * by Ying or"

5. Happens-before (buy::own, marry::divorce): "to X and then Y", "Xed and eventually Yed"

Scoring (Chklovski and Pantel EMNLP04) (NSE):
- Sp(V1, V2) = P(V1, p, V2) / ( P(p) P(V1) P(V2) ), estimated from Google hit counts
  - V1, V2: a pair of verbs; p: a pattern (such as "and eventually")
- Additional test for asymmetric relations (e.g. happens-before): Sp(V1, V2) / Sp(V2, V1) > 5
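The score and the asymmetry test can be sketched directly. The hit counts below are invented for illustration (the paper estimates these probabilities from Google hits; nothing here is real data):

```python
# VerbOcean-style pattern scoring with invented web-hit counts.
N = 1e9  # assumed total count used for normalization (illustrative)

hits = {
    ("marry", "and eventually", "divorce"): 500,
    ("divorce", "and eventually", "marry"): 20,
    "and eventually": 2e6,
    "marry": 1e6,
    "divorce": 5e5,
}

def score(v1, p, v2, n=N):
    # Sp(V1, V2) = P(V1, p, V2) / ( P(p) P(V1) P(V2) )
    joint = hits.get((v1, p, v2), 0) / n
    return joint / ((hits[p] / n) * (hits[v1] / n) * (hits[v2] / n))

s_fwd = score("marry", "and eventually", "divorce")
s_bwd = score("divorce", "and eventually", "marry")
print(s_fwd / s_bwd > 5)  # asymmetry test for happens-before: True
```

With these counts the forward direction ("marry ... divorce") scores 25 times higher than the reverse, passing the > 5 asymmetry threshold.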

Results: 29,165 associated verb pairs extracted; human-judged accuracy = 65.5%.

3. Rewrite and Verify (Nakov & Hearst 05) (NSE)
Check whether an alternative (rewritten) expression exists: NP structure analysis.
- Bracketing: (brain stem) cell OR brain (stem cell)?
  - Preposition paraphrases:
      "stem cells in the brain" -> right
      "stem cells from the brain" -> right
      "cells from the brain stem" -> left*
  - Copula paraphrases: "unmarked police car"
      "police car was unmarked" -> left
      "car was unmarked police" -> right*

3. Rewrite and Verify (Bergsma & Lin ACL-HLT08) (NSE)
Check whether an alternative (rewritten) expression exists: non-referential pronoun recognition.
- "make it in advance" (2) -> "make them in advance" (4): referential
- "make it in Hollywood" (24) -> "make them in Hollywood"* (0): non-referential
Evaluation: 86.2% accuracy (70% baseline).
Slides at http://www.cs.ualberta.ca/~bergsma/Pubs/Presentations/ACL2008.bergsma.pdf
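The rewrite-and-verify test for "it" can be sketched as follows. The counts are the slide's illustrative web frequencies, hard-coded here (a real system would query an n-gram corpus), and the single-heuristic decision rule is a simplification of the paper's method:

```python
# Web frequencies from the slide, used as a stand-in for n-gram lookups.
counts = {
    "make it in advance": 2,    "make them in advance": 4,
    "make it in Hollywood": 24, "make them in Hollywood": 0,
}

def likely_referential(phrase):
    # If the "them" rewrite is attested, "it" probably has a real
    # antecedent; if the rewrite never occurs, "it" is probably
    # non-referential (idiomatic).
    rewritten = phrase.replace(" it ", " them ")
    return counts.get(rewritten, 0) > 0

print(likely_referential("make it in advance"))    # True  (referential)
print(likely_referential("make it in Hollywood"))  # False (idiomatic "it")
```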

4. Bootstrapping
Get more of the same kind.
- Set of names (e.g. presidents): Clinton, Bush, Putin, Chirac
- They must share something: the same contexts in texts
    President * said yesterday
    of President * in
    President , * the
    President , * who
- The contexts may be shared by other presidents.

4. Bootstrapping
Seeds: Clinton, Bush, Putin, Chirac

- Use the Ngram Search Engine to find contexts:
    President * said yesterday
    of President * in
    President , * the
    President , * who
- Use the Ngram Search Engine to find targets:
    Boris Yeltsin, Jiang Zemin, Bill Clinton, Saddam Hussein
- Further contexts found from the new targets:
    Russian President * ,
    President , * ,
    U.S. President * ,
    by President * .

4. Bootstrapping
- Targets and contexts grow incrementally.
- The user gives several seeds as the initial sample.
- The acquired data is used to acquire new data.
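The contexts-to-targets loop can be sketched end to end. A toy corpus stands in for the Ngram Search Engine calls, the function names are our own (not the workshop program's API), and the reliability scoring described on the surrounding slides is omitted:

```python
# Toy (left context, filler, right context) triples replacing ngram search.
corpus = [
    ("President", "Clinton", "said yesterday"),
    ("President", "Bush", "said yesterday"),
    ("President", "Yeltsin", "said yesterday"),
    ("of President", "Putin", "in"),
    ("of President", "Chirac", "in"),
    ("of President", "Yeltsin", "in"),
]

def find_contexts(targets):
    # Contexts in which any known target occurs.
    return {(l, r) for l, t, r in corpus if t in targets}

def find_targets(contexts):
    # Fillers that occur in any known context.
    return {t for l, t, r in corpus if (l, r) in contexts}

def bootstrap(seeds, iterations=2):
    targets = set(seeds)
    for _ in range(iterations):
        contexts = find_contexts(targets)   # seeds -> shared contexts
        targets |= find_targets(contexts)   # contexts -> new targets
    return targets

print(sorted(bootstrap({"Clinton", "Putin"})))
```

Starting from two seeds, the loop recovers Bush, Chirac and Yeltsin through shared contexts.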

- Each iteration adds only a small number of reliable items.
- A combination of distributional similarity and lexico-syntactic patterns:
  - Similar contexts give similar targets; similar targets give similar contexts.
  - Some reliable contexts give a set of similar items.
- Use two independent clues: the set of presidents and the set of contexts for presidents.

4. Bootstrapping (Ravichandran and Hovy ACL2002)
1. Select a pair {Mozart, 1756} and search the web.
2. Find frequent patterns which contain the pair:
     "Mozart (1756", "Mozart was born in 1756", ...
3. Generalize the patterns:

     "(", "was born in"
4. Find more examples; go back to step 2.
Use the patterns to find the BY (birth year) of a given person (Q&A).
Q&A performance (MRR): birthday 0.69, inventor 0.58, discoverer 0.88.

4. Tool: Ngram Search Engine

Discovery Tool: Challenges
- Most discovery methods involve searching for patterns in a large corpus.
- Size is the problem: it takes a long time and needs machine power.
Possible solutions:
- Use a commercial search engine (Google, Yahoo!, Live; via API): access is limited and context is limited.
- Use an open search engine for NLP: TSUBAKI (Shinzato et al. 08).
- Use a resource-poor search engine:

  only local context patterns are needed: the Ngram Search Engine (Sekine 08).

Discovery Tool: Ngram Search Engine (NSE)
- Searches ngrams with arbitrary wildcards.
- 1B 7-grams from 1.9B words, about 86 years' worth of newspaper text (no frequency cutoff).
- Also outputs KWIC and the original sentences.
- Search in 0.02 sec on a single-CPU PC-Linux machine with 4GB memory; 2.7TB index.
- Demo query examples:
    can not * * because of
    from * to * via *
    a * phone
    Mr. * said
    used * * to discover *
- Copyright issue: Wikipedia
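The wildcard queries above can be emulated over a toy ngram table; this is only a sketch of the query semantics, assuming each `*` matches exactly one token (the real engine's index structure and wildcard behavior are its own):

```python
import re

# Toy ngram-frequency table standing in for the 2.7TB index.
ngrams = {
    "President Clinton said yesterday": 24,
    "President Bush said yesterday": 12,
    "President Mandela said yesterday": 7,
    "Prime Minister Blair said yesterday": 9,
}

def search(query):
    # Compile the wildcard query into a regex: '*' -> one token.
    pattern = re.compile(
        "^" + r"\s".join(re.escape(tok) if tok != "*" else r"\S+"
                         for tok in query.split()) + "$")
    # Return matches sorted by frequency, highest first.
    return sorted(((n, g) for g, n in ngrams.items() if pattern.match(g)),
                  reverse=True)

for freq, gram in search("President * said yesterday"):
    print(freq, gram)
```

The query "President * said yesterday" matches the three President ngrams and skips the Prime Minister one.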

Statistics

                    Google   News86   Wikipedia
  # of tokens       1T?      1.9B     1.7B
  Freq. threshold   30       0        0
  Max N in ngram    5        7        7
  # of 1-grams      14M      5M       8M
  # of 2-grams      314M     76M      93M
  # of 3-grams      977M     351M     377M
  # of 4-grams      1,314M   760M     733M
  # of 5-grams      1,176M   1,109M   1,006M
  # of 6-grams      -        1,330M   1,173M
  # of 7-grams      -        1,449M   1,266M

Discovery Tool: Advanced Pattern Search Engine (NSE)
- Includes literals, POS, NE and semantic categories:
    and Accident in a
    give|gives|gave PRP$ life|lives to *
    * attended * in
- This is one of the tasks of our project.

5. Task of the Afternoon Project

Homepage for the task: http://nlp.cs.nyu.edu/sekine/JHU2009

Task of the Class Project
- "Get Sets" (similar functionality to Google Sets).
- With 3-5 seeds of a kind (Clinton, Bush, Putin, Chirac), find more names of presidents,

  using a large corpus and the Ngram Search Engine, by bootstrapping.

Bootstrapping (recap)
Seeds: Clinton, Bush, Putin, Chirac
- Use the Ngram Search Engine to find contexts:
    President * said yesterday
    of President * in
    President , * the
    President , * who
- Use the Ngram Search Engine to find targets:
    Boris Yeltsin, Jiang Zemin, Bill Clinton, Saddam Hussein

- Further contexts:
    Russian President * ,
    President , * ,
    U.S. President * ,
    by President * .

Sample Output
### TARGETS at 0 ###
Clinton
Chirac
Bush
Putin
### CONTEXTS at 1 ###
of President ,     220.895737348797
to President ,     200.381463415431
President , the    136.908340622199
President , in     100.538526776594
, President ,       98.2705735567495
### TARGETS at 1 ###
Al Gore            71.7815596497228
Boris Yeltsin      58.5252737620003
Bill Clinton       49.6596526544608
Jiang Zemin        43.6024774532346
Saddam Hussein     39.6773013894499

### CONTEXTS at 2 ###
Russian President ,        373.296574897326
President , who            225.443748458226
Vice President , the       156.253990355938
President , and            136.43082198116
Russian President , who    122.882295594623
### TARGETS at 2 ###
Vladimir Putin             188.792862073439
Jacques Chirac              84.9353249063858

Project Structure
SET UP
- 6 teams (4-5 people in each team): a competition!
- A sample program will be provided; you will read the program for 15 minutes, then I will explain it.
IMPROVEMENT
- You will improve the program for 2 hours. Stop implementation at 4:30pm!
REPORT & EVALUATION
- Each team reports its experiment using a set of new seeds (send me your output: targets and contexts at each iteration).
- Three seeds of a new category (a type of people) will be given (you will be allowed to add two words of any type to the new seeds).
- You will run your program ONCE using the new seeds.
- Report the result (top 50), evaluate (by another team), and find the winner!

Sample Program
- 500 lines of Perl.
- 4 main subroutines: get contexts, score contexts; get targets, score targets.
- 3 auxiliary subroutines: initialize the search engine, run the search engine (get frequency or matches).
- Available at http://nlp.cs.nyu.edu/sekine/JHU2009/set.pl
- Sample output:
    http://nlp.cs.nyu.edu/sekine/JHU2009/out.txt
    http://nlp.cs.nyu.edu/sekine/JHU2009/out_short.txt

Bibliography - 0

- NSF-sponsored symposium: Semantic Knowledge Discovery, Organization and Use. November 14-15, at New York University. http://nlp.cs.nyu.edu/sk-symposium (videos at http://nlp.cs.nyu.edu/sk-symposium/video/)
- Dekang Lin, Ken Church, Satoshi Sekine. Johns Hopkins University, CLSP summer workshop: Unsupervised Acquisition of Lexical Knowledge from N-Grams. http://www.clsp.jhu.edu/workshops/ws09/
- Corpus-based Knowledge Engineering. Satoshi Sekine, Kentaro Inui, Kentaro Torisawa. Annual Meeting of the Japanese NLP conference (2007) (in Japanese)
- Ken's pragmatic tutorial on large-corpus processing: http://people.sslmit.unibo.it/~baroni/compling04/UnixforPoets.pdf
- INTROP, ANC, FLaReNet, ALAGIN
Thanks: Dekang Lin, Ken Church, Heng Li, David Yarowsky, Emily Pitler, Shane Bergsma, Kentaro Inui, Kentaro Torisawa, Ellen Riloff, Ralph Grishman

THANK YOU!

Bibliography - 1
- Shuya Abe, Kentaro Inui and Yuji Matsumoto. Acquiring Event Relation Knowledge by Learning Cooccurrence Patterns and Fertilizing Cooccurrence Samples with Verbal Nouns. In Proceedings of the 3rd International Joint Conference on Natural Language Processing, pp. 497-504, Jan. 2008.
- M. Ando, S. Sekine, S. Ishizaki. Automatic Extraction of Hyponyms from Japanese Newspapers. LREC 04.
- Barzilay, Regina and McKeown, Kathleen. 2001. Extracting paraphrases from a parallel corpus. In Proc. 39th Annual Meeting of the Association for Computational Linguistics (ACL-EACL 01), pp. 50-57.
- R. Barzilay, L. Lee. Learning to paraphrase: an unsupervised approach using multiple-sequence alignment. HLT-NAACL 03.
- S. Bergsma, Dekang Lin and Randy Goebel. Distributional Identification of Non-Referential Pronouns. ACL 08.
- Berland, M. and E. Charniak. 1999. Finding parts in very large corpora. In Proceedings of ACL-1999, pp. 57-64. College Park, MD.
- S. Brin. Extracting patterns and relations from the world wide web. WebDB 98.
- Bill MacCartney, Christopher D. Manning. Natural Logic for Textual Inference. Workshop on Textual Entailment and Paraphrasing 07.
- E. Charniak and M. Berland. 1999. Finding parts in very large corpora. In Proceedings of the 37th Annual Meeting of the ACL, pp. 57-64.
- T. Chklovski, P. Pantel. VerbOcean: Mining the web for fine-grained semantic verb relations. EMNLP 04.
- K. Church, P. Hanks. Word Association Norms, Mutual Information, and Lexicography. Computational Linguistics journal, 90.
- Collins, Michael and Singer, Y. (1999). Unsupervised Models for Named Entity Classification. Proc. of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora.
- Dmitry Davidov, Ari Rappoport, Moshe Koppel. Fully Unsupervised Discovery of Concept-Specific Relationships by Web Mining. ACL 07.
- Doug Downey, Stefan Schoenmackers, and Oren Etzioni. Sparse Information Extraction: Unsupervised Language Models to the Rescue. ACL 2007.
- Oren Etzioni, Michael Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, and Alexander Yates. Comprehensive Overview of KnowItAll. Artificial Intelligence, 2005.

Bibliography - 2

- Fabian M. Suchanek, Gjergji Kasneci and Gerhard Weikum. "YAGO - A Large Ontology from Wikipedia and WordNet". Elsevier Journal of Web Semantics, Vol. 6, No. 3 (2008), pp. 203-217.
- Fellbaum, C. 1998. WordNet: An Electronic Lexical Database. MIT Press.
- Girju, R.; Badulescu, A.; and Moldovan, D. 2003. Learning semantic constraints for the automatic discovery of part-whole relations. In Proceedings of HLT/NAACL-03, pp. 80-87. Edmonton, Canada.
- Roxana Girju, Adriana Badulescu, and Dan Moldovan. Automatic discovery of part-whole relations. Computational Linguistics, Vol. 32, No. 1, pp. 83-135, 2006.
- R. Girju and M. Moldovan. 2002. Text mining for causal relations. In Proceedings of the FLAIRS Conference, pp. 360-364.
- Hasegawa, S. Sekine, R. Grishman. Discovering Relations among Named Entities from Large Corpora. ACL 04.
- Hearst, M. 1992. Automatic acquisition of hyponyms from large text corpora. In COLING-92, pp. 539-545. Nantes, France.
- Idan Szpektor and Ido Dagan. 2008. Learning Entailment Rules for Unary Templates. In Proceedings of COLING 2008.
- Idan Szpektor, Ido Dagan, Roy Bar-Haim and Jacob Goldberger. 2008. Contextual Preferences. In Proceedings of ACL 2008.
- T. Inui, K. Inui, Y. Matsumoto. Acquiring causal knowledge from text using the connective marker "tame". TALIP 05.
- D. Lin, P. Pantel. DIRT - discovery of inference rules from text. ACM SIGKDD 01.
- Dekang Lin and Patrick Pantel. 2001. Discovery of inference rules for question answering. Natural Language Engineering, 7(4):343-360.

Bibliography - 3
- C. Manning, H. Schutze. Foundations of Statistical Natural Language Processing. The MIT Press, 01.
- Dan Moldovan, Adriana Badulescu, Marta Tatu, Daniel Antohe and Roxana Girju. Models for the Semantic Classification of Noun Phrases. Workshop on Computational Lexical Semantics.
- Nakov and Hearst. Using the Web as an Implicit Training Set: Application to Structural Ambiguity Resolution. HLT/EMNLP 05.
- Patrick Pantel, Dekang Lin. Discovering Word Senses from Text. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp. 613-619.
- Pantel, P.; Ravichandran, D.; Hovy, E.H. 2004. Towards terascale knowledge acquisition. In Proceedings of COLING-04, pp. 771-777. Geneva, Switzerland.
- Patrick Pantel and Marco Pennacchiotti. 2006. Espresso: Leveraging Generic Patterns for Automatically Harvesting Semantic Relations. COLING-ACL 2006.
- Ravichandran, D. and Hovy, E.H. 2002. Learning surface text patterns for a question answering system. In Proceedings of ACL-2002, pp. 41-47. Philadelphia, PA.
- Riloff, E. and Shepherd, J. 1997. A corpus-based approach for building semantic lexicons. In Proceedings of EMNLP-1997.
- M. Saito, K. Yamamoto, S. Sekine. Using Phrasal Patterns to Identify Discourse Relations. HLT-NAACL 06.

- Hiroki Sakaji, Satoshi Sekine, Shigeru Masuyama. Extracting Causal Knowledge Using Clue Phrases and Syntactic Patterns. The 7th International Conference on Practical Aspects of Knowledge Management, 2008.

Bibliography - 4
- S. Sekine. A Linguistic Knowledge Discovery Tool. COLING 08.
- S. Sekine. Extended Named Entity Ontology with Attribute Information. LREC 08.
- S. Sekine. On-Demand Information Extraction. COLING-ACL 06.
- S. Sekine. Automatic Paraphrase Discovery based on Context and Keywords between NE Pairs. IWP 05.
- Satoshi Sekine, Sofia Ananiadou, Jeremy Carroll and Jun-ichi Tsujii. Linguistic Knowledge Generation. COLING-92.
- Satoshi Sekine, Jeremy Carroll, Sofia Ananiadou, Jun-ichi Tsujii. Automatic Learning for Semantic Collocation. ANLP-92.
- Y. Shinyama, S. Sekine. Paraphrase Acquisition for Information Extraction. IWP 03.
- Shinyama, Yusuke and Sekine, S. (2004). Named Entity Discovery Using Comparable News Articles. Proc. International Conference on Computational Linguistics.
- Keiji Shinzato, Tomohide Shibata, Daisuke Kawahara, Chikara Hashimoto and Sadao Kurohashi. TSUBAKI: An Open Search Engine Infrastructure for Developing New Information Access Methodology. In Proceedings of IJCNLP 2008, pp. 189-196 (2008.1).
- Rion Snow, Daniel Jurafsky and Andrew Y. Ng. Semantic Taxonomy Induction from Heterogeneous Evidence. COLING/ACL 2006.
- K. Sudo, S. Sekine, R. Grishman. An Improved Extraction Pattern Representation Model for Automatic IE Pattern Acquisition. ACL-03.

Bibliography - 5
- K. Tokunaga, J. Kazama, K. Torisawa. Automatic Discovery of Attribute Words from Web Documents. IJCNLP 05.
- Kentaro Torisawa. Automatic Acquisition of Expressions Representing Preparation and Utilization of an Object. In Proceedings of Recent Advances in Natural Language Processing (RANLP 05), pp. 556-560, Borovets, Bulgaria, Sept. 2005.
- K. Torisawa. Acquiring Inference Rules with Temporal Constraints by Using Japanese Coordinated Sentences and Noun-Verb Co-occurrences. HLT-NAACL 06.
- Yangarber, Roman; Lin, W. and Grishman, R. (2002). Unsupervised Learning of Generalized Names. Proc. of International Conference on Computational Linguistics.
- Minoru Yoshida, Kentaro Torisawa and Jun'ichi Tsujii. Extracting Attributes and Their Values from Web Pages. Chapter in Web Document Analysis: Challenges and Opportunities, A. Antonacopoulos and Jianying Hu, editors, Series in Machine Perception and Artificial Intelligence, World Scientific, pp. 179-200.
- More at the following site (not very well maintained; help needed): http://nlp.cs.nyu.edu/sekine/CBKE/reference.html

The Problem (diagram)
Understanding by meaning: knowledge types (synonym, word hierarchy, names, paraphrase, script, WSD) support applications (dialogue, IR / search by meaning, IE, TE, summarization, Q&A).

Kinds of Semantic Knowledge
- Target knowledge: Thing, Name, Event
- Kinds of knowledge: same meaning, typology information, meronymy, description, relation, inter-relation

Semantic Knowledge Matrix

                        Same meaning  Typology                     Meronymy           Description  Relation
  Thing (noun)          Synonym       Hierarchy, hypernym-hyponym  Part-of            Adjective    (Many)
  Name (proper noun)    Name alias    Membership                   Loc-of, Family     Attribute    Attribute
  Event (verb)          Paraphrase    Troponymy                    Sub-event, Script  Adverb       Causal, Temporal

Inter-target knowledge

- Thing-Event: frames, information extraction patterns, telic and agentive roles (generative lexicon)
- Name-Event: information extraction results

Discovery Tool (Japanese): http://languagecraft.jp

We Need Semantic Knowledge!
Challenge: Semantic Knowledge (SK) is too diverse and vast to be created by a single academic institution.
Situation:

- It would take all my time until retirement to create all of this SK.
- Someone may have created the knowledge I need; someone may have created the knowledge you need; someone may look for the knowledge you created.
Solution: if all the knowledge created by NLP people were available, I could retire now, or even better, start from the point of retirement (extending my life span).

Community Effort!
I propose a community effort for semantic knowledge:
1) Discovery: 1-a) discovery tools, 1-b) knowledge discovery
2) Organization (open archive)
3) Use (evaluation events)

2) Organization: Create an Open Archive
- Objective: share (contribute and re-use) SK; it should not be just a list of links.
- Mechanism of rewards: easily record and display users' experiences, comments and appreciation (a.k.a. Web 2.0, e.g. SourceForge.net).
- It has to be carefully maintained: find, encourage and recruit SK; categorize and format it; make it easy to search and browse.
- Relation to LDC, ELRA.

3) Use

Conduct Evaluation Events (such as ACE, TREC, TAC, etc.)
The task would be something which uses SK: IE, Textual Entailment, Q&A, Dialogue?
Encourage people to use SK; promote contributions of SK

Collaborative Evaluation
Current (competitive) evaluations have limitations because of poor SK resources: quick-and-dirty systems, ML-based systems with bag-of-words features
Two ideas:
1. Participants start from a certain level using the SK archive
2. Common training/test data annotated by many tools

3) Use: Common training/test data
NANTC (general): 500M words (LDC2008T16, $300 for non-members)

BLLIP-parsed; add more and more information (NE, coreference, relation)
Wikipedia (http://nlp.cs.nyu.edu/wikipedia-data)
Use a part of this for training, development and test for some evaluations

[Example: multi-layer annotation of one sentence, with NE layers from different tools]
Word:        Canadian River Expeditions offers a change From July 1 to August
POS:         NNP NNP NNP VBZ DT NN IN NNP CD TO NNP
Chunk:       B-NP I-NP I-NP B-VP B-NP I-NP B-PP B-NP I-NP B-PP B-NP
NE (tool 1): B-GPE O O O O O O B-DATE I-DATE O B-DATE
NE (tool 2): B-ORG I-ORG I-ORG O O O O B-MON I-DAY O B-MON
NE (tool 3): B-NATIONALITY O O O O O B-DATE_RANGE I-DATE_RANGE I-DATE_RANGE I-DATE_RANGE I-DATE_RANGE
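The BIO tag sequences above mark entity boundaries token by token. A generic decoder (a sketch, not any specific tool's API; `bio_to_spans` is a name invented here) turns such a sequence into (label, start, end) spans:

```python
# Decode a BIO tag sequence into (label, start, end) entity spans.
def bio_to_spans(tags):
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):            # a new entity begins here
            if start is not None:
                spans.append((label, start, i))
            start, label = i, tag[2:]
        elif tag.startswith("I-") and label == tag[2:]:
            continue                        # current entity continues
        else:                               # "O", or an I- tag with a label mismatch
            if start is not None:
                spans.append((label, start, i))
            start, label = None, None
    if start is not None:                   # close an entity that ends the sentence
        spans.append((label, start, len(tags)))
    return spans

tokens = "Canadian River Expeditions offers a change From July 1 to August".split()
ne = "B-GPE O O O O O O B-DATE I-DATE O B-DATE".split()
spans = bio_to_spans(ne)
print([(lab, " ".join(tokens[s:e])) for lab, s, e in spans])
# → [('GPE', 'Canadian'), ('DATE', 'July 1'), ('DATE', 'August')]
```

Running the same decoder over each tool's layer makes the disagreements explicit: one tool reads "Canadian" as a GPE, another as the start of the ORG "Canadian River Expeditions", a third as a NATIONALITY.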
