  1. <html>
  2. <head>
  3. <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  4. <title>CSE 628 Introduction to Natural Language Processing Fall 2010</title>
  5. </head>
  6. <body>
  7. <h1><center>CSE 628 Introduction to Natural Language Processing</center></h1>
  8. <h2><center>Fall 2010</center></h2>
  9. <h3><center>Mon <font color="red">11:40am</font> - 12:40pm, Wed <font
  10. color="red">11:10am</font> - 12:40pm at [CS-2129]</center></h3>
  11. <br>
  12. <h4>Instructor:</h4>
  13. <ul type="circle">
  14. <a href="http://www.cs.stonybrook.edu/~ychoi/">Yejin Choi</a>&nbsp;(email: ychoi@cs,
  15. office:
  16. CS-1422, <font color="blue"><b>Office Hours: Mon/Wed 2:30pm - 3:30pm</b></font>) <br>
  17. </ul>
  18. <h4>Course Description:</h4>
  19. <ul type="circle">
  20. <li>This course provides a general introduction to Natural Language
  21. Processing (NLP), the study of computational systems that understand and/or
  22. generate human language.
  23. Approximately two thirds of the class will cover fundamental concepts and techniques in
  24. NLP, such as language models, statistical techniques for sequence tagging, parsing,
  25. information extraction, and information retrieval. The remaining one third of the class will introduce students to
  26. some of the more recent developments in NLP research, such as the use of constraint
  27. optimization techniques, and ideas for data acquisition without expert
  28. annotation. Throughout the class, students will be exposed to some of the
  29. recent research papers that connect the concepts learned from the textbook to
  30. exciting research questions that are practically motivated.
  31. <br>
  32. <b>Note:</b> The material covered in this class will not overlap with the companion class [CSE 507 Computational Linguistics] that
  33. will be offered in Spring 2011; this class concerns subjects that are more
  34. computationally oriented, while CSE 507 will focus more on linguistic
  35. theories and semantics.
  36. <br><br>
  37. <li><b>Textbook: </b>
  38. Jurafsky and Martin, <a href="http://www.cs.colorado.edu/~martin/slp2.html"> SPEECH and LANGUAGE PROCESSING: An Introduction to
  39. Natural Language Processing, Computational Linguistics, and Speech Recognition
  40. , Second Edition</a>, Prentice Hall, 2008.
  41. <br><br>Note that some copies of the textbook should be available on the reserve
  42. shelf of the North Reading Room
  43. (NRR) at the library.
  44. <br><br>
  45. <li><b>Prerequisites: </b>
  46. Familiarity with either Artificial Intelligence or Machine
  47. Learning is strongly recommended, but not strictly necessary.
  48. <br><br>
  49. <li><b>Grading: </b>
  50. Paper Presentation 10%,
  51. Quiz (for paper presentations) 10%,
  52. Homework 35%,
  53. Final Project 35%,
  54. Class Participation 10%
  55. <br><br>
  56. <li> Late submission: Each student may extend homework deadlines by up to 7 days
  57. in total over the semester without penalty (not 7 days for each assignment,
  58. but 7 days cumulatively for the entire semester). After that, 10% of the score will be
  59. subtracted for each late day. This policy is meant to encourage students to submit quality work
  60. rather than poorly composed work done in a hurry. For the purpose
  61. of counting late days, fractional values are rounded up: for
  62. instance, a submission that is 1 hour late counts as 1 day late.
  63. If the application of this rule is ambiguous in some situation, I reserve
  64. the right to apply it as I see appropriate. If in doubt, consult me
  65. before making your own assumptions.
  66. <br><br>
  67. <li> Assignments are posted to <a href="https://blackboard.stonybrook.edu/bin/common/course.pl?course_id=_321486_1">
  68. Blackboard </a>
  69. </ul>
  70. <h4><font color="black">Announcements:</font></h4>
  71. <ol>
  72. <li> The first class will be on Wednesday, Sep 1.
  73. <li> This class is open to <i>both</i> Ph.D. and master's students.
  74. <li> The class location has moved to CS 2129. Due
  75. to the availability of the room, the Monday classes will begin at 11:40am and end
  76. at 12:40pm. The Wednesday classes are 11:10am - 12:40pm.
  77. <li> <font color="black"> Homework-1 (paper reading assignment) will be due Sep 19th
  78. 6pm. (See <a href="https://blackboard.stonybrook.edu/bin/common/course.pl?course_id=_321486_1">
  79. Blackboard </a> for details) </font>
  80. <li> <font color="black"> Homework-2 (paper reading assignment) will be due Sep
  81. 28th (TUE) 11:59pm. (See <a href="https://blackboard.stonybrook.edu/bin/common/course.pl?course_id=_321486_1">
  82. Blackboard </a> for details) </font>
  83. <li> <font color="black"> Homework-3 (project plan) will be due Oct 12th (TUE) 11:59pm. (See <a href="https://blackboard.stonybrook.edu/bin/common/course.pl?course_id=_321486_1">
  84. Blackboard </a> for details) </font>
  85. <li> <font color="black"> Homework-4 (baseline implementation/data collection) will be due Oct
  86. 26th (TUE) 11:59pm. (See <a href="https://blackboard.stonybrook.edu/bin/common/course.pl?course_id=_321486_1">
  87. Blackboard </a> for details) </font>
  88. <li> <font color="black"> Notice that your late submission credit has been
  89. increased by 2 days! </font>
  90. <li> <font color="black"> Homework-5 submission (substantial implementation of your
  91. project and in-depth analysis and evaluation) will be due Nov
  92. 23rd (TUE) 11:59pm. (See <a href="https://blackboard.stonybrook.edu/bin/common/course.pl?course_id=_321486_1">
  93. Blackboard </a> for details) </font>
  94. <li> <b> <font color="red"> Homework-6 submission (Final Project Submission) will be due
  95. Dec 19th (SUN) 11:59pm. The submission guidelines are similar to those of the previous
  96. submission. </font> </b>
  97. </ol>
  98. <h4>Syllabus: (<i><small>subject to change depending on the students' backgrounds and
  99. interests</small></i>)</h4>
  100. <table style="text-align: left; width: 100%;" border="1" cellpadding="2" style="border-collapse: collapse"
  101. bordercolor="#808080"
  102. cellspacing="2">
  103. <tbody>
  104. <tr bgcolor="black">
  105. <td width="30"></td>
  106. <td width="80"><font size=4 color="white"><b>Date</b></font></td>
  107. <td><font size=4 color="white"><b>Topics</b></font></td>
  108. <td><font size=4 color="white"><b>References</b></font></td>
  109. </tr>
  110. <tr bgcolor="lightcyan"><td>01</td>
  111. <td>Wed 09/01</td>
  112. <td>Introduction <a href="./lecture/01-intro.pdf">[slides]</a></td>
  113. <td><ul type="circle">
  114. <li>J&M Chapter 1
  115. <li><a href="http://www.cs.cornell.edu/home/llee/papers/cstb.pdf">[pdf]</a>
  116. Lillian Lee, 2001. <i>I'm
  117. sorry Dave, I'm afraid I can't do that: Linguistics, Statistics,
  118. and Natural Language Processing circa 2001.</i> The National
  119. Academies' study on the Fundamentals of Computer Science
  120. </ul></td>
  121. </tr>
  122. <tr bgcolor="lightcyan"><td>02</td>
  123. <td>Mon 09/06</td>
  124. <td>No Class (Labor Day)</td>
  125. <td></td>
  126. </tr>
  127. <tr bgcolor="lightcyan"><td>03</td>
  128. <td>Wed 09/08</td>
  129. <td>Language Models <a href="./lecture/02-ngram.pdf">[slides]</a></td>
  130. <td>
  131. <ul type="circle"><li>J&M Chapter 4</ul>
  132. </td>
  133. </tr>
  134. <tr bgcolor="lightcyan"><td>04</td>
  135. <td>Mon 09/13</td>
  136. <td>Language Models & Information Theory <a href="./lecture/03-ngram.pdf">[slides]</a></td>
  137. <td>
  138. <ul type="circle"><li>J&M Chapter 4
  139. <li>
  140. <a href="http://www.aclweb.org/anthology/P/P05/P05-1065.pdf">[pdf]</a>
  141. Sarah E. Schwarm and Mari Ostendorf, 2005.
  142. <i>Reading level assessment using support vector machines and statistical
  143. language models.</i> ACL
  144. <li>
  145. <a href="http://research.microsoft.com/en-us/um/people/kevynct/pubs/hlt04_readability.pdf">[pdf]</a>
  147. Kevyn Collins-Thompson and Jamie Callan, 2004. <i> A language modeling approach to predicting
  148. reading difficulty.</i> NAACL
  149. <li>
  150. <a href="http://www.ldc.upenn.edu/acl/A/A00/A00-2032.pdf">[pdf]</a>
  151. Rie Kubota Ando and Lillian Lee, 2000.
  152. <i>Mostly-Unsupervised Statistical Segmentation of Japanese: Applications to
  153. Kanji.</i> NAACL
  154. </ul></td>
  155. </tr>
  156. <tr bgcolor="lightcyan"><td>05</td>
  157. <td>Wed 09/15</td>
  158. <td>Machine Learning Basics <a href="./lecture/04h.pdf">[slides]</a></td>
  159. <td><ul type="circle">
  160. </ul></td>
  161. </tr>
  162. <tr bgcolor="lightcyan"><td>06</td>
  163. <td>Mon 09/20</td>
  164. <td>Part-of-Speech Tagging & Sequence Tagging <a href="./lecture/05h.pdf">[slides]</a></td>
  165. <td>
  166. <ul type="circle"><li>J&M Chapter 5</ul>
  167. </td>
  168. </tr>
  169. <tr bgcolor="lightcyan"><td>07</td>
  170. <td>Wed 09/22</td>
  171. <td>Part-of-Speech Tagging & Sequence Tagging <a href="./lecture/05h.pdf">[slides]</a></td>
  172. <td>
  173. <ul type="circle"><li>J&M Chapter 5</ul>
  174. </td>
  175. </tr>
  176. <tr bgcolor="lightcyan"><td>08</td>
  177. <td>Mon 09/27</td>
  178. <td>Hidden Markov Models <a href="./lecture/06h.pdf">[slides]</a></td>
  179. <td>
  180. <ul type="circle"><li>J&M Chapter 6</ul>
  181. </td>
  182. </tr>
  183. <tr bgcolor="lightcyan"><td>09</td>
  184. <td>Wed 09/29</td>
  185. <td>Hidden Markov Models <a href="./lecture/06h.pdf">[slides]</a></td>
  186. <td>
  187. <ul type="circle"><li>J&M Chapter 6</ul>
  188. </td>
  189. </tr>
  190. <tr bgcolor="lightcyan"><td>10</td>
  191. <td>Mon 10/04</td>
  192. <td>2nd Assignment & Project Discussion</td>
  193. <td>
  194. <ul type="circle"><li>
  195. See Blackboard for slides
  196. </ul>
  197. </td>
  198. </tr>
  199. <tr bgcolor="lightcyan"><td>11</td>
  200. <td>Wed 10/06</td>
  201. <td>Project Meeting</td>
  202. <td></td>
  203. </tr>
  204. <tr bgcolor="lightcyan"><td>12</td>
  205. <td>Mon 10/11</td>
  206. <td>Recruiting Event</td>
  207. <td>
  208. </td>
  209. </tr>
  210. <tr bgcolor="lightcyan"><td>13</td>
  211. <td>Wed 10/13</td>
  212. <td>Maximum Entropy Models & Conditional Random Fields <a href="./lecture/07h.pdf">[slides]</a></td>
  213. <td>
  214. <ul type="circle">
  215. <li>J&M Chapter 6
  216. </ul>
  217. </td>
  218. </tr>
  219. <tr bgcolor="lightcyan"><td>14</td>
  220. <td>Mon 10/18</td>
  221. <td>Information Extraction <a href="./lecture/08h.pdf">[slides]</a></td>
  222. <td><ul type="circle">
  223. <li>J&M Chapter 22
  224. </ul>
  225. </td>
  226. </tr>
  227. <tr bgcolor="lightcyan"><td>15</td>
  228. <td>Wed 10/20</td>
  229. <td>Paper Presentations</td>
  230. <td>
  231. <ul type="circle">
  232. <li>Balanagireddy Mudiam to present [<i>Large-Scale Named Entity Disambiguation Based on
  233. Wikipedia Data </i> by Silviu Cucerzan]
  234. <a href="http://acl.ldc.upenn.edu/D/D07/D07-1074.pdf">[pdf]</a>
  235. <a href="./students/bala.pdf">[slides]</a>
  236. <li>Ruchita Sarawgi, Kailash GVS, Praveen V S to present
  237. [<i>Automatically profiling the Author of an Anonymous
  238. Text</i> by Shlomo Argamon, Moshe Koppel, James W. Pennebaker, and Jonathan Schler]
  239. <a href="http://www.cs.biu.ac.il/~koppel/papers/AuthorshipProfiling-cacm-final.pdf">[pdf]</a>
  240. <a href="./students/ruch.ppt">[slides]</a>
  241. </ul>
  242. </td>
  243. </tr>
  244. <tr bgcolor="lightcyan"><td>16</td>
  245. <td>Mon 10/25</td>
  246. <td>Context Free Grammars <a href="./lecture/09h.pdf">[slides]</a></td>
  247. <td>
  248. <ul type="circle"><li>J&M Chapter 12
  249. </ul>
  250. </td>
  251. </tr>
  252. <tr bgcolor="lightcyan"><td>17</td>
  253. <td>Wed 10/27</td>
  254. <td>Paper Presentations</td>
  255. <td>
  256. <ul type="circle">
  257. <li> Ritwik Banerjee to present [<i>A Non-negative Matrix Tri-factorization Approach to Sentiment Classification
  258. with Lexical Prior Knowledge</i> by Tao Li, Yi Zhang, and Vikas Sindhwani]
  259. <a href="http://www.aclweb.org/anthology/P/P09/P09-1028.pdf"> [pdf]</a>
  260. <a href="./students/ritw.pdf">[slides]</a>
  261. <li>Thomas J Condus to present [<i>A Hierarchical Approach to Encoding Medical Concepts for Clinical
  262. Notes</i> by Yitao Zhang] <a href="http://www.aclweb.org/anthology/P/P08/P08-3012.pdf">[pdf]</a>
  263. <a href="./students/thom.ppt">[slides]</a>
  264. </ul>
  265. </td>
  266. </tr>
  267. <tr bgcolor="lightcyan"><td>18</td>
  268. <td>Mon 11/01</td>
  269. <td>Parsing <a href="./lecture/10h.pdf">[slides]</a></td>
  270. <td>
  271. <ul type="circle"><li>J&M Chapter 13
  272. </ul>
  273. </td>
  274. </tr>
  275. <tr bgcolor="lightcyan"><td>19</td>
  276. <td>Wed 11/03</td>
  277. <td>Paper Presentations</td>
  278. <td>
  279. <ul type="circle">
  280. <li>
  281. Deepika Srinivasan, Annapurneshwari Kulkarni to present
  282. [<i>Annotating Named Entities in Twitter Data with Crowdsourcing</i> by
  283. Tim Finin, Will Murnane, Anand Karandikar, Nicholas Keller, Justin Martineau, and Mark Dredze]
  284. <a href="http://www.cs.jhu.edu/~mdredze/publications/amt_ner.pdf">[pdf]</a>
  285. <a href="./students/deep.ppt">[slides]</a>
  286. <li>Ishani Garg, Sumati Priya to present [<i>Robust Sentiment Detection on Twitter from Biased and Noisy
  287. Data</i> by Luciano Barbosa, and Junlan Feng]
  288. <a
  289. href="http://www2.research.att.com/~lbarbosa/publications/coling_2010.pdf">[pdf]</a>
  290. <a href="./students/isha.pdf">[slides]</a>
  291. <li>Mandeep Singh Grang, Sudheer Jetty to present
  292. [<i>NLP (Natural Language Processing) for NLP (Natural Language
  293. Programming)</i> by Rada Mihalcea, Hugo Liu, and Henry Lieberman] <a
  294. href="http://www.cse.unt.edu/~rada/papers/mihalcea.cicling06a.pdf">[pdf]</a>
  295. <a href="./students/mand.pdf">[slides]</a>
  296. </ul>
  297. </td>
  298. </tr>
  299. <tr bgcolor="lightcyan"><td>20</td>
  300. <td>Mon 11/08</td>
  301. <td>Paper Presentations</td>
  302. <td>
  303. <ul type="circle">
  304. <li>Rohith Menon, Goutham Bhat and Shruthi D to present
  305. [<i>Authorship Attribution Using Probabilistic Context-Free Grammars</i> by Sindhu Raghavan, Adriana Kovashka, and Raymond Mooney]
  306. <a href="http://www.aclweb.org/anthology/P/P10/P10-2008.pdf">[pdf]</a>
  307. </ul>
  308. </td>
  309. </tr>
  310. <tr bgcolor="lightcyan"><td>21</td>
  311. <td>Wed 11/10</td>
  312. <td>Paper Presentations</td>
  313. <td>
  314. <ul type="circle">
  315. <li>
  316. Manoj Harpalani, Sandesh Singh to present
  317. [<i>Learning within sentence semantic coherence</i> by Elena Eneva, Rose
  318. Hoberman, and Lucian Lita]
  319. <a href="http://www.aclweb.org/anthology/W/W01/W01-0503.pdf">[pdf]</a>
  320. <li>Chandrakanth Reddy B, Anand sagar Kothapalli to present
  321. [<i>An Approach for Combining Content-based and Collaborative Filters</i> by
  322. Qing Li, and Byeong Man Kim]
  323. <li>Harit Himanshu to present [<i>Thumbs up? Sentiment Classification using Machine Learning
  324. Techniques</i> by Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan]
  325. <a href="http://www.cs.cornell.edu/home/llee/papers/sentiment.pdf">[pdf]</a>
  326. </ul>
  327. </td>
  328. </tr>
  329. <tr bgcolor="lightcyan"><td>22</td>
  330. <td>Mon 11/15</td>
  331. <td>Paper Presentations <br> & <br> Statistical Parsing <a href="./lecture/11h.pdf">[slides]</a></td>
  332. <td>
  333. <ul type="circle">
  334. <li>Parag Kadu & Aneesh Ali to present [<i>Sentence Boundary Detection and the Problem with the
  335. U.S.</i> by Dan Gillick] <a
  336. href="http://www.aclweb.org/anthology/N/N09/N09-2061.pdf">[pdf]</a>
  337. <li>J&M Chapter 14
  338. </ul>
  339. </td>
  340. </tr>
  341. <tr bgcolor="lightcyan"><td>23</td>
  342. <td>Wed 11/17</td>
  343. <td>Paper Presentations</td>
  344. <td>
  345. <ul type="circle">
  346. <li>Vaibhav Shrivastava & Girish SK to present
  347. [<i>Hierarchical Document Categorization with Support Vector
  348. Machines</i> by Lijuan Cai and Thomas Hofmann.]
  349. <a href="http://sca2002.cs.brown.edu/people/th/papers/CaiHof-CIKM2004.pdf">[pdf]</a>
  350. <li>Amitha Cheluvagopal & Sharath Ravindran & Avinash Gupta Konda to
  351. present [<i>An Effective Two-Stage Model for Exploiting Non-Local Dependencies in Named
  352. Entity Recognition</i> by Vijay Krishnan and Christopher Manning]
  353. <a href="http://www.aclweb.org/anthology/P/P06/P06-1141.pdf">[pdf]</a>
  354. </ul>
  355. </td>
  356. </tr>
  357. <tr bgcolor="lightcyan"><td>24</td>
  358. <td>Mon 11/22</td>
  359. <td>Paper Presentations</td>
  360. <td>
  361. <ul type="circle"><li>Siming Li & Girish Kulkarni to present [<i>A Linear Programming Formulation for Global Inference
  362. in Natural Language Tasks</i> by Dan Roth, and Wen-tau Yih]
  363. <a href="http://l2r.cs.uiuc.edu/~danr/Papers/RothYi04a.pdf">[pdf]</a>
  364. <li>Longfei Xing & Supriya Vasudevan to present
  365. [<i>Graph-based Ranking Algorithms for Sentence Extraction,
  366. Applied to Text Summarization</i> by Rada Mihalcea]
  367. <a href="http://acl.ldc.upenn.edu/P/P04/P04-3020.pdf">[pdf]</a>
  368. </ul>
  369. </td>
  370. </tr>
  371. <tr bgcolor="lightcyan"><td>25</td>
  372. <td>Wed 11/24</td>
  373. <td>Correction Day - Follows a FRIDAY schedule </td>
  374. <td></td>
  375. </tr>
  376. <tr bgcolor="lightcyan"><td>26</td>
  377. <td>Mon 11/29</td>
  378. <td>Statistical Parsing <a href="./lecture/11h.pdf">[slides]</a></td>
  379. <td><ul type="circle">
  380. <li>J&M Chapter 14
  381. </ul>
  382. </td>
  383. </tr>
  384. <tr bgcolor="lightcyan"><td>27</td>
  385. <td>Wed 12/01</td>
  386. <td>Paper Presentations <br> & <br> Machine Translation <a
  387. href="./lecture/12h.pdf">[slides]</a></td>
  388. <td><ul type="circle">
  389. <li>Khusboo Agarwal & Piyush Kumat to present
  390. [<i>NaLIX: an Interactive Natural Language Interface for Querying XML</i> by
  391. Yunyao Li, Huahai Yang, and H. V. Jagadish]
  392. <a href="http://www-personal.umich.edu/~yunyaol/publication/130NaLIX.pdf">[pdf]</a>
  393. <li>J&M Chapter 25</ul>
  394. </td>
  395. </tr>
  396. <tr bgcolor="lightcyan"><td>28</td>
  397. <td>Mon 12/06</td>
  398. <td>Machine Translation <a href="./lecture/12h.pdf">[slides]</a></td>
  399. <td><ul type="circle">
  400. <li>J&M Chapter 25</ul>
  401. </td>
  402. </tr>
  403. <tr bgcolor="lightcyan"><td>29</td>
  404. <td>Wed 12/08</td>
  405. <td>Expectation Maximization <br> & <br> How to be successful in your
  406. future career</td>
  407. <td></td>
  408. </tr>
  409. </tbody>
  410. </table>
  411. <br>
  412. <br>
  413. <br>
  414. <br>
  415. <br>
  416. </body>
  417. </html>