Machine Translation

J&M Chapter 11
- Note that we’ll be splitting our coverage of this chapter between today and next Wednesday; for today, make sure to have read up to section 11.2, as well as sections 11.7, 11.8, and 11.9.
Papineni et al. (2002). BLEU: a method for automatic evaluation of machine translation. In Proc. ACL 2002.
Tatman (2019). Evaluating Text Output in NLP: BLEU at your own risk. Blog post, Jan. 15, 2019.
Reiter (2020). Why Do We Still Use 18-Year Old BLEU? Blog post, Mar. 2, 2020.
Anonymous. (1968). Mokusatsu: One Word, Two Lessons. NSA Technical Journal, 13(4), 95–100.

Here is a fun newsreel clip about the 1954 IBM-Georgetown system.
Warren Weaver’s “original memorandum from 1949” on machine translation is remarkable, and is worth reading for several reasons:
- It is incredibly well-written, both in terms of its organization and its rhetoric
  - (though do take note of a few minor passages that have not aged particularly well; this was science in the 1940s)
- He outlines the general shape of the rule-based MT methods that would be standard for years, and also clearly lays out their limitations
- He makes the link between Shannon’s noisy-channel model and the translation problem (about forty years before the computational and data resources made it practical to actually try)
- He also foresees that McCulloch & Pitts’s ideas would be crucial (about sixty years early)
- His idea of using adaptive context windows for word-sense disambiguation was spot-on
“Oh, yes, everything’s right on schedule, Fred”: Keynote from the “Twenty Years of Bitext” workshop at EMNLP 2013, by Peter Brown and Robert Mercer, on the history of early statistical machine translation.
- As is the case any time Robert Mercer comes up, it is important to be aware of the harmful ways in which he has chosen to use his fortune and position of power.

Eisenstein Chapter 18, section 2
K. Knight. 1999. Decoding complexity in word-replacement translation models. Computational Linguistics 25(4): 607-615. (Focus on pages 607-611.)
D. Chiang. 2007. Hierarchical phrase-based translation. Computational Linguistics 33(2): 201-228.

K. Knight. 1999. A Statistical MT Tutorial Workbook. August 1999.
- You may find this very helpful with the assignment, especially section 27!
  - Note that you’ll need to read more than just that section in order for it to be helpful… ;-D
Lopez, A. (2008). Statistical Machine Translation. ACM Comput. Surv., 40(3).
- Adam Lopez’s excellent overview of traditional statistical MT is very readable and useful.
Lectures 2, 3, and 4 from the 2015 iteration of Stanford’s CS224 are excellent resources.