% File: /examples/luatex/non-standard-hyphenation-german/non-standard-hyphenation-german.tex
% Repository: https://gitlab.com/sh2d/padrinoma
% Possible license(s): AGPL-3.0
% -*- coding: utf-8 -*-
\listfiles
\documentclass{article}
\usepackage{fontspec}
\usepackage{multicol}
\usepackage[main=english, german]{babel}
\usepackage{pdnm_nonstd-hyph-de-1901}

\begin{document}

\section{Automatic non-standard hyphenation with Lua\TeX}

This is an example application of the padrinoma package illustrating
automatic German non-standard hyphenation with Lua\TeX. No explicit
non-standard hyphenation mark-up has been used in the source document.

\subsection{German non-standard hyphenation rules}

Besides standard hyphenation, traditional German orthography has two
non-standard hyphenation rules that call for more complex operations
than the mere insertion of a hyphen at the end of a line:
\begin{description}
\item[\emph{ck} rule] When hyphenated, the letters \emph{ck} turn into
  \emph{k-k}:\par
  \begin{tabular*}{\linewidth}{l@{\extracolsep{\fill}}r}
    \emph{Dackel} vs. \emph{Dak-kel} & (dachshund)\\
  \end{tabular*}
\item[triple consonant rule] In compound words, of three identical
  adjacent consonants followed by a vowel, one consonant is dropped.
  For example, the words \emph{Schiff} and \emph{Fahrt} combine to
  the word \emph{Schiffahrt}. The second part of this rule says that
  a dropped consonant reappears when the word is hyphenated at that
  position:\par
  \begin{tabular*}{\linewidth}{l@{\extracolsep{\fill}}r}
    \emph{Schiffahrt} vs. \emph{Schiff-fahrt} & (shipping)\\
  \end{tabular*}
\end{description}
Note, however, that not every \emph{ck} is a valid hyphenation
position, and a double consonant followed by a vowel at a word boundary
only rarely results from the triple consonant rule:\strut\par
\noindent\begin{tabular*}{\linewidth}{l@{\extracolsep{\fill}}r}
  \emph{Steck-dose}, not Stek-kdose & (socket)\\
  \emph{Rohstoff-industrie}, not Rohstoff-findustrie & (extractive industry)\\
\end{tabular*}

\subsection{Handling non-standard hyphenation manually}

Traditional \TeX\ provides the \verb+\discretionary+ command for manual
handling of non-standard hyphenation. Using this command, the German
word \emph{Dackel} can be typed in as
\verb+Da\discretionary{k-}{}{c}kel+. The Babel and Polyglossia packages
provide shorthands so that one can also type \verb+Da"ckel+. The
disadvantage of both solutions is that, in languages demanding
non-standard hyphenation, authors need to apply the mark-up manually
throughout a text. This clutters the source document and, more
importantly, demands mental attention and distracts from the actual
task of writing.

With Lua\TeX, the situation has improved. Hyphenation exceptions may
now contain discretionaries, as in
\verb+\hyphenation{Da{k-}{}{c}kel}+, so that non-standard hyphenation
mark-up is no longer needed in the running text for words declared in
the preamble of a document. Still, only words maintained in such an
explicit list become subject to non-standard hyphenation.
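
The manual approaches described above can be compared side by side.
The following is only a minimal sketch: it assumes compilation with
Lua\LaTeX\ and Babel's \texttt{german} option as loaded in this
document. The discretionary for \emph{Schiffahrt} is not taken from any
package but derived from the triple consonant rule, and the width of
the \texttt{minipage} is an arbitrary choice that makes line breaks
within the words likely.
\begin{verbatim}
% Compile with lualatex.
\documentclass{article}
\usepackage[main=english, german]{babel}
\begin{document}
\begin{otherlanguage}{german}
% LuaTeX only: exception containing a discretionary, written as
% {pre-break}{post-break}{no-break}.
\hyphenation{Da{k-}{}{c}kel}
\begin{minipage}{3em}
Da\discretionary{k-}{}{c}kel     % explicit discretionary
Da"ckel                          % Babel shorthand
Dackel                           % no mark-up, relies on the exception
Schif\discretionary{f-}{f}{f}ahrt % triple consonant rule by hand
\end{minipage}
\end{otherlanguage}
\end{document}
\end{verbatim}
All three spellings of \emph{Dackel} are meant to give the same result.
They differ only in where the mark-up lives: in the running text or in
an exception list.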

\subsection{Automatic non-standard hyphenation}

The \texttt{padrinoma} package is an attempt to solve the
above-mentioned problems. The following sections contain a random
selection of German words and named entities (changing with every
compilation run) for which non-standard hyphenation may or may not be
desirable; it is applied automatically. The document source contains
neither mark-up nor hyphenation exceptions. Discretionaries are
inserted fully automatically at Lua\TeX's node level at positions
indicated by dedicated non-standard hyphenation patterns. Patterns are
read from the file
\begin{center}
\verb+examples/patterns/hyph-de-1901-nonstd.pat.txt+
\end{center}
A list of fully hyphenated words can be found in the \verb+log+ file.
For debugging purposes, a list of all words with only the non-standard
hyphenation patterns applied is written to a second file, whose name
is that of the pattern file with the extension \verb+.spots+ appended.
A hyphen indicates where non-standard hyphenation is to be applied
within a word, as in \verb+Dac-kel+ or \verb+Schif-fahrt+. Wrong
matches, such as \verb+Rebec-ka+ (non-standard hyphenation is
discouraged within names) or \verb+Schlep-panker+, are due to the still
imperfect non-standard hyphenation patterns, which are subject to
further improvement. If you find examples of wrong German non-standard
hyphenation with the current pattern set, please send them to
\verb+trennmuster@dante.de+.
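
A minimal sketch of a document relying on automatic non-standard
hyphenation might look as follows. It merely condenses the preamble and
penalty settings of this example. It assumes that the package file and
the pattern file named above can be found by Lua\TeX; the word
selection and the width of the \texttt{minipage} are arbitrary.
\begin{verbatim}
% Compile with lualatex.
\documentclass{article}
\usepackage{fontspec}
\usepackage[main=english, german]{babel}
\usepackage{pdnm_nonstd-hyph-de-1901}
\begin{document}
\begin{otherlanguage}{german}
% Reward hyphenation so that break points are actually used.
\hyphenpenalty=-100
\begin{minipage}{3em}
Dackel Schiffahrt Steckdose
\end{minipage}
\end{otherlanguage}
\end{document}
\end{verbatim}
According to the rules stated earlier, the intended break points are
\emph{Dak-kel}, \emph{Schiff-fahrt} and \emph{Steck-dose}. Whether they
are found depends on the current pattern set.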

% # is used in Lua code.
\catcode`\#=12
\directlua{
  words_ck = {
    % random words
    'Acker', 'Ackerböschung', 'Attacke', 'auflockern',
    'backen', 'Backstube', 'Bestecke', 'Bestecks', 'blicken',
    'Cricket', 'Kricket',
    'Dackel', 'Deckel', 'Dickicht', 'dreckig', 'dreieckig', 'drucken',
    'Ecke', 'eckig', 'entzückend', 'entzückt', 'erschrickst',
    'flackern', 'Fleck', 'fleckige', 'flicken', 'Flickschuster',
    'Flocken',
    'Gegacker', 'Glöckchen', 'Glocke', 'gluckern',
    'Hacken', 'Hecke', 'Heckmeck',
    'Hockey',
    'Jockey',
    'Lackschaden', 'Leck', 'lecker', 'lockig', 'Lücke',
    'packen', 'packst', 'packt', 'pflücken', 'Pickel', 'Pickelhaube',
    'Pickhacke', 'Picknickkorb',
    'recken', 'reckst', 'Reckstange', 'Röcke', 'Rucksack', 'Rücken',
    'Rückerstattung',
    'Sack', 'schlecken', 'schmecke', 'Schnecke', 'Schockwelle',
    'Socken', 'Steckdose', 'stecken',
    'Stockfisch', 'Stockwerk', 'strecken', 'streckst', 'Strecksprung',
    'Streckung', 'Stuckateur', 'Stücke', 'Stücken', 'Stückchen',
    'stückweise',
    'Weckruf', 'Wicke', 'Wicklung',
    'zickig', 'Zickzack', 'Zucker', 'zweckmäßig',
    'Zuckerbäcker', 'Zucker-Bäcker',
    % systematic letter combinations
    'Blockade', 'Deckadresse',%cka
    'Drucker', 'Abdruckerlaubnis',%cke
    'lockig', 'Schmuckindustrie',%cki
    'Backofen',%cko
    'Entdeckung', 'Druckunterschied',%cku
    'Strickjacke',%ckj
    'Druckänderung',%ckä
    'Backöfen',%ckö
    'Rückübertragung',%ckü
    % named entities
    'Deckert',
    'Eckart', 'Eckehard', 'Eckehardt', 'Eckhard', 'Eckhardt',
    'Eckernförde',
    'Fricke', 'Fricktal',
    'Glienicke',
    'Hendricks',
    'Hockenheim',
    'Huckleberry',
    'Innsbruck', 'Innsbrucker', 'Innsbrucks',
    'Knickerbocker',
    'Kuckuck', 'Kuckucks',
    'Lübeck', 'Lübecker', 'Lübecks', 'Lübbecke',
    'Luckenwalde', 'Luckner',
    'Mackie Messer',
    'Mecklenburg',
    'Mount McKinley',
    'Neckar', 'Neckarsulm',
    'Osnabrück', 'Osnabrücker', 'Osnabrücks',
    'Packard',
    'Pückler',
    'Rebecka',
    'Recklinghausen', 'Recknagel',
    'Rockefeller', 'Rockford', 'Rocky Mountains',
    'Rostock', 'Rostocker', 'Rostocks',
    'Saarbrücken',
    'Schmöckwitz',
    'Schreckenberg',
    'Schweickhardt',
    'Senckenberg',
    'Sickingen',
    'Stockerau', 'Stockhausen', 'Stockholm',
    'Tresckow',
    'Uckermark', 'Ueckermünde',
    'Weitzsäcker', 'Weizsäcker',
    'Winckelmann',
    'Yorck', 'Yorcks', 'Yorckscher',
    'Zweibrücken',
    'Zwickau',
  }
  words_triple = {
    % triple consonant words
    'Baustoffabrik',
    'Schiffahrt',
    'Kunststoffasern',
    'Kunststoffenster',
    'Baustoffirma',
    'Zellstoffirma',
    'schifförmig',
    'Geröllawine',
    'Metallegierung',
    'Abfülleistung',
    'Nullösung',
    'Dämmaßnahme',
    'Klemmechanismus',
    'Kammolch',
    'Brennessel', 'Brennessel-Tee',
    'Kennummer',
    'Kreppapier',
    'Kippunkt',
    'Ballettanz',
    'Schrottank',
    'Fetteilchens',
    'Schnittiefe',
    % non-triple consonant words
    'Baustoffindustrie',
    'Brotteig',
    'Knettiere',
    'Kunststoffascher',
    'Kunststoffindustrie',
    'Kunststofflasche',
    'Schleppanker',
    'Schrottanker',
    'Bettruhe',
  }
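  % Print num_p paragraphs of num_w randomly chosen words each.
  function paragraphs(words, num_p, num_w)
    for i = 1, num_p do
      for w = 1, num_w do
        tex.print(words[math.random(#words)] .. ' ')
      end
      % \string passes two literal backslashes on to Lua, which reads
      % them as one escaped backslash and prints the token \par.
      tex.print('\string\\par')
    end
  end
  % Seed with the current time so the selection changes at every run.
  math.randomseed(os.time())
}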

\subsection{Non-standard hyphenation examples}

\begin{multicols}{5}[\subsubsection{\emph{ck} rule}]
  \selectlanguage{german}
  % Make hyphenation desirable.
  \hyphenpenalty=-100
  \doublehyphendemerits=-100
  \finalhyphendemerits=-100
  \directlua{
    paragraphs(words_ck, 2, 40)
  }
\end{multicols}

\begin{multicols}{2}[\subsubsection{Triple consonant rule}]
  \selectlanguage{german}
  % Make hyphenation desirable.
  \hyphenpenalty=-100
  \doublehyphendemerits=-100
  \finalhyphendemerits=-100
  \directlua{
    paragraphs(words_triple, 2, 50)
  }
\end{multicols}

% Output a list of hyphenated words to log file.
\begin{otherlanguage}{german}
  \showhyphens{
    \directlua{
      for _, word in ipairs(words_ck) do
        tex.print(word .. ' ')
      end
      for _, word in ipairs(words_triple) do
        tex.print(word .. ' ')
      end
    }
  }
\end{otherlanguage}

\end{document}