/examples/luatex/non-standard-hyphenation-german/non-standard-hyphenation-german.tex
LaTeX | 258 lines | 229 code | 19 blank | 10 comment | 0 complexity | f012beb89a6ab1fb2af1f34c649c0d46 MD5 | raw file
Possible License(s): AGPL-3.0
- % -*- coding: utf-8 -*-
- \listfiles
- \documentclass{article}
- \usepackage{fontspec}
- \usepackage{multicol}
- \usepackage[main=english, german]{babel}
- \usepackage{pdnm_nonstd-hyph-de-1901}
- \begin{document}
- \section{Automatic non-standard hyphenation with Lua\TeX}
- This is an example application of the padrinoma package illustrating
- automatic German non-standard hyphenation with Lua\TeX. No explicit
- non-standard hyphenation mark-up has been used in the source document.
- \subsection{German non-standard hyphenation rules}
- Besides standard hyphenation, in traditional German orthography, there
- are two non-standard hyphenation rules that call for more complex
- operations than the mere insertion of a hyphen at the end of a line:
- \begin{description}
- \item[ck-rule] When hyphenated, letters \emph{ck} turn into
- \emph{k-k}:\par
- \begin{tabular*}{\linewidth}{l@{\extracolsep{\fill}}r}
- \emph{Dackel} vs. \emph{Dak-kel} & (dachshund)\\
- \end{tabular*}
- \item [triple consonant rule] In compound words, of three equal adjacent
- consonants followed by a vowel, one consonant is cancelled out. As an
- example, the words \emph{Schiff} and \emph{Fahrt} can be combined to
- the word \emph{Schiffahrt}. The second part of this rule says that
- during hyphenation, cancelled consonants reappear:\par
- \begin{tabular*}{\linewidth}{l@{\extracolsep{\fill}}r}
- \emph{Schiffahrt} vs. \emph{Schiff-fahrt} & (shipping)\\
- \end{tabular*}
- \end{description}
- But note, not every \emph{ck} is subject to hyphenation and the pattern
- ‘word boundary with double consonant followed by a vowel’ indeed rarely
- stems from application of the three consonant rule:\strut\par
- \noindent\begin{tabular*}{\linewidth}{l@{\extracolsep{\fill}}r}
- \emph{Steck-dose}, not Stek-kdose & (socket)\\
- \emph{Rohstoff-industrie}, not Rohstoff-findustrie & (extractive industry)\\
- \end{tabular*}
- \subsection{Handling non-standard hyphenation manually}
- Traditional \TeX\ provides the \verb+\discretionary+ command for manual
- handling of non-standard hyphenation. Using this command, the German
- word \emph{Dackel} can be typed-in as
- \verb+Da\discretionary{k-}{}{c}kel+. The Babel and Polyglossia packages
- provide short-cuts so that one can also type \verb+Da"ckel+. But the
- disadvantages of these solutions are that, in languages demanding
- non-standard hyphenation, authors need to manually apply non-standard
- hyphenation mark-up throughout a text, which makes a source document
- look cluttered and---more important---application of the mark-up demands
- mental attention and distracts from the actual task of writing.
- With Lua\TeX, the situation improved. Hyphenation exceptions may now
- contain \verb+\discretionary+ commands, like
- \verb+\hyphenation{Da{k-}{}{c}kel}+, so that non-standard hyphenation
- mark-up isn't needed anymore for the words handled in the preamble of a
- document. But still, that way, only words maintained in an explicit
- list become subject to non-standard hyphenation.
- \subsection{Automatic non-standard hyphenation}
- The \texttt{padrinoma} package is an attempt to solve the
- above-mentioned problems. The following sections contain a selection of
- German words and named entities (changing at every compilation run)
- where non-standard hyphenation may or may not be desirable and is
- applied automatically. The document source contains neither mark-up nor
- hyphenation exceptions. Discretionaries are inserted fully
- automatically at Lua\TeX's node level at positions indicated by
- dedicated non-standard hyphenation patterns. Patterns are read from
- file
- \begin{center}
- \verb+examples/patterns/hyph-de-1901-nonstd.pat.txt+
- \end{center}
- A list of fully hyphenated words can be found in the \verb+log+ file.
- For debugging purposes, a list of all words with non-standard
- hyphenation patterns only applied is written to a file with the name of
- the file patterns were read from augmented by the extension
- \verb+.spots+. A hyphen indicates where non-standard hyphenation is to
- be applied within a word, like \verb+Dac-kel+ or \verb+Schif-fahrt+.
- Wrong matches, such as \verb+Rebec-ka+ (non-standard hyphenation is
- discouraged within names) or \verb+Schlep-panker+, are due to still
- imperfect non-standard hyphenation patterns. Patterns are subject to
- further improvements. If you find examples of wrong German non-standard
- hyphenation with the current pattern set, please send them to
- \verb+trennmuster@dante.de+.
- % # is used in Lua code.
- \catcode`\#=12
- \directlua{
- words_ck = {
- % random words
- 'Acker', 'Ackerböschung', 'Attacke', 'auflockern',
- 'backen', 'Backstube', 'Bestecke', 'Bestecks', 'blicken',
- 'Cricket', 'Kricket',
- 'Dackel', 'Deckel', 'Dickicht', 'dreckig', 'dreieckig', 'drucken',
- 'Ecke', 'eckig', 'entzückend', 'entzückt', 'erschrickst',
- 'flackern', 'Fleck', 'fleckige', 'flicken', 'Flickschuster',
- 'Flocken',
- 'Gegacker', 'Glöckchen', 'Glocke', 'gluckern',
- 'Hacken', 'Hecke', 'Heckmeck',
- 'Hockey',
- 'Jockey',
- 'Lackschaden', 'Leck', 'lecker', 'lockig', 'Lücke',
- 'packen', 'packst', 'packt', 'pflücken', 'Pickel', 'Pickelhaube',
- 'Pickhacke', 'Picknickkorb',
- 'recken', 'reckst', 'Reckstange', 'Röcke', 'Rucksack', 'Rücken',
- 'Rückerstattung',
- 'Sack', 'schlecken', 'schmecke', 'Schnecke', 'Schockwelle',
- 'Socken', 'Steckdose', 'stecken',
- 'Stockfisch', 'Stockwerk', 'strecken', 'streckst', 'Strecksprung',
- 'Streckung', 'Stuckateur', 'Stücke', 'Stücken', 'Stückchen',
- 'stückweise',
- 'Weckruf', 'Wicke', 'Wicklung',
- 'zickig', 'Zickzack', 'Zucker', 'zweckmäßig',
- 'Zuckerbäcker', 'Zucker-Bäcker',
- % systematic letter combinations
- 'Blockade', 'Deckadresse',%cka
- 'Drucker', 'Abdruckerlaubnis',%cke
- 'lockig', 'Schmuckindustrie',%cki
- 'Backofen',%cko
- 'Entdeckung', 'Druckunterschied',%cku
- 'Strickjacke',%ckj
- 'Druckänderung',%ckä
- 'Backöfen',%ckö
- 'Rückübertragung',%ckü
- % named entities
- 'Deckert',
- 'Eckart', 'Eckehard', 'Eckehardt', 'Eckhard', 'Eckhardt',
- 'Eckernförde',
- 'Fricke', 'Fricktal',
- 'Glienicke',
- 'Hendricks',
- 'Hockenheim',
- 'Huckleberry',
- 'Innsbruck', 'Innsbrucker', 'Innsbrucks',
- 'Knickerbocker',
- 'Kuckuck', 'Kuckucks',
- 'Lübeck', 'Lübecker', 'Lübecks', 'Lübbecke',
- 'Luckenwalde', 'Luckner',
- 'Mackie Messer',
- 'Mecklenburg',
- 'Mount McKinley',
- 'Neckar', 'Neckarsulm',
- 'Osnabrück', 'Osnabrücker', 'Osnabrücks',
- 'Packard',
- 'Pückler',
- 'Rebecka',
- 'Recklinghausen', 'Recknagel',
- 'Rockefeller', 'Rockford', 'Rocky Mountains',
- 'Rostock', 'Rostocker', 'Rostocks',
- 'Saarbrücken',
- 'Schmöckwitz',
- 'Schreckenberg',
- 'Schweickhardt',
- 'Senckenberg',
- 'Sickingen',
- 'Stockerau', 'Stockhausen', 'Stockholm',
- 'Tresckow',
- 'Uckermark', 'Ueckermünde',
- 'Weitzsäcker', 'Weizsäcker',
- 'Winckelmann',
- 'Yorck', 'Yorcks', 'Yorckscher',
- 'Zweibrücken',
- 'Zwickau',
- }
- words_triple = {
- % triple consonant words
- 'Baustoffabrik',
- 'Schiffahrt',
- 'Kunststoffasern',
- 'Kunststoffenster',
- 'Baustoffirma',
- 'Zellstoffirma',
- 'schifförmig',
- 'Geröllawine',
- 'Metallegierung',
- 'Abfülleistung',
- 'Nullösung',
- 'Dämmaßnahme',
- 'Klemmechanismus',
- 'Kammolch',
- 'Brennessel', 'Brennessel-Tee',
- 'Kennummer',
- 'Kreppapier',
- 'Kippunkt',
- 'Ballettanz',
- 'Schrottank',
- 'Fetteilchens',
- 'Schnittiefe',
- % non-triple consonant words
- 'Baustoffindustrie',
- 'Brotteig',
- 'Knettiere',
- 'Kunststoffascher',
- 'Kunststoffindustrie',
- 'Kunststofflasche',
- 'Schleppanker',
- 'Schrottanker',
- 'Bettruhe',
- }
- function paragraphs(words, num_p, num_w)
- for i = 1,num_p do
- for w = 1,num_w do
- tex.print(words[math.random(#words)] .. ' ')
- end
- tex.print('\par')
- end
- end
- math.randomseed(os.time())
- }
- \subsection{Non-standard hyphenation examples}
- \begin{multicols}{5}[\subsubsection{\emph{ck} rule}]
- \selectlanguage{german}
- % Make hyphenation desirable.
- \hyphenpenalty=-100
- \doublehyphendemerits=-100
- \finalhyphendemerits=-100
- \directlua{
- paragraphs(words_ck, 2, 40)
- }
- \end{multicols}
- \begin{multicols}{2}[\subsubsection{Triple consonant rule}]
- \selectlanguage{german}
- % Make hyphenation desirable.
- \hyphenpenalty=-100
- \doublehyphendemerits=-100
- \finalhyphendemerits=-100
- \directlua{
- paragraphs(words_triple, 2, 50)
- }
- \end{multicols}
- % Output a list of hyphenated words to log file.
- \begin{otherlanguage}{german}
- \showhyphens{
- \directlua{
- for _,word in ipairs(words_ck) do
- tex.print(word .. ' ')
- end
- for _,word in ipairs(words_triple) do
- tex.print(word .. ' ')
- end
- }
- }
- \end{otherlanguage}
- \end{document}