PageRenderTime 53ms CodeModel.GetById 24ms RepoModel.GetById 0ms app.codeStats 0ms

/engines/adva_cms/vendor/gems/rubypants-0.2.0/lib/ruby_pants.rb

https://github.com/DoktahWorm/adva_cms
Ruby | 490 lines | 159 code | 48 blank | 283 comment | 31 complexity | e34d7b8358233dbbae354cfb8ddd7f2f MD5 | raw file
Possible License(s): BSD-3-Clause, MIT, GPL-2.0
  1. #
  2. # = RubyPants -- SmartyPants ported to Ruby
  3. #
  4. # Ported by Christian Neukirchen <mailto:chneukirchen@gmail.com>
  5. # Copyright (C) 2004 Christian Neukirchen
  6. #
  7. # Incooporates ideas, comments and documentation by Chad Miller
  8. # Copyright (C) 2004 Chad Miller
  9. #
  10. # Original SmartyPants by John Gruber
  11. # Copyright (C) 2003 John Gruber
  12. #
  13. #
  14. # = RubyPants -- SmartyPants ported to Ruby
  15. #
  16. # == Synopsis
  17. #
  18. # RubyPants is a Ruby port of the smart-quotes library SmartyPants.
  19. #
  20. # The original "SmartyPants" is a free web publishing plug-in for
  21. # Movable Type, Blosxom, and BBEdit that easily translates plain ASCII
  22. # punctuation characters into "smart" typographic punctuation HTML
  23. # entities.
  24. #
  25. #
  26. # == Description
  27. #
  28. # RubyPants can perform the following transformations:
  29. #
  30. # * Straight quotes (<tt>"</tt> and <tt>'</tt>) into "curly" quote
  31. # HTML entities
  32. # * Backticks-style quotes (<tt>``like this''</tt>) into "curly" quote
  33. # HTML entities
  34. # * Dashes (<tt>--</tt> and <tt>---</tt>) into en- and em-dash
  35. # entities
  36. # * Three consecutive dots (<tt>...</tt> or <tt>. . .</tt>) into an
  37. # ellipsis entity
  38. #
  39. # This means you can write, edit, and save your posts using plain old
  40. # ASCII straight quotes, plain dashes, and plain dots, but your
  41. # published posts (and final HTML output) will appear with smart
  42. # quotes, em-dashes, and proper ellipses.
  43. #
  44. # RubyPants does not modify characters within <tt><pre></tt>,
  45. # <tt><code></tt>, <tt><kbd></tt>, <tt><math></tt> or
  46. # <tt><script></tt> tag blocks. Typically, these tags are used to
  47. # display text where smart quotes and other "smart punctuation" would
  48. # not be appropriate, such as source code or example markup.
  49. #
  50. #
  51. # == Backslash Escapes
  52. #
  53. # If you need to use literal straight quotes (or plain hyphens and
  54. # periods), RubyPants accepts the following backslash escape sequences
  55. # to force non-smart punctuation. It does so by transforming the
  56. # escape sequence into a decimal-encoded HTML entity:
  57. #
  58. # \\ \" \' \. \- \`
  59. #
  60. # This is useful, for example, when you want to use straight quotes as
  61. # foot and inch marks: 6'2" tall; a 17" iMac. (Use <tt>6\'2\"</tt>
  62. # resp. <tt>17\"</tt>.)
  63. #
  64. #
  65. # == Algorithmic Shortcomings
  66. #
  67. # One situation in which quotes will get curled the wrong way is when
  68. # apostrophes are used at the start of leading contractions. For
  69. # example:
  70. #
  71. # 'Twas the night before Christmas.
  72. #
  73. # In the case above, RubyPants will turn the apostrophe into an
  74. # opening single-quote, when in fact it should be a closing one. I
  75. # don't think this problem can be solved in the general case--every
  76. # word processor I've tried gets this wrong as well. In such cases,
  77. # it's best to use the proper HTML entity for closing single-quotes
  78. # ("<tt>&#8217;</tt>") by hand.
  79. #
  80. #
  81. # == Bugs
  82. #
  83. # To file bug reports or feature requests (except see above) please
  84. # send email to: mailto:chneukirchen@gmail.com
  85. #
  86. # If the bug involves quotes being curled the wrong way, please send
  87. # example text to illustrate.
  88. #
  89. #
  90. # == Authors
  91. #
  92. # John Gruber did all of the hard work of writing this software in
  93. # Perl for Movable Type and almost all of this useful documentation.
  94. # Chad Miller ported it to Python to use with Pyblosxom.
  95. #
  96. # Christian Neukirchen provided the Ruby port, as a general-purpose
  97. # library that follows the *Cloth API.
  98. #
  99. #
  100. # == Copyright and License
  101. #
  102. # === SmartyPants license:
  103. #
  104. # Copyright (c) 2003 John Gruber
  105. # (http://daringfireball.net)
  106. # All rights reserved.
  107. #
  108. # Redistribution and use in source and binary forms, with or without
  109. # modification, are permitted provided that the following conditions
  110. # are met:
  111. #
  112. # * Redistributions of source code must retain the above copyright
  113. # notice, this list of conditions and the following disclaimer.
  114. #
  115. # * Redistributions in binary form must reproduce the above copyright
  116. # notice, this list of conditions and the following disclaimer in
  117. # the documentation and/or other materials provided with the
  118. # distribution.
  119. #
  120. # * Neither the name "SmartyPants" nor the names of its contributors
  121. # may be used to endorse or promote products derived from this
  122. # software without specific prior written permission.
  123. #
  124. # This software is provided by the copyright holders and contributors
  125. # "as is" and any express or implied warranties, including, but not
  126. # limited to, the implied warranties of merchantability and fitness
  127. # for a particular purpose are disclaimed. In no event shall the
  128. # copyright owner or contributors be liable for any direct, indirect,
  129. # incidental, special, exemplary, or consequential damages (including,
  130. # but not limited to, procurement of substitute goods or services;
  131. # loss of use, data, or profits; or business interruption) however
  132. # caused and on any theory of liability, whether in contract, strict
  133. # liability, or tort (including negligence or otherwise) arising in
  134. # any way out of the use of this software, even if advised of the
  135. # possibility of such damage.
  136. #
  137. # === RubyPants license
  138. #
  139. # RubyPants is a derivative work of SmartyPants and smartypants.py.
  140. #
  141. # Redistribution and use in source and binary forms, with or without
  142. # modification, are permitted provided that the following conditions
  143. # are met:
  144. #
  145. # * Redistributions of source code must retain the above copyright
  146. # notice, this list of conditions and the following disclaimer.
  147. #
  148. # * Redistributions in binary form must reproduce the above copyright
  149. # notice, this list of conditions and the following disclaimer in
  150. # the documentation and/or other materials provided with the
  151. # distribution.
  152. #
  153. # This software is provided by the copyright holders and contributors
  154. # "as is" and any express or implied warranties, including, but not
  155. # limited to, the implied warranties of merchantability and fitness
  156. # for a particular purpose are disclaimed. In no event shall the
  157. # copyright owner or contributors be liable for any direct, indirect,
  158. # incidental, special, exemplary, or consequential damages (including,
  159. # but not limited to, procurement of substitute goods or services;
  160. # loss of use, data, or profits; or business interruption) however
  161. # caused and on any theory of liability, whether in contract, strict
  162. # liability, or tort (including negligence or otherwise) arising in
  163. # any way out of the use of this software, even if advised of the
  164. # possibility of such damage.
  165. #
  166. #
  167. # == Links
  168. #
  169. # John Gruber:: http://daringfireball.net
  170. # SmartyPants:: http://daringfireball.net/projects/smartypants
  171. #
  172. # Chad Miller:: http://web.chad.org
  173. #
  174. # Christian Neukirchen:: http://kronavita.de/chris
  175. #
  176. class RubyPants < String
  177. VERSION = "0.2"
  178. # Create a new RubyPants instance with the text in +string+.
  179. #
  180. # Allowed elements in the options array:
  181. #
  182. # 0 :: do nothing
  183. # 1 :: enable all, using only em-dash shortcuts
  184. # 2 :: enable all, using old school en- and em-dash shortcuts (*default*)
  185. # 3 :: enable all, using inverted old school en and em-dash shortcuts
  186. # -1 :: stupefy (translate HTML entities to their ASCII-counterparts)
  187. #
  188. # If you don't like any of these defaults, you can pass symbols to change
  189. # RubyPants' behavior:
  190. #
  191. # <tt>:quotes</tt> :: quotes
  192. # <tt>:backticks</tt> :: backtick quotes (``double'' only)
  193. # <tt>:allbackticks</tt> :: backtick quotes (``double'' and `single')
  194. # <tt>:dashes</tt> :: dashes
  195. # <tt>:oldschool</tt> :: old school dashes
  196. # <tt>:inverted</tt> :: inverted old school dashes
  197. # <tt>:ellipses</tt> :: ellipses
  198. # <tt>:convertquotes</tt> :: convert <tt>&quot;</tt> entities to
  199. # <tt>"</tt> for Dreamweaver users
  200. # <tt>:stupefy</tt> :: translate RubyPants HTML entities
  201. # to their ASCII counterparts.
  202. #
  203. def initialize(string, options=[2])
  204. super string
  205. @options = [*options]
  206. end
  207. # Apply SmartyPants transformations.
  208. def to_html
  209. do_quotes = do_backticks = do_dashes = do_ellipses = do_stupify = nil
  210. convert_quotes = false
  211. if @options.include? 0
  212. # Do nothing.
  213. return self
  214. elsif @options.include? 1
  215. # Do everything, turn all options on.
  216. do_quotes = do_backticks = do_ellipses = true
  217. do_dashes = :normal
  218. elsif @options.include? 2
  219. # Do everything, turn all options on, use old school dash shorthand.
  220. do_quotes = do_backticks = do_ellipses = true
  221. do_dashes = :oldschool
  222. elsif @options.include? 3
  223. # Do everything, turn all options on, use inverted old school
  224. # dash shorthand.
  225. do_quotes = do_backticks = do_ellipses = true
  226. do_dashes = :inverted
  227. elsif @options.include?(-1)
  228. do_stupefy = true
  229. else
  230. do_quotes = @options.include? :quotes
  231. do_backticks = @options.include? :backticks
  232. do_backticks = :both if @options.include? :allbackticks
  233. do_dashes = :normal if @options.include? :dashes
  234. do_dashes = :oldschool if @options.include? :oldschool
  235. do_dashes = :inverted if @options.include? :inverted
  236. do_ellipses = @options.include? :ellipses
  237. convert_quotes = @options.include? :convertquotes
  238. do_stupefy = @options.include? :stupefy
  239. end
  240. # Parse the HTML
  241. tokens = tokenize
  242. # Keep track of when we're inside <pre> or <code> tags.
  243. in_pre = false
  244. # Here is the result stored in.
  245. result = ""
  246. # This is a cheat, used to get some context for one-character
  247. # tokens that consist of just a quote char. What we do is remember
  248. # the last character of the previous text token, to use as context
  249. # to curl single- character quote tokens correctly.
  250. prev_token_last_char = nil
  251. tokens.each { |token|
  252. if token.first == :tag
  253. result << token[1]
  254. if token[1] =~ %r!<(/?)(?:pre|code|kbd|script|math)[\s>]!
  255. in_pre = ($1 != "/") # Opening or closing tag?
  256. end
  257. else
  258. t = token[1]
  259. # Remember last char of this token before processing.
  260. last_char = t[-1].chr
  261. unless in_pre
  262. t = process_escapes t
  263. t.gsub!(/&quot;/, '"') if convert_quotes
  264. if do_dashes
  265. t = educate_dashes t if do_dashes == :normal
  266. t = educate_dashes_oldschool t if do_dashes == :oldschool
  267. t = educate_dashes_inverted t if do_dashes == :inverted
  268. end
  269. t = educate_ellipses t if do_ellipses
  270. # Note: backticks need to be processed before quotes.
  271. if do_backticks
  272. t = educate_backticks t
  273. t = educate_single_backticks t if do_backticks == :both
  274. end
  275. if do_quotes
  276. if t == "'"
  277. # Special case: single-character ' token
  278. if prev_token_last_char =~ /\S/
  279. t = "&#8217;"
  280. else
  281. t = "&#8216;"
  282. end
  283. elsif t == '"'
  284. # Special case: single-character " token
  285. if prev_token_last_char =~ /\S/
  286. t = "&#8221;"
  287. else
  288. t = "&#8220;"
  289. end
  290. else
  291. # Normal case:
  292. t = educate_quotes t
  293. end
  294. end
  295. t = stupefy_entities t if do_stupefy
  296. end
  297. prev_token_last_char = last_char
  298. result << t
  299. end
  300. }
  301. # Done
  302. result
  303. end
  304. protected
  305. # Return the string, with after processing the following backslash
  306. # escape sequences. This is useful if you want to force a "dumb" quote
  307. # or other character to appear.
  308. #
  309. # Escaped are:
  310. # \\ \" \' \. \- \`
  311. #
  312. def process_escapes(str)
  313. str.gsub('\\\\', '&#92;').
  314. gsub('\"', '&#34;').
  315. gsub("\\\'", '&#39;').
  316. gsub('\.', '&#46;').
  317. gsub('\-', '&#45;').
  318. gsub('\`', '&#96;')
  319. end
  320. # The string, with each instance of "<tt>--</tt>" translated to an
  321. # em-dash HTML entity.
  322. #
  323. def educate_dashes(str)
  324. str.gsub(/--/, '&#8212;')
  325. end
  326. # The string, with each instance of "<tt>--</tt>" translated to an
  327. # en-dash HTML entity, and each "<tt>---</tt>" translated to an
  328. # em-dash HTML entity.
  329. #
  330. def educate_dashes_oldschool(str)
  331. str.gsub(/---/, '&#8212;').gsub(/--/, '&#8211;')
  332. end
  333. # Return the string, with each instance of "<tt>--</tt>" translated
  334. # to an em-dash HTML entity, and each "<tt>---</tt>" translated to
  335. # an en-dash HTML entity. Two reasons why: First, unlike the en- and
  336. # em-dash syntax supported by +educate_dashes_oldschool+, it's
  337. # compatible with existing entries written before SmartyPants 1.1,
  338. # back when "<tt>--</tt>" was only used for em-dashes. Second,
  339. # em-dashes are more common than en-dashes, and so it sort of makes
  340. # sense that the shortcut should be shorter to type. (Thanks to
  341. # Aaron Swartz for the idea.)
  342. #
  343. def educate_dashes_inverted(str)
  344. str.gsub(/---/, '&#8211;').gsub(/--/, '&#8212;')
  345. end
  346. # Return the string, with each instance of "<tt>...</tt>" translated
  347. # to an ellipsis HTML entity. Also converts the case where there are
  348. # spaces between the dots.
  349. #
  350. def educate_ellipses(str)
  351. str.gsub('...', '&#8230;').gsub('. . .', '&#8230;')
  352. end
  353. # Return the string, with "<tt>``backticks''</tt>"-style single quotes
  354. # translated into HTML curly quote entities.
  355. #
  356. def educate_backticks(str)
  357. str.gsub("``", '&#8220;').gsub("''", '&#8221;')
  358. end
  359. # Return the string, with "<tt>`backticks'</tt>"-style single quotes
  360. # translated into HTML curly quote entities.
  361. #
  362. def educate_single_backticks(str)
  363. str.gsub("`", '&#8216;').gsub("'", '&#8217;')
  364. end
  365. # Return the string, with "educated" curly quote HTML entities.
  366. #
  367. def educate_quotes(str)
  368. punct_class = '[!"#\$\%\'()*+,\-.\/:;<=>?\@\[\\\\\]\^_`{|}~]'
  369. str = str.dup
  370. # Special case if the very first character is a quote followed by
  371. # punctuation at a non-word-break. Close the quotes by brute
  372. # force:
  373. str.gsub!(/^'(?=#{punct_class}\B)/, '&#8217;')
  374. str.gsub!(/^"(?=#{punct_class}\B)/, '&#8221;')
  375. # Special case for double sets of quotes, e.g.:
  376. # <p>He said, "'Quoted' words in a larger quote."</p>
  377. str.gsub!(/"'(?=\w)/, '&#8220;&#8216;')
  378. str.gsub!(/'"(?=\w)/, '&#8216;&#8220;')
  379. # Special case for decade abbreviations (the '80s):
  380. str.gsub!(/'(?=\d\ds)/, '&#8217;')
  381. close_class = %![^\ \t\r\n\\[\{\(\-]!
  382. dec_dashes = '&#8211;|&#8212;'
  383. # Get most opening single quotes:
  384. str.gsub!(/(\s|&nbsp;|--|&[mn]dash;|#{dec_dashes}|&#x201[34];)'(?=\w)/,
  385. '\1&#8216;')
  386. # Single closing quotes:
  387. str.gsub!(/(#{close_class})'/, '\1&#8217;')
  388. str.gsub!(/'(\s|s\b|$)/, '&#8217;\1')
  389. # Any remaining single quotes should be opening ones:
  390. str.gsub!(/'/, '&#8216;')
  391. # Get most opening double quotes:
  392. str.gsub!(/(\s|&nbsp;|--|&[mn]dash;|#{dec_dashes}|&#x201[34];)"(?=\w)/,
  393. '\1&#8220;')
  394. # Double closing quotes:
  395. str.gsub!(/(#{close_class})"/, '\1&#8221;')
  396. str.gsub!(/"(\s|s\b|$)/, '&#8221;\1')
  397. # Any remaining quotes should be opening ones:
  398. str.gsub!(/"/, '&#8220;')
  399. str
  400. end
  401. # Return the string, with each RubyPants HTML entity translated to
  402. # its ASCII counterpart.
  403. #
  404. # Note: This is not reversible (but exactly the same as in SmartyPants)
  405. #
  406. def stupefy_entities(str)
  407. str.
  408. gsub(/&#8211;/, '-'). # en-dash
  409. gsub(/&#8212;/, '--'). # em-dash
  410. gsub(/&#8216;/, "'"). # open single quote
  411. gsub(/&#8217;/, "'"). # close single quote
  412. gsub(/&#8220;/, '"'). # open double quote
  413. gsub(/&#8221;/, '"'). # close double quote
  414. gsub(/&#8230;/, '...') # ellipsis
  415. end
  416. # Return an array of the tokens comprising the string. Each token is
  417. # either a tag (possibly with nested, tags contained therein, such
  418. # as <tt><a href="<MTFoo>"></tt>, or a run of text between
  419. # tags. Each element of the array is a two-element array; the first
  420. # is either :tag or :text; the second is the actual value.
  421. #
  422. # Based on the <tt>_tokenize()</tt> subroutine from Brad Choate's
  423. # MTRegex plugin. <http://www.bradchoate.com/past/mtregex.php>
  424. #
  425. # This is actually the easier variant using tag_soup, as used by
  426. # Chad Miller in the Python port of SmartyPants.
  427. #
  428. def tokenize
  429. tag_soup = /([^<]*)(<[^>]*>)/
  430. tokens = []
  431. prev_end = 0
  432. scan(tag_soup) {
  433. tokens << [:text, $1] if $1 != ""
  434. tokens << [:tag, $2]
  435. prev_end = $~.end(0)
  436. }
  437. if prev_end < size
  438. tokens << [:text, self[prev_end..-1]]
  439. end
  440. tokens
  441. end
  442. end