/blog/2011/01/12/bash-adventures-read-a-single-character-even-if-its-a-newline/index.html
HTML | 367 lines | 256 code | 106 blank | 5 comment | 0 complexity | 5f9eca5cca81c774422b2c2a4dc955b7 MD5 | raw file
- <!DOCTYPE html>
- <!--[if IEMobile 7 ]><html class="no-js iem7"><![endif]-->
- <!--[if lt IE 9]><html class="no-js lte-ie8"><![endif]-->
- <!--[if (gt IE 8)|(gt IEMobile 7)|!(IEMobile)|!(IE)]><!--><html class="no-js" lang="en"><!--<![endif]-->
- <head>
- <meta charset="utf-8">
- <title>Things I Remember to Write Down: Bash adventures: read a single character, even if it's a newline</title>
- <meta name="author" content="Jay Adkisson">
-
- <!-- http://t.co/dKP3o1e -->
- <meta name="HandheldFriendly" content="True">
- <meta name="MobileOptimized" content="320">
- <meta name="viewport" content="width=device-width, initial-scale=1">
-
- <link rel="canonical" href="http://write.jayferd.us/blog/2011/01/12/bash-adventures-read-a-single-character-even-if-its-a-newline/"/>
- <link href="/favicon.png" rel="shortcut icon" />
- <link href="/stylesheets/screen.css" media="screen, projection" rel="stylesheet" type="text/css">
- <script src="/javascripts/modernizr-2.0.js"></script>
- <script src="http://s3.amazonaws.com/ender-js/jeesh.min.js"></script>
- <script src="/javascripts/octopress.js" type="text/javascript"></script>
-
- <link href="/atom.xml" rel="alternate" title="Things I Remember to Write Down" type="application/atom+xml"/>
-
-
- <script type="text/javascript">
- (function() {
- var script = document.createElement('script'); script.type = 'text/javascript'; script.async = true;
- script.src = 'https://apis.google.com/js/plusone.js';
- var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(script, s);
- })();
- </script>
-
- <script type="text/javascript">
- (function(){
- var twitterWidgets = document.createElement('script');
- twitterWidgets.type = 'text/javascript';
- twitterWidgets.async = true;
- twitterWidgets.src = 'http://platform.twitter.com/widgets.js';
- document.getElementsByTagName('head')[0].appendChild(twitterWidgets);
- })();
- </script>
- <!--Fonts from Google's Web font directory at http://google.com/webfonts -->
- <link href='http://fonts.googleapis.com/css?family=PT+Serif:regular,italic,bold,bolditalic' rel='stylesheet' type='text/css'>
- <link href='http://fonts.googleapis.com/css?family=PT+Sans:regular,italic,bold,bolditalic' rel='stylesheet' type='text/css'>
- </head>
- <body >
- <header><hgroup>
- <h1><a href="/">Things I Remember to Write Down</a></h1>
-
- </hgroup>
- </header>
- <nav role=navigation><ul role=subscription data-subscription="rss">
- <li><a href="/atom.xml" rel="subscribe-rss" title="subscribe via RSS">RSS</a></li>
-
- </ul>
- <form action="http://google.com/search" method="get">
- <fieldset role="site-search">
- <input type="hidden" name="q" value="site:write.jayferd.us" />
- <input class="search" type="text" name="q" results="0" placeholder="Search"/>
- </fieldset>
- </form>
- <ul role=main-navigation>
- <li><a href="/">Blog</a></li>
- <li><a href="/blog/archives">Archives</a></li>
- </ul>
- </nav>
- <div id="main">
- <div id="content">
- <div>
- <article class="hentry">
-
- <header>
-
- <h1 class="entry-title">Bash Adventures: Read a Single Character, Even if It’s a Newline</h1>
-
-
- <p class="meta">
- <time datetime="2011-01-12 14:16:00 -0800" pubdate updated >Jan 12<span>th</span>, 2011</time>
- </p>
-
- </header>
- <div class="entry-content"><h3>tl;dr</h3>
- <div class="highlight"><pre><code class="bash">getc<span class="o">()</span> <span class="o">{</span>
- <span class="nv">IFS</span><span class="o">=</span> <span class="nb">read</span> -r -n1 -d <span class="s1">''</span> <span class="s2">"$@"</span>
- <span class="o">}</span>
- getc ch
- <span class="nb">echo</span> <span class="s2">"$ch"</span> <span class="c"># don't forget to quote it!</span>
- </code></pre>
- </div>
- <h3>the explanation</h3>
- <!-- more -->
- <p>So it turns out that it’s pretty difficult to read exactly one character from a stream.
- From the manual for <code>read</code> (usually in <code>man builtins</code>):</p>
- <pre><code>-n nchars
- read returns after reading nchars characters rather than
- waiting for a complete line of input, but honor a delim‐
- iter if fewer than nchars characters are read before the
- delimiter.
- </code></pre>
- <p>The naive solution, then would be:</p>
- <div class="highlight"><pre><code class="bash">getc<span class="o">()</span> <span class="o">{</span>
- <span class="nb">read</span> -n1 <span class="s2">"$@"</span>
- <span class="o">}</span>
- <span class="k">while </span>getc ch; <span class="k">do</span>
- <span class="k"> </span><span class="nb">echo</span> <span class="s2">"[$ch]"</span>;
- <span class="k">done</span> <span class="s"><<EOF</span>
- <span class="s">one</span>
- <span class="s">two</span>
- <span class="s">three</span>
- <span class="s">four</span>
- <span class="s">EOF</span>
- </code></pre>
- </div>
- <p>The output from this is:</p>
- <pre><code>[o]
- [n]
- [e]
- []
- [t]
- [w]
- [o]
- []
- [t]
- [h]
- [r]
- [e]
- [e]
- []
- [f]
- [o]
- [u]
- [r]
- []
- </code></pre>
- <p>Wait, what happened to the newlines – why are they returning empty characters? Well, it turns out there are two things going on here. Again from the manual:</p>
- <pre><code>The characters in the value of the IFS variable are used to split the line into words.
- </code></pre>
- <p>…and the newline is in the IFS. So let’s zero out <code>IFS</code>:</p>
- <div class="highlight"><pre><code class="bash">getc<span class="o">()</span> <span class="o">{</span>
- <span class="nv">IFS</span><span class="o">=</span> <span class="nb">read</span> -n1 <span class="s2">"$@"</span>
- <span class="o">}</span>
- </code></pre>
- </div>
- <p>unfortunately, this outputs the same thing, but for a different reason.
- Remember: <code>read -n nchars</code> will still "honor a delimiter if fewer than <code>nchars</code> characters are read before the delimiter".
- And the default delimiter (specified with <code>-d delim</code>) is the newline.
- So the next solution is to use an empty delimiter.</p>
- <p>Now our <code>getc</code> looks like:</p>
- <div class="highlight"><pre><code class="bash">getc<span class="o">()</span> <span class="o">{</span>
- <span class="nv">IFS</span><span class="o">=</span> <span class="nb">read</span> -n1 -d <span class="s1">''</span> <span class="s2">"$@"</span>
- <span class="o">}</span>
- </code></pre>
- </div>
- <p>(NOTE: the space between the <code>-d</code> and the <code>''</code> is not optional. This confused me the first time I was solving the problem, and I had suggested using the EOF character. Not only that, I suggested obtaining an EOF with <code>"$(echo -e '\004')</code>. The empty delimiter is a better way to solve this particular problem, and <code>$'\004'</code> is a better way to get an EOF character.)</p>
- <p>Now we get a very different output:</p>
- <pre><code>[o]
- [n]
- [e]
- [
- ]
- [t]
- [w]
- [o]
- [
- ]
- [t]
- [h]
- [r]
- [e]
- [e]
- [
- ]
- [f]
- [o]
- [u]
- [r]
- [
- ]
- </code></pre>
- <p>Our newlines are back! Yay!</p>
- <p>There is, however, one more wrinkle, as helpfully pointed out by an anonymous commenter. <code>read</code> also likes to let you use backslashes to escape things. For example, with the current implementation of <code>getc</code>:</p>
- <div class="highlight"><pre><code class="bash"><span class="k">while </span>getc ch; <span class="k">do</span>
- <span class="k"> </span><span class="nb">echo</span> <span class="s2">"[$ch]"</span>
- <span class="k">done</span> <span class="s"><<EOF</span>
- <span class="s">one\</span>
- <span class="s">two</span>
- <span class="s">EOF</span>
- </code></pre>
- </div>
- <p>Yields the following output:</p>
- <pre><code>[o]
- [n]
- [e]
- [t]
- [w]
- [o]
- [
- ]
- </code></pre>
- <p>Note how the backslash after <code>one</code> was interpreted as an escape for the newline that follows. Luckily, <code>read</code> comes with an <code>-r</code> option:</p>
- <pre><code>-r Backslash does not act as an escape character. The back‐
- slash is considered to be part of the line. In particu‐
- lar, a backslash-newline pair may not be used as a line
- continuation.
- </code></pre>
- <p>So the final solution is:</p>
- <div class="highlight"><pre><code class="bash">getc<span class="o">()</span> <span class="o">{</span>
- <span class="nv">IFS</span><span class="o">=</span> <span class="nb">read</span> -r -n1 -d <span class="s1">''</span> <span class="s2">"$@"</span>
- <span class="o">}</span>
- </code></pre>
- </div>
- <p>EDIT 7/19/2011: Fixes and nuances to the original solution, as provided by <code>:)</code>.</p>
- </div>
- <footer>
- <p class="meta">
-
-
- <span class="byline author vcard">Posted by <span class="fn">Jay Adkisson</span></span>
-
- <time datetime="2011-01-12 14:16:00 -0800" pubdate updated >Jan 12<span>th</span>, 2011</time>
-
- </p>
-
- <div class="sharing">
-
- <a href="http://twitter.com/share" class="twitter-share-button" data-url="http://write.jayferd.us/blog/2011/01/12/bash-adventures-read-a-single-character-even-if-its-a-newline/" data-via="jayferd" data-counturl="http://write.jayferd.us/blog/2011/01/12/bash-adventures-read-a-single-character-even-if-its-a-newline/" >Tweet</a>
-
-
- <g:plusone size="medium"></g:plusone>
-
- </div>
-
- </footer>
- </article>
- <section>
- <h1>Comments</h1>
- <div id="disqus_thread"><div id="disqus_thread"></div>
- <script type="text/javascript">
- var disqus_shortname = 'writedown';
- var disqus_identifier = 'http://write.jayferd.us/blog/2011/01/12/bash-adventures-read-a-single-character-even-if-its-a-newline/';
- var disqus_url = 'http://write.jayferd.us/blog/2011/01/12/bash-adventures-read-a-single-character-even-if-its-a-newline/';
- //var disqus_developer = 1;
- (function() {
- var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
- dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
- (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
- })();
- </script>
- <noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
- </div>
- </section>
- </div>
- <aside role=sidebar>
-
- <section>
- <h1>Recent Posts</h1>
- <ul id="recent_posts">
-
- <li class="post">
- <a href="/blog/2011/01/12/bash-adventures-read-a-single-character-even-if-its-a-newline/">Bash adventures: read a single character, even if it’s a newline</a>
- </li>
-
- </ul>
- </section>
- <section>
- <h1>Latest Tweets</h1>
- <ul id="tweets">
- <li class="loading">Status updating…</li>
- </ul>
- <script type="text/javascript">
- $.domReady(function(){
- getTwitterFeed("jayferd", 4, false);
- });
- </script>
- <script src="/javascripts/twitter.js" type="text/javascript"> </script>
-
- <a href="http://twitter.com/jayferd" class="twitter-follow-button" data-width="208px" data-show-count="false">Follow @jayferd</a>
-
- </section>
-
- </aside>
- </div>
- </div>
- <footer><p>
- Copyright © 2011 - Jay Adkisson -
- <span class="credit">Powered by <a href="http://octopress.org">Octopress</a></span>
- </p>
- </footer>
- </body>
- </html>