/2013/10/01/ruby_reading_a_remote_zip_file/index.html
HTML | 143 lines | 128 code | 14 blank | 1 comment | 0 complexity | b02c607654ad7ddb52a3ccda3be325d9 MD5 | raw file
Possible License(s): MIT
- <!DOCTYPE html>
- <html>
- <head>
- <title>Ruby - Reading A Remote Zip File - WinstonYW</title>
- <meta content="text/html; charset=utf-8" http-equiv="content-type" />
- <meta content='Winston Teo Yong Wei' name='author' />
- <link href='/assets/css/application.css' media='screen' rel='stylesheet' type='text/css' />
- <!-- http://davidbcalhoun.com/2010/viewport-metatag -->
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
- <meta name="HandheldFriendly" content="True">
- <meta name="MobileOptimized" content="320">
- </head>
- <body>
- <header>
- <div class='title-color-bar'></div>
- <section id='profile'>
- <h1>
- <a href="/">WinstonYW</a>
- </h1>
- <h2>
- Just Another Tech Journal
- </h2>
- </section>
- <div class='aside-color-bar'></div>
- <section id="projects">
- <h2>Projects</h2>
- <ul>
- <li>
- <a href="https://github.com/winston/google_visualr">GoogleVisualr</a>
- </li>
- <li>
- <a href="https://github.com/winston/cactus">Cactus</a>
- </li>
- <li>
- <a href="http://gigest.herokuapp.com">Gigest App</a>
- </li>
- </ul>
- </section>
- <section id="elsewhere">
- <h2>Elsewhere</h2>
- <ul>
- <li>
- <a href="https://twitter.com/winstonyw">Twitter</a>
- </li>
- <li>
- <a href="https://github.com/winston">GitHub</a>
- </li>
- <li>
- <a href="http://www.linkedin.com/in/winstonyw">LinkedIn</a>
- </li>
- <li>
- <a href="mailto:winstonyw+blog@gmail.com">Email</a>
- </li>
- <li>
- <a href="#">RSS</a>
- </li>
- </ul>
- </section>
- </header>
- <div id='center'>
- <div class='color-bar'></div>
- <div id='content'>
- <article class="post">
- <h2>Ruby - Reading A Remote Zip File</h2>
- <span class="date">01 OCTOBER 2013</span>
- <section>
- <p>I need to access a remote zip file and this is something that works:</p>
- <div class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="c1"># In the console, gem install "httparty"</span>
- <span class="nb">require</span> <span class="s2">"httparty"</span>
- <span class="c1"># In the console, gem install "rubyzip"</span>
- <span class="nb">require</span> <span class="s2">"zip"</span>
- <span class="c1"># Get the contents of the remote zip file via HTTParty</span>
- <span class="c1"># and write it into a temp zip file</span>
- <span class="n">zipfile</span> <span class="o">=</span> <span class="no">Tempfile</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="s2">"file"</span><span class="p">)</span>
- <span class="n">zipfile</span><span class="o">.</span><span class="n">binmode</span> <span class="c1"># This might not be necessary depending on the zip file</span>
- <span class="n">zipfile</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="no">HTTParty</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"http://localhost:3000/file.zip"</span><span class="p">)</span><span class="o">.</span><span class="n">body</span><span class="p">)</span>
- <span class="n">zipfile</span><span class="o">.</span><span class="n">close</span>
- <span class="c1"># Unzip the temp zip file and process the contents</span>
- <span class="c1"># Let garbage collection delete the temp zip file</span>
- <span class="no">Zip</span><span class="o">::</span><span class="no">File</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="n">zipfile</span><span class="o">.</span><span class="n">path</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">file</span><span class="o">|</span>
- <span class="n">file</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">content</span><span class="o">|</span>
- <span class="n">data</span> <span class="o">=</span> <span class="n">file</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
- <span class="nb">puts</span> <span class="n">data</span>
- <span class="c1"># Do whatever you want with the contents</span>
- <span class="k">end</span>
- <span class="k">end</span></code></pre></div>
- <p>The code is simple, but at the start, I kept getting an error when unzipping the temp zip file,
- and I thought I was doing something wrong.</p>
- <pre><code>End-of-central-directory signature not found
- </code></pre>
- <p>Did some debugging and finally figured out that the problem was with the remote zip file
- - because the file was not fully constructed even though I had a link to it.</p>
- <p>The remote zip file link was actually returned by an earlier API call to an external service
- that also triggered the building of the remote zip file.</p>
- <p>Moral of the story? Trust my code.</p>
- <p>Anyway, <a href="http://blog.huangzhimin.com/2012/10/02/avoid-using-rubyzip/">RubyZip is poor in performance</a>. Might want to try <a href="http://zipruby.rubyforge.org/">ZipRuby</a> instead.</p>
- </section>
- </article>
- <article class="comments">
- <div id="disqus_thread"></div>
- <script type="text/javascript">
- var disqus_shortname = 'winstonyw';
- // var disqus_developer = 1;
- (function() {
- var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
- dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js';
- (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
- })();
- </script>
- <noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
- <a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
- </article>
- </div>
- </div>
- <footer>
- Just a boring footer. Nothing to see here lah. ~
- </footer>
- <script type="text/javascript">
- var _gaq = _gaq || [];
- _gaq.push(['_setAccount', 'UA-297001-2']);
- _gaq.push(['_trackPageview']);
- (function() {
- var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
- ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
- var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
- })();
- </script>
- </body>
- </html>