- <h2>Ruby - Reading A Remote Zip File</h2>
- <span class="date">01 OCTOBER 2013</span>
- <section>
- <p>I needed to read a remote zip file, and this is what worked:</p>
- <div class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="c1"># In the console, gem install "httparty"</span>
- <span class="nb">require</span> <span class="s2">"httparty"</span>
- <span class="c1"># In the console, gem install "rubyzip"</span>
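- <span class="nb">require</span> <span class="s2">"zip"</span>
- <span class="c1"># Tempfile ships with Ruby's standard library</span>
- <span class="nb">require</span> <span class="s2">"tempfile"</span>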
- <span class="c1"># Get the contents of the remote zip file via HTTParty</span>
- <span class="c1"># and write it into a temp zip file</span>
- <span class="n">zipfile</span> <span class="o">=</span> <span class="no">Tempfile</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="s2">"file"</span><span class="p">)</span>
- <span class="n">zipfile</span><span class="o">.</span><span class="n">binmode</span> <span class="c1"># Write raw bytes so newline or encoding conversion cannot corrupt the zip</span>
- <span class="n">zipfile</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="no">HTTParty</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"http://localhost:3000/file.zip"</span><span class="p">)</span><span class="o">.</span><span class="n">body</span><span class="p">)</span>
- <span class="n">zipfile</span><span class="o">.</span><span class="n">close</span>
- <span class="c1"># Unzip the temp zip file and process the contents</span>
- <span class="c1"># Let garbage collection delete the temp zip file</span>
- <span class="no">Zip</span><span class="o">::</span><span class="no">File</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="n">zipfile</span><span class="o">.</span><span class="n">path</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">file</span><span class="o">|</span>
-   <span class="n">file</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">content</span><span class="o">|</span>
-     <span class="n">data</span> <span class="o">=</span> <span class="n">file</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
-     <span class="nb">puts</span> <span class="n">data</span>
-     <span class="c1"># Do whatever you want with the contents</span>
-   <span class="k">end</span>
- <span class="k">end</span></code></pre></div>
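- <p>If you would rather not rely on garbage collection to clean up, here is a minimal sketch that uses the block form of <code>Tempfile.create</code>, which removes the file as soon as the block returns. The helper name <code>with_temp_zip</code> is my own invention for illustration, not part of any library.</p>

```ruby
require "tempfile"

# Hypothetical helper: writes `bytes` to a temporary file in binary mode,
# yields the file's path, and removes the file once the block returns.
def with_temp_zip(bytes)
  Tempfile.create(["remote", ".zip"]) do |tmp|
    tmp.binmode
    tmp.write(bytes)
    tmp.flush # make sure everything is on disk before the path is reopened
    yield tmp.path
  end
end

# Usage (assumes the httparty and rubyzip gems from the example above):
#   body = HTTParty.get("http://localhost:3000/file.zip").body
#   with_temp_zip(body) do |path|
#     Zip::File.open(path) { |zip| zip.each { |entry| puts zip.read(entry) } }
#   end
```

- <p>The helper returns whatever the block returns, so the unzipped data can be passed back out while the temp file itself is already gone.</p>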
- <p>The code is simple, but at first I kept getting an error when unzipping the temp zip file,
- and I thought I was doing something wrong.</p>
- <pre><code>End-of-central-directory signature not found
- </code></pre>
- <p>After some debugging, I finally figured out that the problem was with the remote zip file:
- it had not been fully built yet, even though I already had a link to it.</p>
- <p>The remote zip file link was actually returned by an earlier API call to an external service
- that also triggered the building of the remote zip file.</p>
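- <p>One way to cope with a zip file that is still being built is to retry the download until the body looks complete. The sketch below is only a heuristic: it looks for the end-of-central-directory signature (the bytes <code>PK\x05\x06</code>) near the end of the data, which is the thing the error above complains about. The names <code>complete_zip?</code> and <code>fetch_when_ready</code>, and the retry counts, are assumptions of mine for illustration.</p>

```ruby
# Signature that starts the end-of-central-directory (EOCD) record.
EOCD_SIGNATURE = "PK\x05\x06".b
# The EOCD record is 22 bytes plus an optional comment of up to 65,535 bytes.
EOCD_MIN_SIZE = 22
EOCD_MAX_SIZE = EOCD_MIN_SIZE + 65_535

# Heuristic: true if an EOCD signature appears where the record could sit.
# It does not prove the archive is valid, only that it is not obviously truncated.
def complete_zip?(bytes)
  return false if bytes.nil? || bytes.bytesize < EOCD_MIN_SIZE
  window = [bytes.bytesize, EOCD_MAX_SIZE].min
  bytes.b[-window, window].include?(EOCD_SIGNATURE)
end

# Calls the block (which should return the downloaded body) until the result
# looks like a complete zip, sleeping `delay` seconds between attempts.
def fetch_when_ready(attempts: 5, delay: 2)
  attempts.times do |i|
    body = yield
    return body if complete_zip?(body)
    sleep(delay) if i < attempts - 1
  end
  raise "zip file still incomplete after #{attempts} attempts"
end

# Usage (assumes the httparty gem from the example above):
#   body = fetch_when_ready { HTTParty.get("http://localhost:3000/file.zip").body }
```

- <p>If the external service offers a proper "ready" status, checking that would be more reliable than sniffing bytes.</p>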
- <p>Moral of the story? Trust my code.</p>
- <p>Anyway, <a href="http://blog.huangzhimin.com/2012/10/02/avoid-using-rubyzip/">RubyZip is poor in performance</a>. You might want to try <a href="http://zipruby.rubyforge.org/">ZipRuby</a> instead.</p>
