/src/misc/how-to-mirror.html

http://github.com/perlorg/cpanorg

[%
page.import({
title => "How to mirror CPAN",
section => 'home',
stub => '../',
});
PROCESS "tpl/data/cpan-stats";
%]
[% page.head = BLOCK %]
<!-- Copyright Ask Bjørn Hansen ask@perl.org 2011-2013 All Rights Reserved
You may distribute this document either under the Artistic License
(comes with Perl) or the GNU Public License, whichever suits you.
The CPAN Logo provided by J.C. Thorpe.
You are not allowed to remove or alter these comments.
-->
[% END %]
<h1>
How to mirror CPAN
</h1>
<p>
There are several ways to mirror CPAN depending upon what you want to
achieve.
</p>
<h2>
How do I create a private or offline mirror?
</h2>
<p>
<a href="https://metacpan.org/pod/distribution/CPAN-Mini/bin/minicpan">minicpan</a>
from <a href="https://metacpan.org/release/CPAN-Mini">CPAN::Mini</a> is the
best tool for this. Also look at <a href=
"https://metacpan.org/release/CPAN-Mini-Inject">CPAN::Mini::Inject</a>
which allows you to add your own modules into your private mirror.
</p>
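<p>
As a rough sketch of a first run: minicpan reads its settings from
<tt>~/.minicpanrc</tt>, or takes them on the command line via <tt>-l</tt>
(local directory) and <tt>-r</tt> (remote mirror). The local directory
<tt>$HOME/minicpan/</tt> and the choice of <tt>http://www.cpan.org/</tt> as the
remote are assumptions for illustration, and the helper
<tt>write_minicpanrc</tt> is hypothetical; it only writes the config file.
</p>

```shell
#!/bin/sh
# Hypothetical helper: write a minimal minicpan config to the given file.
# The local path and remote URL passed in are illustrative assumptions.
write_minicpanrc() {
    rc_file=$1
    local_dir=$2
    remote_url=$3
    printf 'local: %s\nremote: %s\n' "$local_dir" "$remote_url" > "$rc_file"
}

# Write an example config into a scratch directory and show it.
demo_dir=$(mktemp -d)
write_minicpanrc "$demo_dir/minicpanrc" "$HOME/minicpan/" "http://www.cpan.org/"
cat "$demo_dir/minicpanrc"

# With the file saved as ~/.minicpanrc, a plain "minicpan" run would use it.
# The equivalent one-off command line would be:
#   minicpan -l "$HOME/minicpan/" -r http://www.cpan.org/
```

<p>
Keeping the settings in a file means the same mirror can later be refreshed
from cron with a bare <tt>minicpan</tt> call.
</p>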
<h2>
Requirements for a full / public mirror
</h2>
<ul>
<li>Good internet connectivity
</li>
<li>Around 1GB of storage space for just the current modules.
</li>
<li>Around [%
cpan_gb = cpan_stats.disk_usage.gb;
cpan_gb = cpan_gb.remove('\..*');
cpan_gb = cpan_gb + 2;
cpan_gb %]GB of storage space for the full mirror.
</li>
</ul>
<p>It's highly recommended that you also subscribe to the announcements-only
<a href="http://www.nntp.perl.org/group/perl.cpan.mirrors">cpan-mirrors
mailing list</a> by emailing cpan-mirrors-subscribe at perl.org.
</p>
<h2>
Tools
</h2>
<p>
<a href="https://metacpan.org/release/CPAN-Mini">CPAN::Mini</a> provides
you with a minimal mirror of <a href="http://www.cpan.org">CPAN</a> (the
latest version of all modules). This makes working offline easy, and it is the
best tool if you are running a private mirror.
</p>
<p>
<em>New:</em> <a href="#Instant_mirroring">rrr-client</a> allows
instant mirroring, and should be used on official public mirrors where
possible. See the instant mirroring <a href=
"#Instant_mirroring">instructions</a>.
</p>
<p>
<a href="http://rsync.samba.org/">rsync</a> is the best tool if you need to
mirror the whole of CPAN or if you are providing a public mirror. See the
<a href="#rsync">rsync instructions</a>.
</p>
<p>
Only use FTP if these other methods are absolutely impossible. Never mirror
with HTTP: you will end up with a million duplicate files taking up tens of
gigabytes.
</p>
<h2>
Which CPAN Mirror should I use?
</h2>
<p>
You can find your nearest rsync-enabled site on <a href=
"http://www.cpan.org/SITES.html">http://www.cpan.org/SITES.html</a>, or use
<a href="http://www.cpan.org/indices/mirrors.json">mirrors.json</a>,
especially if you are building a tool which lets the user select a mirror.
</p>
<p>
You can also sync from <code>rsync://cpan-rsync.perl.org/CPAN/</code> (the
"tier 1 mirrors"), though you might currently get better
performance from a "local" mirror.
</p>
<h2>
<a name="rsync" id="rsync">Using rsync</a>
</h2>
<p>
Please limit rsync runs to once or twice a day. For more frequent updates please see
<a href="#Instant_mirroring">Instant mirroring</a>.
</p>
<p>
<em>On Unix systems</em>
</p>
<pre>
/usr/bin/rsync -av --delete cpan-rsync.perl.org::CPAN /project/CPAN/
</pre>
<p>
Using 'crontab' you can make rsync run once a day, for example:<br>
<tt>40 4 * * * sleep $(expr $RANDOM \% 7200); /usr/bin/rsync -a --delete
cpan-rsync.perl.org::CPAN /project/CPAN/</tt><br>
The "sleep $(...);" statement delays the command by up to 2 hours before
running rsync, so that you (and everybody else) do not all
access the mirror at the same time. Note that <tt>$RANDOM</tt> is not
available in every <tt>/bin/sh</tt>; if your cron runs jobs with a shell that
lacks it, you may need to set <tt>SHELL=/bin/bash</tt> at the top of the crontab.
</p>
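<p>
The random delay can also be computed without <tt>$RANDOM</tt>, which is a
bash feature and may be missing when cron runs jobs with a plain
<tt>/bin/sh</tt>. The following is a minimal sketch, assuming <tt>awk</tt> is
installed; the function name <tt>random_delay</tt> is hypothetical and not part
of any CPAN tooling.
</p>

```shell
#!/bin/sh
# Print a pseudo-random delay in whole seconds, in the range 0 .. max-1.
# Uses awk instead of $RANDOM so it also works where /bin/sh is not bash.
# awk's srand() seeds from the time of day, which is enough to spread
# cron jobs on different hosts across the delay window.
random_delay() {
    max=${1:-7200}
    awk -v max="$max" 'BEGIN { srand(); print int(rand() * max) }'
}

# Example: show the delay a cron wrapper would sleep for before syncing.
echo "would sleep $(random_delay 7200) seconds before running rsync"
```

<p>
A cron wrapper script could then call <tt>sleep "$(random_delay 7200)"</tt>
before running the rsync command shown above.
</p>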
<p>
Unless you are mirroring to an SSD, you might get timeouts using <tt>--delete-after</tt>
when many symlinks are being purged. Using <tt>--delete</tt> works properly.
</p>
<p>
If you have a problem with permissions (files are created with mode
<tt>-rw-------</tt>), set <tt>umask</tt> in your cron job:<br>
<tt>40 4 * * * umask 022 ; sleep ... ; /usr/bin/rsync ...</tt><br>
The <tt>umask 022</tt> allows rsync to set proper permissions for
files and directories.
</p>
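<p>
The effect of <tt>umask</tt> on newly created files can be checked with a
short sketch. It only illustrates the general shell behaviour, not rsync
itself, and the function name <tt>mode_under_umask</tt> is hypothetical.
</p>

```shell
#!/bin/sh
# Create an empty file under the given umask and print its mode string.
# The umask is set inside a subshell so it does not leak to the caller.
mode_under_umask() {
    dir=$(mktemp -d) || return 1
    ( umask "$1"; : > "$dir/file" )
    ls -l "$dir/file" | cut -c1-10
    rm -rf "$dir"
}

mode_under_umask 077   # prints -rw-------
mode_under_umask 022   # prints -rw-r--r--
```

<p>
With <tt>umask 077</tt> only the owner can read the file, which is the
symptom described above; <tt>umask 022</tt> leaves it readable by everyone.
</p>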
<p>
<em>On Windows systems</em>
</p>
<pre>
"C:\Program Files\Rsync\rsync" -av --delete cpan-rsync.perl.org::CPAN /project/CPAN/
</pre>
<p>
Using the 'AT' tool, you can schedule rsync to run daily, for example:<br>
<tt>AT 20:00 /every:M,T,W,Th,F,S,Su "C:\Program Files\Rsync\rsync -a
--delete cpan-rsync.perl.org::CPAN /project/CPAN/"</tt>
</p>
<h2 id="Public_mirror">
How do I create a public mirror?
</h2>
<ul>
<li>Consider <a href="#Instant_mirroring">Instant mirroring</a> (required
if you wish to be a tier 1 mirror), or
</li>
<li>rsync once a day
</li>
<li>Provide (in order of preference) rsync, HTTP and/or FTP public access
</li>
<li>To be added to <a href=
"http://www.cpan.org/SITES.html">http://www.cpan.org/SITES.html</a> and
<a href="http://www.cpan.org/indices/mirrors.json">mirrors.json</a>, please
complete the <a href="http://www.cpan.org/MIRRORED.BY">template</a>,
confirming the publicly accessible URLs of your mirror (rsync, FTP, HTTP), and
email it to cpan@perl.org.
</li>
</ul>
<h2>
<a name="Instant_mirroring" id="Instant_mirroring">Instant mirroring</a>
</h2>
<p>
"Instant mirroring" keeps your CPAN mirror up-to-date by continuously
tracking the CPAN master and picking up changes a short
time (minutes) after they occur.
</p>
<p>
Instant mirroring is used for all Tier 1 mirrors (so
cpan-rsync.perl.org stays in sync across mirrors).
</p>
<p>
To use "instant mirroring", you need a special client: "rrr-client" or
"iim".
</p>
<p>
<a href="https://metacpan.org/pod/distribution/File-Rsync-Mirror-Recent/bin/rrr-client">"rrr-client"</a> is part of the <a href=
"https://metacpan.org/release/File-Rsync-Mirror-Recent">File::Rsync::Mirror::Recent</a>
(also known as <code>rrr</code>) package. It is the official client, used
on the CPAN master to get updates from <a href=
"http://pause.perl.org/">PAUSE</a>, the true heart and soul of "all things
perl". See the <a href=
"https://github.com/perlorg/cpanorg/wiki/Instant-update-mirroring">setup
guide</a> for more details.
</p>
<p>
<a href="http://www.staff.science.uu.nl/~penni101/iim">"iim"</a> is an
alternative to "rrr-client". It does basically the same thing, but it is
more efficient (on start-up) and has some features that may be helpful to
CPAN mirror operators.
</p>