/src/misc/how-to-mirror.html

http://github.com/perlorg/cpanorg
[%
  page.import({
    title   => "How to mirror CPAN",
    section => 'home',
    stub => '../',
  });

  PROCESS "tpl/data/cpan-stats";

%]

[% page.head = BLOCK %]
<!-- Copyright Ask Bjørn Hansen ask@perl.org 2011-2013 All Rights Reserved
     You may distribute this document either under the Artistic License
     (comes with Perl) or the GNU Public License, whichever suits you.
     The CPAN Logo provided by J.C. Thorpe.
     You are not allowed to remove or alter these comments.
-->
[% END %]
<h1>
    How to mirror CPAN
</h1>
<p>
    There are several ways to mirror CPAN depending upon what you want to
    achieve.
</p>

<h2>
    How do I create a private or offline mirror?
</h2>
<p>
    <a href="https://metacpan.org/pod/distribution/CPAN-Mini/bin/minicpan">minicpan</a>
    from <a href="https://metacpan.org/release/CPAN-Mini">CPAN::Mini</a> is the
    best tool for this. Also look at <a href=
    "https://metacpan.org/release/CPAN-Mini-Inject">CPAN::Mini::Inject</a>,
    which lets you add your own modules to your private mirror.
</p>
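<p>
    As a minimal sketch, assuming minicpan is already installed, a
    configuration file for a private mirror could be generated as follows. The
    <code>local</code> and <code>remote</code> settings are the ones minicpan
    reads from its configuration file; the function name
    <code>write_minicpanrc</code> and the directory <code>/srv/minicpan/</code>
    are hypothetical and only serve as an illustration.
</p>

```shell
#!/bin/sh
# Sketch only: write_minicpanrc and /srv/minicpan/ are illustrative names,
# not part of CPAN::Mini itself.

# Write a minimal minicpan configuration file to the path given as $1.
# "local" is where the mirror is stored, "remote" is the mirror to copy from.
write_minicpanrc() {
    printf 'local: %s\nremote: %s\n' "/srv/minicpan/" "http://www.cpan.org/" > "$1"
}

# Typical use (commented out so that nothing is downloaded by accident):
#   write_minicpanrc "$HOME/.minicpanrc"
#   minicpan
# The same settings can be given on the command line instead:
#   minicpan -l /srv/minicpan/ -r http://www.cpan.org/
```

<p>
    Keeping the settings in a file means later runs only need the bare
    <code>minicpan</code> command, which suits a scheduled job.
</p>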


<h2>
    Requirements for a full / public mirror
</h2>
<ul>
    <li>Good internet connectivity
    </li>
    <li>Around 1GB of storage space for just the current modules.
    </li>
    <li>Around [%
      cpan_gb = cpan_stats.disk_usage.gb;
      cpan_gb = cpan_gb.remove('\..*');
      cpan_gb = cpan_gb + 2;
      cpan_gb %]GB of storage space for the full mirror.
    </li>
</ul>

<p>It's highly recommended that you also subscribe to the announcements-only
   <a href="http://www.nntp.perl.org/group/perl.cpan.mirrors">cpan-mirrors
   mailing list</a> by emailing cpan-mirrors-subscribe at perl.org.
</p>

<h2>
    Tools
</h2>
<p>
    <a href="https://metacpan.org/release/CPAN-Mini">CPAN::Mini</a> provides
    you with a minimal mirror of <a href="http://www.cpan.org">CPAN</a> (the
    latest version of all modules). This makes working offline easy, and it is
    the best tool if you are running a private mirror.
</p>
<p>
    <em>New:</em> <a href="#Instant_mirroring">rrr-client</a> allows
    instant mirroring, and should be used on official public mirrors where
    possible. See the instant mirroring <a href=
    "#Instant_mirroring">instructions</a>.
</p>
<p>
    <a href="http://rsync.samba.org/">rsync</a> is the best tool if you need to
    mirror the whole of CPAN or if you are providing a public mirror. See the
    <a href="#rsync">rsync instructions</a>.
</p>
<p>
    Only use FTP if these other methods are absolutely impossible. Never mirror
    with HTTP: you will end up with a million duplicate files taking up tens of
    gigabytes.
</p>

<h2>
    Which CPAN mirror should I use?
</h2>

<p>
    You can find your nearest rsync-enabled site on <a href=
    "http://www.cpan.org/SITES.html">http://www.cpan.org/SITES.html</a>, or use
    <a href="http://www.cpan.org/indices/mirrors.json">mirrors.json</a>,
    especially if you are building a tool which lets the user select a mirror.
</p>

<p>
    You can also sync from <code>rsync://cpan-rsync.perl.org/CPAN/</code> (the
    "tier 1 mirrors"), though you currently might get better
    performance from a "local" mirror.
</p>

<h2>
    <a name="rsync" id="rsync">Using rsync</a>
</h2>
<p>
    Please limit rsync runs to once or twice a day. For more frequent updates,
    please see <a href="#Instant_mirroring">Instant mirroring</a>.
</p>
<p>
    <em>On Unix systems</em>
</p>
<pre>
/usr/bin/rsync -av --delete cpan-rsync.perl.org::CPAN /project/CPAN/
</pre>
<p>
    Using 'crontab' you can make rsync run once a day, for example:<br>
    <tt>40 4 * * * sleep $(expr $RANDOM \% 7200); /usr/bin/rsync -a --delete
    cpan-rsync.perl.org::CPAN /project/CPAN/</tt><br>
    The "sleep $(...);" statement delays the command by up to 2 hours before
    rsync runs, so that you and everybody else do not all access the mirror at
    the same time.
</p>
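<p>
    The random delay can also be written as a small helper for a sync script.
    This is only a sketch: it assumes <code>od</code> and
    <code>/dev/urandom</code> are available, and it avoids
    <code>$RANDOM</code>, which not every shell provides. The name
    <code>random_delay</code> is hypothetical.
</p>

```shell
#!/bin/sh
# Sketch only: random_delay is an illustrative helper, not an existing tool.

# Print a random whole number of seconds from 0 to (max - 1); max is $1.
# Reads two random bytes (0..65535) from /dev/urandom, so max should not
# exceed 65536.
random_delay() {
    max=$1
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    echo $(( n % max ))
}

# In a script run from cron, the delay and sync would look like this
# (commented out so that the example does not contact the mirror):
#   sleep "$(random_delay 7200)"
#   /usr/bin/rsync -a --delete cpan-rsync.perl.org::CPAN /project/CPAN/
```

<p>
    Putting the delay in a script rather than in the crontab line also avoids
    having to escape the percent sign for cron.
</p>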
<p>
    Unless you are mirroring to an SSD, you might get timeouts using
    --delete-after when many symlinks are being purged. Using --delete works
    properly.
</p>
<p>
    If you have a problem with permissions (files are created with mode
    <tt>-rw-------</tt>), set <tt>umask</tt> in your cron job:<br>
    <tt>40 4 * * * umask 022 ; sleep ... ; /usr/bin/rsync ...</tt><br>
    Setting <tt>umask 022</tt> allows rsync to set proper permissions for
    files and directories.
</p>
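<p>
    To see what a given umask does to newly created files, a quick check along
    these lines may help. It is a sketch under the assumption that
    <code>mktemp -d</code> is available; the helper name
    <code>mode_with_umask</code> is hypothetical, and the example only shows
    the effect of the umask itself, not rsync's own permission handling.
</p>

```shell
#!/bin/sh
# Sketch only: mode_with_umask is an illustrative helper.

# Create an empty file under the umask given as $1 and print its
# permission string as shown by "ls -l" (for example -rw-r--r--).
mode_with_umask() {
    dir=$(mktemp -d) || return 1
    # The subshell keeps the umask change from leaking into the caller.
    ( umask "$1"; : > "$dir/probe" )
    ls -l "$dir/probe" | cut -c1-10
    rm -rf "$dir"
}

mode_with_umask 022
mode_with_umask 077
```

<p>
    With umask 022 the file is readable by everyone
    (<tt>-rw-r--r--</tt>); with umask 077 it is readable only by its owner
    (<tt>-rw-------</tt>), which is the symptom described above.
</p>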
<p>
    <em>On Windows systems</em>
</p>
<pre>
"C:\Program Files\Rsync\rsync" -av --delete cpan-rsync.perl.org::CPAN /project/CPAN/
</pre>
<p>
    Using the 'AT' tool, you can schedule rsync to run daily, for example:<br>
    <tt>AT 20:00 /every:M,T,W,Th,F,S,Su "C:\Program Files\Rsync\rsync -a
    --delete cpan-rsync.perl.org::CPAN /project/CPAN/"</tt>
</p>


<h2 id="Public_mirror">
    How do I create a public mirror?
</h2>
<ul>
    <li>Consider <a href="#Instant_mirroring">Instant mirroring</a> (required
    if you wish to be a tier 1 mirror), or
    </li>
    <li>rsync once a day
    </li>
    <li>Provide (in order of preference) rsync, HTTP and/or FTP public access
    </li>
    <li>To be added to <a href=
    "http://www.cpan.org/SITES.html">http://www.cpan.org/SITES.html</a> and
    <a href="http://www.cpan.org/indices/mirrors.json">mirrors.json</a>, please
    complete the <a href="http://www.cpan.org/MIRRORED.BY">template</a>,
    confirming the publicly accessible URLs of your mirror (rsync, FTP, HTTP),
    and email it to cpan@perl.org.
    </li>
</ul>

<h2>
    <a name="Instant_mirroring" id="Instant_mirroring">Instant mirroring</a>
</h2>
<p>
    "Instant mirroring" keeps your CPAN mirror up-to-date by continuously
    tracking the CPAN master and picking up changes from it a short time
    (minutes) after they occur.
</p>
<p>
    Instant mirroring is used for all tier 1 mirrors (so
    cpan-rsync.perl.org stays in sync across mirrors).
</p>
<p>
    To use "instant mirroring", you need a special client: "rrr-client" or
    "iim".
</p>
<p>
    <a href="https://metacpan.org/pod/distribution/File-Rsync-Mirror-Recent/bin/rrr-client">"rrr-client"</a> is part of the <a href=
    "https://metacpan.org/release/File-Rsync-Mirror-Recent">File::Rsync::Mirror::Recent</a>
    (also known as <code>rrr</code>) package. It is the official client, used
    on the CPAN master to get updates from <a href=
    "http://pause.perl.org/">PAUSE</a>, the true heart and soul of "all things
    perl". See the <a href=
    "https://github.com/perlorg/cpanorg/wiki/Instant-update-mirroring">setup
    guide</a> for more details.
</p>
<p>
    <a href="http://www.staff.science.uu.nl/~penni101/iim">"iim"</a> is an
    alternative to "rrr-client". It does basically the same thing, but it is
    more efficient (on start-up) and has some features that may be helpful to
    CPAN mirror operators.
</p>