CRAN Mirror HOWTO/FAQ

This page explains how to create a new CRAN mirror, which is fairly simple. If you would like to become an official CRAN mirror, please be sure to read and follow these instructions carefully. You should have the consent of your hosting company (if you aren't a hosting company yourself), and be prepared for some reasonably significant bandwidth usage. The full size of CRAN was approx 240 GB on 2020-04-24 (and we are growing all the time).

We currently have no written set of rules for accepting a new mirror into the official list. PHP accepts at most two mirrors per country, but we think there may be a need to treat China differently from, say, Luxembourg. So use common sense and ask yourself whether your mirror helps the R community. We want good global coverage, but also short lists on the mirror web page and in GUIs. In addition, human time is involved in maintaining and monitoring the list. If there is no mirror in your country, yours will usually be accepted. Otherwise, if in doubt, ask first.


Where do I get a copy of CRAN?

The CRAN master site at WU Wien, Austria, can be found at the URLs
https://cran.r-project.org
ftp://cran.r-project.org/pub/R/
rsync: cran.r-project.org:

All you have to do is recursively mirror the complete tree to your webserver on a regular basis (at least twice a week, better every 1-2 days, but not more than twice a day). Which software you use for mirroring depends on the operating system of your server, but we strongly recommend that you use rsync. For security reasons we furthermore recommend mirroring over an SSH tunnel. You may want to call rsync using the following arguments:

 rsync -e "ssh" -rptlzv --delete cran-rsync@cran.r-project.org: /dir/on/local/disc 
or (potentially insecure):
 rsync -rptlzv --delete cran.r-project.org::CRAN /dir/on/local/disc 

For rsync over SSH, please send your public SSH key to cran-sysadmin@r-project.org in advance (only requests from organizations are considered). In either case, do not forget the --delete flag, which removes files from the mirror that are no longer present on the master.
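
A mirror is usually refreshed from cron, and a slow transfer should not overlap with the next run. The following is a minimal sketch of a wrapper around the rsync-over-SSH invocation above, not an official CRAN script. The lock directory, the CRAN_SYNC_EXECUTE switch, the crontab entry, and the function names are assumptions made for illustration. By default the script only prints the command it would run.

```shell
#!/bin/sh
# Sketch of a cron-friendly CRAN sync wrapper (illustrative, not an official script).
# Hypothetical crontab entry, once a day at 03:17:
#   17 3 * * * CRAN_SYNC_EXECUTE=1 /path/to/cran-sync.sh

MIRROR_DIR="${MIRROR_DIR:-/dir/on/local/disc}"  # local mirror root, as in the text above
LOCK_DIR="${LOCK_DIR:-/tmp/cran-sync.lock}"     # assumed lock location
SOURCE="cran-rsync@cran.r-project.org:"         # rsync-over-SSH source from the text above

# Print the rsync command; actually run it only when CRAN_SYNC_EXECUTE=1.
sync_cran() {
    set -- rsync -e ssh -rptlzv --delete "$SOURCE" "$MIRROR_DIR"
    if [ "${CRAN_SYNC_EXECUTE:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Run a command while holding a lock; mkdir is atomic, so two overlapping
# cron runs cannot both acquire it.
run_locked() {
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "another sync appears to be running; skipping" >&2
        return 1
    fi
    "$@"
    status=$?
    rmdir "$LOCK_DIR"
    return "$status"
}

run_locked sync_cran
```

One run per day stays within the update frequency recommended above. Note that a lock directory left behind by a killed run has to be removed by hand before the next sync can start.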

The CRAN tree uses symbolic links, so rsync may not work as expected on a Windows server. It may be necessary to replace -l by -L in the above (this will also be necessary for some partial mirrors, e.g., those excluding the contrib/Archive area).
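
As an illustration of a partial mirror, the sketch below prints an rsync command that uses -L instead of -l and skips the archive area. It assumes the archive lives at src/contrib/Archive below the top of the tree and uses rsync's --exclude option for the filtering; the function name and the destination path are placeholders.

```shell
#!/bin/sh
# Print an rsync command for a partial mirror without the archive area.
# -L copies the files that symbolic links point to, instead of the links.
# The leading slash anchors the exclude pattern at the top of the transfer.
partial_mirror_cmd() {
    dest="${1:-/dir/on/local/disc}"
    echo rsync -rptLzv --delete \
        "--exclude=/src/contrib/Archive/" \
        cran.r-project.org::CRAN "$dest"
}

partial_mirror_cmd
```

Adding rsync's -n (--dry-run) option to the printed command shows what would be transferred without changing anything on disk, which is a cheap way to check the exclude pattern before the first real run.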

It is good practice to plan your file system permissions and user scheme beforehand, so that every synchronization succeeds and the server software (e.g. Apache) can afterwards read all the files it needs. Depending on your server environment, this can be achieved by careful planning, by adjusting permissions or ownership in your rsync script, or by passing additional parameters to rsync itself.
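
One way to adjust permissions in the sync script is sketched below: after each run, directories are set to mode 755 and regular files to 644 so that the web server can read them. The modes and the function name are assumptions; pick whatever fits your user scheme. A similar effect can probably be had from rsync itself with its --chmod option (for example --chmod=Du=rwx,Dgo=rx,Fu=rw,Fgo=r), which is not shown here.

```shell
#!/bin/sh
# Make a mirror tree world-readable: directories 755, regular files 644.
# Symbolic links are left alone (find does not follow them by default).
fix_mirror_perms() {
    root="$1"
    find "$root" -type d -exec chmod 755 {} +
    find "$root" -type f -exec chmod 644 {} +
}

# Self-contained demonstration on a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/src/contrib"
touch "$demo/src/contrib/PACKAGES"
chmod 700 "$demo/src" "$demo/src/contrib"
chmod 600 "$demo/src/contrib/PACKAGES"
fix_mirror_perms "$demo"
ls -l "$demo/src/contrib"
rm -rf "$demo"
```

In a real script the function would be called on the local mirror directory right after rsync finishes.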


Server configuration

CRAN contains no dynamic pages, so in general no special configuration of your web server is needed. However, there are a few additional settings to make and some existing settings to check.


❏ Display the common index files (usually the default setting).


❏ Files with extension .shtml need to be recognised as HTML pages (usually the default setting). The main frame of the CRAN top page, banner.shtml, has this extension, so you will see immediately if something is wrong.


❏ Generate listings of the directories in /src and in /bin.

For Apache servers, this should work automatically if .htaccess files are enabled in the cran directory (which will slow down your server). Alternatively (recommended), add the following to the Apache configuration:

<Directory [your cran directory]/src>
        Options +Indexes
</Directory>

<Directory [your cran directory]/bin>
        Options +Indexes
</Directory>
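
After reloading the server, a quick check is to request /src/ and /bin/ and look at the HTTP status. The sketch below assumes curl is installed and that a directory without listings answers with something other than 200 (Apache typically sends 403); MIRROR_URL and the function names are placeholders. A 200 only shows that the directory is served, so it is still worth looking at the page in a browser once.

```shell
#!/bin/sh
# Fetch only the HTTP status code of a URL.
# curl options: -s silent, -o /dev/null discards the body, -w prints the code.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Check that the two directories that need listings answer with 200.
check_mirror() {
    base="${1%/}"   # drop one trailing slash, if present
    rc=0
    for path in /src/ /bin/; do
        code=$(http_status "$base$path")
        if [ "$code" = 200 ]; then
            echo "ok   $path"
        else
            echo "FAIL $path (HTTP $code)"
            rc=1
        fi
    done
    return "$rc"
}

# Runs only when a mirror URL is supplied, e.g.
#   MIRROR_URL=https://cran.example.org sh check-mirror.sh
if [ -n "${MIRROR_URL:-}" ]; then
    check_mirror "$MIRROR_URL"
fi
```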


Optional server configuration

If you would like to promote the hosting institution of the mirror, you can use the environment variable CRAN_HOST.

In that case, you would need to enable server side includes (without execution).

If you have an Apache 2.4+ server, here is what you would need to include in your configuration.

  SetEnv CRAN_HOST "This server is hosted by your organization ..."

The string "This server ..." (which may contain HTML markup) will be added to the footer of the CRAN top page; see the main server for an example.

You would additionally need

  Options +IncludesNOEXEC

in the corresponding <Directory> section, as well as

  #
  # To use server-parsed HTML files
  #
  AddType text/html .shtml
  <IfModule mod_include.c>
    AddOutputFilter INCLUDES .shtml
  </IfModule>

in the MIME-types section of the Apache configuration. The exact syntax depends on the version of Apache. All you have to do is uncomment these (or similar) lines in the default configuration.


Inform us!

Once your mirror is up and running and the automatic updates have worked for a couple of days, send an email to cran@r-project.org so that we can include your site in the list of mirrors. Please include the following information in your email:

Thanks in advance for providing webspace for the R Project!


Last modified: April 24, 2020 by Gennadiy Starostin