/doc/source/install.rst

https://github.com/hoffstein/pandas
.. _install:

.. currentmodule:: pandas

============
Installation
============

The easiest way for most users to install pandas is as part of the
`Anaconda <http://docs.continuum.io/anaconda/>`__ distribution, a
cross-platform distribution for data analysis and scientific computing.
This is the recommended installation method for most users.

Instructions for installing from source,
`PyPI <http://pypi.python.org/pypi/pandas>`__, various Linux distributions, or a
`development version <http://github.com/pydata/pandas>`__ are also provided.

Python version support
----------------------

Officially Python 2.7, 3.4, and 3.5.
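
As a rough illustration, a script could check at start-up whether the running
interpreter is one of the versions listed above. This is only a sketch: the
helper name ``is_supported_python`` and the ``SUPPORTED_VERSIONS`` constant are
hypothetical and are not part of pandas.

.. code-block:: python

    import sys

    # (major, minor) pairs listed as officially supported in this document.
    SUPPORTED_VERSIONS = {(2, 7), (3, 4), (3, 5)}


    def is_supported_python(version_info=None):
        """Return True if (major, minor) is in the officially supported set."""
        if version_info is None:
            version_info = sys.version_info
        return tuple(version_info[:2]) in SUPPORTED_VERSIONS


    print(is_supported_python((2, 7, 11)))  # True
    print(is_supported_python((3, 3, 0)))   # False

Calling ``is_supported_python()`` with no argument checks the interpreter that
is currently running.
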

Installing pandas
-----------------

Trying out pandas, no installation required!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The easiest way to start experimenting with pandas doesn't involve installing
pandas at all.

`Wakari <https://wakari.io>`__ is a free service that provides a hosted
`IPython Notebook <http://ipython.org/notebook.html>`__ service in the cloud.

Simply create an account, and within a few minutes you will have access to
pandas from your browser via an
`IPython Notebook <http://ipython.org/notebook.html>`__.

.. _install.anaconda:

Installing pandas with Anaconda
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Installing pandas and the rest of the `NumPy <http://www.numpy.org/>`__ and
`SciPy <http://www.scipy.org/>`__ stack can be a little
difficult for inexperienced users.

The simplest way to install not only pandas, but Python and the most popular
packages that make up the `SciPy <http://www.scipy.org/>`__ stack
(`IPython <http://ipython.org/>`__, `NumPy <http://www.numpy.org/>`__,
`Matplotlib <http://matplotlib.org/>`__, ...) is with
`Anaconda <http://docs.continuum.io/anaconda/>`__, a cross-platform
(Linux, Mac OS X, Windows) Python distribution for data analytics and
scientific computing.

After running a simple installer, the user will have access to pandas and the
rest of the `SciPy <http://www.scipy.org/>`__ stack without needing to install
anything else, and without needing to wait for any software to be compiled.

Installation instructions for `Anaconda <http://docs.continuum.io/anaconda/>`__
`can be found here <http://docs.continuum.io/anaconda/install.html>`__.

A full list of the packages available as part of the
`Anaconda <http://docs.continuum.io/anaconda/>`__ distribution
`can be found here <http://docs.continuum.io/anaconda/pkg-docs.html>`__.

An additional advantage of installing with Anaconda is that you don't need
admin rights: it installs into the user's home directory, which also makes it
trivial to delete Anaconda at a later date (just delete that folder).

.. _install.miniconda:

Installing pandas with Miniconda
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The previous section outlined how to get pandas installed as part of the
`Anaconda <http://docs.continuum.io/anaconda/>`__ distribution.
However, this approach means you will install well over one hundred packages,
and it involves downloading an installer that is a few hundred megabytes in size.

If you want more control over which packages are installed, or have limited
internet bandwidth, then installing pandas with
`Miniconda <http://conda.pydata.org/miniconda.html>`__ may be a better solution.

`Conda <http://conda.pydata.org/docs/>`__ is the package manager that the
`Anaconda <http://docs.continuum.io/anaconda/>`__ distribution is built upon.
It is a package manager that is both cross-platform and language agnostic
(it can play a similar role to a pip and virtualenv combination).

`Miniconda <http://conda.pydata.org/miniconda.html>`__ allows you to create a
minimal, self-contained Python installation, and then use the
`Conda <http://conda.pydata.org/docs/>`__ command to install additional packages.

First you will need `Conda <http://conda.pydata.org/docs/>`__ to be installed;
downloading and running the `Miniconda
<http://conda.pydata.org/miniconda.html>`__ installer
will do this for you. The installer
`can be found here <http://conda.pydata.org/miniconda.html>`__.

The next step is to create a new conda environment (these are analogous to a
virtualenv, but they also allow you to specify precisely which Python version
to install). Run the following command from a terminal window::

    conda create -n name_of_my_env python

This will create a minimal environment with only Python installed in it.
To put yourself inside this environment run::

    source activate name_of_my_env

On Windows the command is::

    activate name_of_my_env

The final step required is to install pandas. This can be done with the
following command::

    conda install pandas

To install a specific pandas version::

    conda install pandas=0.13.1

To install other packages, IPython for example::

    conda install ipython

To install the full `Anaconda <http://docs.continuum.io/anaconda/>`__
distribution::

    conda install anaconda

If you require any packages that are available to pip but not conda, simply
install pip, and use pip to install these packages::

    conda install pip
    pip install django
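
Once the environment is set up, one way to confirm that pandas can be imported
from it is to ask Python for the installed version. The following is a minimal
sketch; ``installed_version`` is a hypothetical helper written for this
example, not a pandas or conda function, and it reports ``None`` when a module
cannot be imported.

.. code-block:: python

    import importlib


    def installed_version(module_name):
        """Return the module's __version__, or None if it cannot be imported."""
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            return None
        # Not every module defines __version__, so fall back to a placeholder.
        return getattr(module, "__version__", "unknown")


    for name in ("pandas", "numpy"):
        version = installed_version(name)
        print("{0}: {1}".format(name, version or "not installed"))

Run this with the environment activated, so that the interpreter being queried
is the one conda installed.
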

Installing from PyPI
~~~~~~~~~~~~~~~~~~~~

pandas can be installed via pip from
`PyPI <http://pypi.python.org/pypi/pandas>`__.

::

    pip install pandas

This will likely require the installation of a number of dependencies,
including NumPy, will require a compiler to compile required bits of code,
and can take a few minutes to complete.

Installing using your Linux distribution's package manager
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The commands in this table will install pandas for Python 2 from your distribution.
To install pandas for Python 3 you may need to use the package ``python3-pandas``.

.. csv-table::
    :header: "Distribution", "Status", "Download / Repository Link", "Install method"
    :widths: 10, 10, 20, 50

    Debian, stable, `official Debian repository <http://packages.debian.org/search?keywords=pandas&searchon=names&suite=all&section=all>`__ , ``sudo apt-get install python-pandas``
    Debian & Ubuntu, unstable (latest packages), `NeuroDebian <http://neuro.debian.net/index.html#how-to-use-this-repository>`__ , ``sudo apt-get install python-pandas``
    Ubuntu, stable, `official Ubuntu repository <http://packages.ubuntu.com/search?keywords=pandas&searchon=names&suite=all&section=all>`__ , ``sudo apt-get install python-pandas``
    Ubuntu, unstable (daily builds), `PythonXY PPA <https://code.launchpad.net/~pythonxy/+archive/pythonxy-devel>`__; activate by: ``sudo add-apt-repository ppa:pythonxy/pythonxy-devel && sudo apt-get update``, ``sudo apt-get install python-pandas``
    OpenSuse & Fedora, stable, `OpenSuse Repository <http://software.opensuse.org/package/python-pandas?search_term=pandas>`__ , ``zypper in python-pandas``

Installing from source
~~~~~~~~~~~~~~~~~~~~~~

See the :ref:`contributing documentation <contributing>` for complete instructions on building from the git source tree. Further, see :ref:`creating a development environment <contributing.dev_env>` if you wish to create a *pandas* development environment.

Running the test suite
~~~~~~~~~~~~~~~~~~~~~~

pandas is equipped with an exhaustive set of unit tests covering about 97% of
the codebase as of this writing. To run it on your machine to verify that
everything is working (and you have all of the dependencies, soft and hard,
installed), make sure you have `nose
<http://readthedocs.org/docs/nose/en/latest/>`__ and run:

::

    >>> import pandas as pd
    >>> pd.test()
    Running unit tests for pandas
    pandas version 0.18.0
    numpy version 1.10.2
    pandas is installed in pandas
    Python version 2.7.11 |Continuum Analytics, Inc.|
       (default, Dec 6 2015, 18:57:58) [GCC 4.2.1 (Apple Inc. build 5577)]
    nose version 1.3.7
    ..................................................................S......
    ........S................................................................
    .........................................................................

    ----------------------------------------------------------------------
    Ran 9252 tests in 368.339s

    OK (SKIP=117)

Dependencies
------------

* `setuptools <http://pythonhosted.org/setuptools>`__
* `NumPy <http://www.numpy.org>`__: 1.7.1 or higher
* `python-dateutil <http://labix.org/python-dateutil>`__: 1.5 or higher
* `pytz <http://pytz.sourceforge.net/>`__: Needed for time zone support

.. _install.recommended_dependencies:

Recommended Dependencies
~~~~~~~~~~~~~~~~~~~~~~~~

* `numexpr <https://github.com/pydata/numexpr>`__: for accelerating certain numerical operations.
  ``numexpr`` uses multiple cores as well as smart chunking and caching to achieve large speedups.
  If installed, it must be version 2.1 or higher (excluding the buggy 2.4.4). Version 2.4.6 or higher is highly recommended.

* `bottleneck <http://berkeleyanalytics.com/bottleneck>`__: for accelerating certain types of ``nan``
  evaluations. ``bottleneck`` uses specialized cython routines to achieve large speedups.

.. note::

   You are highly encouraged to install these libraries, as they provide large speedups, especially
   when working with large data sets.
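
The ``numexpr`` version rule above (2.1 or higher, excluding 2.4.4) can be
expressed as a small check. This is a sketch only: ``parse_version`` and
``numexpr_version_ok`` are hypothetical helpers written for this example, not
the check pandas itself performs, and the parser assumes plain dotted numeric
versions such as ``2.4.6`` (no ``rc`` or ``dev`` suffixes).

.. code-block:: python

    def parse_version(version):
        """Turn a plain dotted version such as '2.4.6' into a tuple of ints."""
        return tuple(int(part) for part in version.split("."))


    def numexpr_version_ok(version):
        """Apply the rule stated above: 2.1 or higher, excluding 2.4.4."""
        parsed = parse_version(version)
        return parsed >= (2, 1) and parsed != (2, 4, 4)


    print(numexpr_version_ok("2.4.6"))  # True
    print(numexpr_version_ok("2.4.4"))  # False
    print(numexpr_version_ok("2.0.1"))  # False

Comparing tuples of integers avoids the pitfall of comparing version strings
directly, where ``"2.10"`` would sort before ``"2.9"``.
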

.. _install.optional_dependencies:

Optional Dependencies
~~~~~~~~~~~~~~~~~~~~~

* `Cython <http://www.cython.org>`__: Only necessary to build development
  version. Version 0.19.1 or higher.
* `SciPy <http://www.scipy.org>`__: miscellaneous statistical functions
* `xarray <http://xarray.pydata.org>`__: pandas-like handling for > 2 dims, needed for converting Panels to xarray objects. Version 0.7.0 or higher is recommended.
* `PyTables <http://www.pytables.org>`__: necessary for HDF5-based storage. Version 3.0.0 or higher required, Version 3.2.1 or higher highly recommended.
* `SQLAlchemy <http://www.sqlalchemy.org>`__: for SQL database support. Version 0.8.1 or higher recommended. Besides SQLAlchemy, you also need a database-specific driver. You can find an overview of supported drivers for each SQL dialect in the `SQLAlchemy docs <http://docs.sqlalchemy.org/en/latest/dialects/index.html>`__. Some common drivers are:

  - `psycopg2 <http://initd.org/psycopg/>`__: for PostgreSQL
  - `pymysql <https://github.com/PyMySQL/PyMySQL>`__: for MySQL.
  - `SQLite <https://docs.python.org/3.5/library/sqlite3.html>`__: for SQLite, this is included in Python's standard library by default.

* `matplotlib <http://matplotlib.sourceforge.net/>`__: for plotting
* `openpyxl <http://packages.python.org/openpyxl/>`__, `xlrd/xlwt <http://www.python-excel.org/>`__: Needed for Excel I/O
* `XlsxWriter <https://pypi.python.org/pypi/XlsxWriter>`__: Alternative Excel writer
* `Jinja2 <http://jinja.pocoo.org/>`__: Template engine for conditional HTML formatting.
* `boto <https://pypi.python.org/pypi/boto>`__: necessary for Amazon S3
  access.
* `blosc <https://pypi.python.org/pypi/blosc>`__: for msgpack compression using ``blosc``
* One of `PyQt4
  <http://www.riverbankcomputing.com/software/pyqt/download>`__, `PySide
  <http://qt-project.org/wiki/Category:LanguageBindings::PySide>`__, `pygtk
  <http://www.pygtk.org/>`__, `xsel
  <http://www.vergenet.net/~conrad/software/xsel/>`__, or `xclip
  <http://sourceforge.net/projects/xclip/>`__: necessary to use
  :func:`~pandas.io.clipboard.read_clipboard`. Most package managers on Linux distributions will have ``xclip`` and/or ``xsel`` immediately available for installation.
* Google's `python-gflags <http://code.google.com/p/python-gflags/>`__ ,
  `oauth2client <https://github.com/google/oauth2client>`__ ,
  `httplib2 <http://pypi.python.org/pypi/httplib2>`__
  and `google-api-python-client <http://github.com/google/google-api-python-client>`__
  : Needed for :mod:`~pandas.io.gbq`
* `Backports.lzma <https://pypi.python.org/pypi/backports.lzma/>`__: Only for Python 2, for writing to and/or reading from an xz compressed DataFrame in CSV; Python 3 support is built into the standard library.
* One of the following combinations of libraries is needed to use the
  top-level :func:`~pandas.io.html.read_html` function:

  * `BeautifulSoup4`_ and `html5lib`_ (Any recent version of `html5lib`_ is
    okay.)
  * `BeautifulSoup4`_ and `lxml`_
  * `BeautifulSoup4`_ and `html5lib`_ and `lxml`_
  * Only `lxml`_, although see :ref:`HTML reading gotchas <html-gotchas>`
    for reasons as to why you should probably **not** take this approach.

  .. warning::

     * If you install `BeautifulSoup4`_ you must install either
       `lxml`_ or `html5lib`_ or both.
       :func:`~pandas.io.html.read_html` will **not** work with *only*
       `BeautifulSoup4`_ installed.
     * You are highly encouraged to read :ref:`HTML reading gotchas
       <html-gotchas>`. It explains issues surrounding the installation and
       usage of the above three libraries.
     * You may need to install an older version of `BeautifulSoup4`_:

       - Versions 4.2.1, 4.1.3 and 4.0.2 have been confirmed for 64 and
         32-bit Ubuntu/Debian

     * Additionally, if you're using `Anaconda`_ you should definitely
       read :ref:`the gotchas about HTML parsing libraries <html-gotchas>`.

  .. note::

     * If you're on a system with ``apt-get`` you can do

       .. code-block:: sh

          sudo apt-get build-dep python-lxml

       to get the necessary dependencies for installation of `lxml`_. This
       will prevent further headaches down the line.


.. _html5lib: https://github.com/html5lib/html5lib-python
.. _BeautifulSoup4: http://www.crummy.com/software/BeautifulSoup
.. _lxml: http://lxml.de
.. _Anaconda: https://store.continuum.io/cshop/anaconda
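
The parser combinations listed for :func:`~pandas.io.html.read_html` reduce to
a simple rule: ``lxml`` on its own is enough, while BeautifulSoup4 needs
``html5lib`` or ``lxml`` alongside it. The sketch below encodes that rule.
Both function names are hypothetical and are not part of pandas; the module
names ``bs4``, ``html5lib`` and ``lxml`` are the import names of the three
libraries.

.. code-block:: python

    import importlib


    def can_use_read_html(installed):
        """Return True if the installed parsers form a usable combination.

        ``installed`` is a set of names drawn from 'bs4', 'html5lib', 'lxml'.
        """
        has_bs4 = "bs4" in installed
        has_html5lib = "html5lib" in installed
        has_lxml = "lxml" in installed
        # lxml alone suffices; bs4 without lxml must be paired with html5lib.
        return has_lxml or (has_bs4 and has_html5lib)


    def detect_parsers():
        """Return the subset of the three parser libraries that can be imported."""
        found = set()
        for name in ("bs4", "html5lib", "lxml"):
            try:
                importlib.import_module(name)
            except ImportError:
                continue
            found.add(name)
        return found


    print(can_use_read_html({"bs4"}))              # False
    print(can_use_read_html({"bs4", "html5lib"}))  # True

Calling ``can_use_read_html(detect_parsers())`` applies the rule to the
libraries available in the current environment. It only checks presence, not
the version caveats described in the warning above.
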

.. note::

   Without the optional dependencies, many useful features will not
   work. Hence, it is highly recommended that you install these. A packaged
   distribution like `Anaconda <http://docs.continuum.io/anaconda/>`__, or `Enthought Canopy
   <http://enthought.com/products/canopy>`__ may be worth considering.