Python Data Science - Environment Setup
To create and run the example code in this tutorial, we need an environment that has both general-purpose Python and the special packages required for data science. We will first look at installing general-purpose Python, which can be Python 2 or Python 3. We prefer Python 2 for this tutorial, mainly because of its maturity and wider support of external packages.
Getting Python
The most up-to-date source code, binaries, documentation, news, etc., are available on the official website of Python.
The Python documentation can also be downloaded from the official website. It is available in HTML, PDF, and PostScript formats.
Installing Python
Python distributions are available for a wide variety of platforms. You need to download only the binary code applicable for your platform and install Python.
If the binary code for your platform is not available, you need a C compiler to compile the source code manually. Compiling the source code offers more flexibility in terms of choice of features that you require in your installation.
Here is a quick overview of installing Python on various platforms −
Unix and Linux Installation
Here are the simple steps to install Python on a Unix/Linux machine.
Open a Web browser and go to the Python download page.
Follow the link to download the zipped source code available for Unix/Linux.
Download and extract the files.
Edit the Modules/Setup file if you want to customize some options.
Run the ./configure script.
Run make.
Run make install.
This installs Python at the standard location /usr/local/bin and its libraries at /usr/local/lib/pythonXX, where XX is the version of Python.
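Once an installation finishes, it can help to ask Python itself which interpreter is actually being run. The following is a minimal sketch using only the standard `sys` module; the helper name `describe_interpreter` is made up for illustration and is not part of Python.

```python
import sys

def describe_interpreter():
    """Return basic facts about the running interpreter (hypothetical helper)."""
    return {
        "version": "%d.%d.%d" % sys.version_info[:3],
        "executable": sys.executable,  # full path of the python binary in use
        "prefix": sys.prefix,          # root directory of this installation
    }

info = describe_interpreter()
for key in ("version", "executable", "prefix"):
    print("%-10s %s" % (key, info[key]))
```

If the reported executable is not the one you just installed, another Python earlier on the search path is probably being picked up.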
Windows Installation
Here are the steps to install Python on a Windows machine.
Open a Web browser and go to the Python download page.
Follow the link for the Windows installer python-XYZ.msi file, where XYZ is the version you need to install.
To use this installer python-XYZ.msi, the Windows system must support Microsoft Installer 2.0. Save the installer file to your local machine and then run it to find out if your machine supports MSI.
Run the downloaded file. This brings up the Python install wizard, which is really easy to use. Just accept the default settings, wait until the install is finished, and you are done.
Macintosh Installation
Recent Macs come with Python installed, but it may be several years out of date. See the official Python website for instructions on getting the current version along with extra tools to support development on the Mac. For older versions of Mac OS before Mac OS X 10.3 (released in 2003), MacPython is available.
Jack Jansen maintains it, and his website gives full access to the documentation, including complete installation details for Mac OS.
Setting up PATH
Programs and other executable files can be in many directories, so operating systems provide a search path that lists the directories that the OS searches for executables.
The path is stored in an environment variable, which is a named string maintained by the operating system. This variable contains information available to the command shell and other programs.
The path variable is named as PATH in Unix or Path in Windows (Unix is case sensitive; Windows is not).
In Mac OS, the installer handles the path details. To invoke the Python interpreter from any particular directory, you must add the Python directory to your path.
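The search path described above can also be inspected from inside Python. The sketch below splits the PATH variable into its directories and does a simplified lookup for a program name. The helper names `path_directories` and `find_on_path` are hypothetical, and the lookup is deliberately naive: it ignores Windows details such as the `.exe` suffix, so treat it as an illustration of the idea rather than a replacement for the shell's own search.

```python
import os

def path_directories(environ=None):
    """Split the PATH variable into its individual directories."""
    environ = os.environ if environ is None else environ
    raw = environ.get("PATH", "")
    # os.pathsep is ':' on Unix and ';' on Windows
    return [d for d in raw.split(os.pathsep) if d]

def find_on_path(program, directories):
    """Return the first executable file named `program`, or None."""
    for directory in directories:
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

dirs = path_directories()
print("PATH has %d entries" % len(dirs))
print("python found at: %s" % find_on_path("python", dirs))
```

A result of `None` suggests that no directory on the path contains an executable with that exact name, which is the situation the following sections address.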
Setting path at Unix/Linux
To add the Python directory to the path for a particular session in Unix −
In the csh shell − type setenv PATH "$PATH:/usr/local/bin/python" and press Enter.
In the bash shell (Linux) − type export PATH="$PATH:/usr/local/bin/python" and press Enter.
In the sh or ksh shell − type PATH="$PATH:/usr/local/bin/python" and press Enter.
Note − /usr/local/bin/python is the path of the Python directory
Setting path at Windows
To add the Python directory to the path for a particular session in Windows −
At the command prompt − type path %path%;C:\Python and press Enter.
Note − C:\Python is the path of the Python directory
Python Environment Variables
Here are important environment variables, which can be recognized by Python −
Sr.No. | Variable & Description |
---|---|
1 | PYTHONPATH It has a role similar to PATH. This variable tells the Python interpreter where to locate the module files imported into a program. It should include the Python source library directory and the directories containing Python source code. PYTHONPATH is sometimes preset by the Python installer. |
2 | PYTHONSTARTUP It contains the path of an initialization file containing Python source code. It is executed every time you start the interpreter. It is named as .pythonrc.py in Unix and it contains commands that load utilities or modify PYTHONPATH. |
3 | PYTHONCASEOK It is used in Windows to instruct Python to find the first case-insensitive match in an import statement. Set this variable to any value to activate it. |
4 | PYTHONHOME It is an alternative module search path. It is usually embedded in the PYTHONSTARTUP or PYTHONPATH directories to make switching module libraries easy. |
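To see the effect of PYTHONPATH in practice, one option is to start a child interpreter with the variable set and look at its `sys.path`, the list Python searches when importing modules. This is a rough sketch: the helper name `child_sys_path` is hypothetical, and the temporary directory merely stands in for a directory holding your own modules.

```python
import os
import subprocess
import sys
import tempfile

def child_sys_path(extra_dir):
    """Start a child interpreter with PYTHONPATH set and return its sys.path."""
    env = dict(os.environ)
    env["PYTHONPATH"] = extra_dir
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env,
    )
    return out.decode().splitlines()

module_dir = tempfile.mkdtemp()  # stand-in for a directory of your own modules
paths = child_sys_path(module_dir)
print("directory is on the module search path: %s" % (module_dir in paths))
```

Because the variable is set only in the environment passed to the child process, the parent interpreter's own search path is left unchanged.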
Running Python
There are three different ways to start Python −
Interactive Interpreter
You can start Python from Unix, DOS, or any other system that provides you a command-line interpreter or shell window.
Enter python at the command line.
Start coding right away in the interactive interpreter.
$python # Unix/Linux
or
python% # Unix/Linux
or
C:> python # Windows/DOS
Here is the list of all the available command line options −
Sr.No. | Option & Description |
---|---|
1 | -d It provides debug output. |
2 | -O It generates optimized bytecode (resulting in .pyo files). |
3 | -S Do not run import site to look for Python paths on startup. |
4 | -v verbose output (detailed trace on import statements). |
5 | -X disable class-based built-in exceptions (just use strings); obsolete starting with version 1.6. |
6 | -c cmd run Python script sent in as cmd string |
7 | file run Python script from given file |
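Two of the options in the table can be tried without leaving Python by launching a second interpreter through the standard `subprocess` module. The sketch below assumes that `sys.executable` points at a working interpreter; `run_python` is a made-up helper name.

```python
import subprocess
import sys

def run_python(*args):
    """Run the current interpreter with the given arguments; return its stdout."""
    out = subprocess.check_output([sys.executable] + list(args))
    return out.decode().strip()

# -c cmd: execute the code passed in as a string
print(run_python("-c", "print(2 + 3)"))

# -O: optimized mode, which sets the built-in __debug__ flag to False
print(run_python("-O", "-c", "print(__debug__)"))
```

The first call prints 5 and the second prints False, showing that -O changes how the child interpreter runs the same kind of command.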
Script from the Command-line
A Python script can be executed at the command line by invoking the interpreter on your application, as in the following −
$python script.py # Unix/Linux
or
python% script.py # Unix/Linux
or
C: >python script.py # Windows/DOS
Note − Be sure the file permission mode allows execution.
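The same invocation can be reproduced programmatically, which is handy for checking that a script runs and receives its arguments. The sketch below writes a tiny script.py to a temporary folder and runs it with the current interpreter. The script contents and the helper name `run_script` are illustrative assumptions, not part of the tutorial's code.

```python
import os
import subprocess
import sys
import tempfile

# A throwaway script that echoes its command-line arguments
SCRIPT_SOURCE = "import sys\nprint('args: ' + ' '.join(sys.argv[1:]))\n"

def run_script(source, *args):
    """Write source to a temporary script.py and run it with the interpreter."""
    folder = tempfile.mkdtemp()
    script = os.path.join(folder, "script.py")
    with open(script, "w") as handle:
        handle.write(source)
    out = subprocess.check_output([sys.executable, script] + list(args))
    return out.decode().strip()

print(run_script(SCRIPT_SOURCE, "one", "two"))
```

Because the interpreter is invoked explicitly here, the script file only needs to be readable; the execute permission matters when a script is launched directly by name.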
Integrated Development Environment
You can run Python from a Graphical User Interface (GUI) environment as well, if you have a GUI application on your system that supports Python.
Unix − IDLE is the very first Unix IDE for Python.
Windows − PythonWin is the first Windows interface for Python and is an IDE with a GUI.
Macintosh − The Macintosh version of Python along with the IDLE IDE is available from the main website, downloadable as either MacBinary or BinHex'd files.
Installing SciPy Pack
The best way to enable the required packages is to use an installable binary package specific to your operating system. These binaries contain the full SciPy stack (inclusive of NumPy, SciPy, matplotlib, IPython, SymPy and nose packages along with core Python).
Windows
Anaconda is a free Python distribution for the SciPy stack. It is also available for Linux and Mac.
Canopy is available as a free as well as a commercial distribution with the full SciPy stack for Windows, Linux and Mac.
Python (x,y) is a free Python distribution with the SciPy stack and Spyder IDE for Windows OS.
Linux
Package managers of respective Linux distributions are used to install one or more packages in SciPy stack.
For Ubuntu
sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose
For Fedora
sudo yum install numpy scipy python-matplotlib ipython python-pandas sympy python-nose atlas-devel
Building from Source
Core Python (2.6.x, 2.7.x and 3.2.x onwards) must be installed with distutils, and the zlib module should be enabled.
GNU gcc (4.2 and above) C compiler must be available.
To install NumPy, run the following command.
python setup.py install
To test whether the NumPy module is properly installed, try to import it from the Python prompt with import numpy.
If it is not installed, the following error message will be displayed.
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    import numpy
ImportError: No module named numpy
Similarly we can check for the installation of all the required Data Science packages shown in the next chapters.
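The import check can be automated for several packages at once. The following is a minimal sketch: the package list is an assumption based on the SciPy stack named in this chapter, and `check_packages` is a hypothetical helper, so adjust both to whatever you actually need.

```python
import importlib

# Assumed list of module names; edit to match the packages you rely on
PACKAGES = ["numpy", "scipy", "matplotlib", "pandas", "sympy"]

def check_packages(names):
    """Map each module name to its version string, or None if the import fails."""
    status = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            status[name] = None
        else:
            status[name] = getattr(module, "__version__", "unknown")
    return status

for name, version in sorted(check_packages(PACKAGES).items()):
    print("%-12s %s" % (name, version if version else "NOT INSTALLED"))
```

Any package reported as NOT INSTALLED would raise the ImportError shown earlier if imported directly.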