OpenSource For You

Conda: The Soul of Anaconda

Conda, which is included in Anaconda and Miniconda, is an open source package management system and environment management system for installing multiple versions of software packages and their dependencies, and switching easily between them. It is multip

By: Sharon Sunny. The author is an assistant professor at Amaljyothi College of Engineering, Kerala. She can be reached at ssharon099@gmail.com.

Remember ‘Anaconda’, the horror movie with the tag line “You can’t scream if you can’t breathe”? Well, a few years ago, Python had an encounter with Anaconda, and today it acts as the backbone of Anaconda. Confused? I am referring to Anaconda, the Python distribution that provides both a package manager and an environment manager. But before we talk more about this Anaconda, here’s a brief introduction to package managers.

Package management

Some applications cannot stand alone; they need the support of other applications to work. The applications that must be installed for an application to work properly are its dependencies. For example, IPython needs python-decorator and python-simplegeneric to be installed on a system to work properly.

You can install packages either manually or by using a package manager. If you install a package manually, that package alone will be installed, and its dependencies must be installed separately. As the number of dependencies increases, it becomes difficult to install all the packages manually. A package manager deals with this problem. According to Wikipedia, a package manager is a collection of software tools that automates the installation, upgrading, configuration and removal of software in a consistent manner. Thus, a package manager resolves all the dependencies of a given software.

The following are some of the package managers available for Linux distributions.

1. dpkg: A low-level package management tool, it installs packages that come in the .deb format from local files. A .deb file declares the dependencies of its package but does not contain them, and dpkg does not fetch them; it reports an error if they are not already installed. The command…

sudo dpkg -i <package-file>.deb

…can be used to install a package.

2. apt-get: A more advanced package management system, it uses topological sorting to order the dependencies of the packages to be installed, and calls dpkg at the appropriate times to install or remove packages. On Ubuntu, it uses the Ubuntu and Canonical archives as repositories. The command…

sudo apt-get install <package-name>

…will install the package. sudo is used since package installation requires administrative rights.

3. Aptitude: This provides a text-based, menu-driven front end to APT, and can also be used from the command line.

4. pip: pip stands for Pip Installs Packages. It is a Python package manager. Any Python package available in the PyPI repository can be installed using pip. The command for installing a package is:

pip install <package-name>

Though package managers like pip can deal with almost all Python packages, they cannot install the non-Python packages that our package of interest may depend on.
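The topological sorting mentioned for apt-get above can be illustrated with a small sketch. It assumes a toy dependency graph built around the IPython example (the graph itself is hypothetical, not real repository metadata) and uses the graphlib module from the Python 3.9+ standard library. Real dependency resolution also handles versions and conflicts, which this sketch ignores.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency graph: each package maps to the packages it needs.
DEPENDENCIES = {
    "ipython": {"python-decorator", "python-simplegeneric"},
    "python-decorator": {"python"},
    "python-simplegeneric": {"python"},
    "python": set(),
}


def install_order(dependencies):
    """Return an order in which every package comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for package in install_order(DEPENDENCIES):
        print("install", package)
```

Any order the function returns places python first and ipython last; the two middle packages may appear in either order because neither depends on the other.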

The list of dependencies installed for a package may differ from one package manager to another. This is because different package managers refer to different repositories, and each repository packages the software and declares its dependencies in its own way. A package manager installs the requested package along with all the dependencies listed for it in the repository it refers to.

Figures 1, 2 and 3 show what happened when I installed IPython using apt-get, conda and pip, respectively. We can see that the list of dependencies is different in each case. Just remember that the more packages a program pulls in, the more disk space it may occupy.

Anaconda and Miniconda

A distribution is a collection of pre-compiled and preconfigured packages that work together. Anaconda and Miniconda are free Python distributions that provide both an environment manager and a package manager. They are helpful for deep learning and data science applications, but are not limited to these domains. Along with the package manager Conda, Anaconda automatically installs Conda-build, Python and over 150 packages. Miniconda includes only Conda, Python and their dependencies.

The power of Anaconda can be explained with the help of the following example. If you are into deep learning, you may need many packages. Applications using ResNet may require packages like pillow, Keras and Theano to work properly. With Anaconda, a large set of commonly used packages is installed in one go; with Miniconda, you install packages as and when they are required.

According to Continuum Analytics, Conda is an open source environment management system and package management system for installing multiple versions of packages and their dependencies. It was first developed in 2012.

Conda installation

Conda can be installed along with Anaconda or Miniconda. You can also purchase an Anaconda subscription to install Conda. The choice between Anaconda and Miniconda depends on the time and disk space available. If you have at least 3GB of disk space to spare and need all the options of Conda, download Anaconda. If you are low on disk space and just need a start with Conda, Miniconda, a minimal version of Anaconda, will be the best option, as it takes only 400MB of space. You can download a 32-bit or 64-bit installer. Both distributions are available in Python 2 based and Python 3 based versions. Installation steps for different operating systems are given below.

Installation in Windows: Download the exe installer for Anaconda or Miniconda and run it. To use Conda from a terminal afterwards, click the Start button, choose Run, and open the command prompt (cmd).

Installation in Linux: Download the bash installer for Anaconda or Miniconda. Type the following command in a terminal to install 64-bit Python 3 based Miniconda:

bash Miniconda3-latest-Linux-x86_64.sh

…where Miniconda3-latest-Linux-x86_64.sh is the name of the file you have downloaded.

Once the installation is completed, close the terminal and open a new one before using Conda. This makes sure that the changes the installer made to your shell configuration take effect.

Installation in macOS: Anaconda provides a command-line installer and a GUI installer for macOS users. If you choose the GUI installer, double click on the downloaded .pkg file and follow the instructions to install it on your system. The GUI installer may take more time, so if you are comfortable with the command-line installer, go for it.

If you download the command-line installer, follow the same procedure as for installation in Linux. Remember that even if you are not using the bash shell, you must run the installer with the bash command.

Miniconda installation is the same as the command-line installation of Anaconda.

Conda without Anaconda or Miniconda: Conda can also be installed using pip, with the following command:

pip install conda

This command will install Conda without Anaconda or Miniconda. This method can be adopted easily in Linux. Installing pip separately can be difficult in Windows, but pip comes bundled with Python 2.7.9 and above. Note that the Anaconda and Miniconda installers remain the usual route; a pip-installed Conda may not behave like a full installation.

To update Conda, type the following command in the terminal:

conda update conda

In Windows, Conda can be uninstalled by following the steps given below:

1. Go to Control Panel.

2. Select Add or Remove Programs.

3. Select Python 3.4 (Miniconda) and uninstall it.

In Linux and macOS, use the following command to delete the Miniconda install directory (assumed here to be ~/miniconda; adjust the path if you installed it elsewhere):

rm -rf ~/miniconda

The Miniconda install directory will now be deleted, but your shell configuration still refers to it. To remove Miniconda completely from the system, edit ~/.bash_profile and remove the Miniconda directory from the PATH variable. You will then no longer have access to Conda or its packages.

To verify the installation of Conda, type:

conda list

This command will display the installed packages in the terminal if the Conda installation is successful. Otherwise, an error is shown; on Windows, for example, the message reads ‘conda is not recognised as an internal or external command, operable program or batch file’.
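The same check can be made from a script. The following is a minimal sketch that only tests whether an executable named conda can be found on the PATH, using shutil.which from the Python standard library; the function name conda_status is a hypothetical helper, not part of Conda.

```python
import shutil


def conda_status(executable="conda"):
    """Describe whether the given executable can be found on PATH."""
    location = shutil.which(executable)
    if location is None:
        return f"{executable} not found on PATH"
    return f"{executable} found at {location}"


if __name__ == "__main__":
    print(conda_status())
```

Finding the executable does not prove that the installation works; running conda list, as described above, remains the real test.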

Why do we need Conda?

Continuum Analytics developed Conda to support data analysis and scientific computing applications. Scientific applications handle huge amounts of data, and a variety of packages may be needed to process such large volumes. Conda uses the rich repository of Anaconda, which contains almost all the packages needed for scientific programming. It is an alternative to the package managers mentioned above.

Another important use of Conda is in creating a virtual environment. You could use VirtualBox to get an isolated setup, but in that case you are running a separate virtual machine, which needs separate resources. A virtual environment generated by Conda doesn’t need a separate platform, and it is very easy to get into the environment and out of it. You may be familiar with virtualenv, which is a tool similar to Conda. A comparison of both can be found at https://conda.io/docs/_downloads/conda-pip-virtualenv-translator.html

Conda allows you to install different versions of the same package on the same machine, but in different environments. Suppose you need Matplotlib 1.4 to run one application and Matplotlib 1.5 for another. A single environment cannot accommodate both versions at the same time. With Conda, you can create two environments, install a different version in each, and then run the applications in their respective environments without any trouble.

Everything related to an environment is localised. If you install a package in the root directory, its dependencies and related information will be dispersed across different directories in the system, so it may not be possible to delete it with a single command. In the case of virtual environments, everything related to an environment is stored in a single directory. So once the environment is deleted, everything related to it is also deleted automatically.

Working with Conda

Conda, being a powerful tool, can create environments and can deal with packages. Hence, it is called both an environment manager and a package manager. The general syntax of a Conda command is as follows:

conda [-h] [-V] command .....

…where -h is help and -V gives the Conda version installed in our system.

The following is a list of Conda commands.

info: Displays current Conda install details including platform, Conda version, Python version, root environment, environment directories, channel URLs and configuration file.

list: Displays the list of installed packages in a Conda environment.

help: Shows the list of Conda commands and their options. For example:

conda list -h

…displays the options available for the list command.

search: Displays a list of packages matching the search string.

create: Creates a virtual environment for the user to work with.

install: Installs the specified packages to the Conda environment.

upgrade: Updates the installed packages to the latest compatible versions (an alias for update).

remove: Removes the specified packages from the Conda environment.

config: .condarc can be modified using this command.

clean: Removes unused packages and caches.

Creating an environment

You can create an environment other than the root environment in two situations: when you want to experiment with packages but don’t know their side effects on the system configuration, or when an application needs a package version different from the one already installed in your system (and the installed version is needed by some other applications). Such an environment will be virtual, and will work using the existing resources of the platform on which it is created.

The following command will create an environment env_name with no specific packages installed in that environment:

conda create --name env_name

You can alternatively use -n for --name. You can install packages in the environment at the time of creation by modifying the above command as follows:

conda create -n env_name list_of_packages

If you want to install a specific version of a package, you can specify a version number along with the package name. For example, Python 3.4.6 and numpy 1.2 can be installed in an environment named ‘py’ at the time of its creation using the following command:

conda create -n py python=3.4.6 numpy=1.2
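When environments are created from scripts, the same command line can be assembled programmatically. The sketch below is only an illustration: conda_create_command is a hypothetical helper that builds the argument list shown above and does not run Conda itself.

```python
import shlex


def conda_create_command(env_name, packages):
    """Build the argument list for `conda create`.

    `packages` maps a package name to a version string,
    or to None when no version should be pinned.
    """
    args = ["conda", "create", "-n", env_name]
    for name, version in packages.items():
        args.append(name if version is None else f"{name}={version}")
    return args


if __name__ == "__main__":
    command = conda_create_command("py", {"python": "3.4.6", "numpy": "1.2"})
    print(shlex.join(command))
```

The returned list could be handed to subprocess.run on a machine where Conda is installed; keeping the arguments as a list avoids shell quoting problems.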

It is worth noting that the environment created using Conda is isolated, but not in every sense. Consider the following scenario. You have installed Python 2.7.12 based Miniconda in your system, and you create a virtual environment without specifying any packages. You may think that such an environment cannot serve any purpose because it does not contain any packages. But Python and its dependencies were installed along with Miniconda, and the newly created environment has access to these packages. It has access to the root directory also. So, if you want a different version of Python in the virtual environment, it is advisable to install it at the time of creating the environment itself. The same is true of any other package that is installed with Miniconda.

Once a virtual environment is created, the command to activate that environment will be automatically displayed in the terminal. In Linux based systems, you can activate it using the following command (Conda 4.4 and later also accept conda activate env_name):

source activate env_name

Installation of packages

Being a package manager, Conda can be used to install or uninstall packages. Packages can be installed in the current environment using the following command:

conda install package_list

If you want to install packages in an environment other than the current environment, use the following command:

conda install -n env_name package_list

As stated earlier, versions can be specified along with the package name.

Other package managers such as pip can also be used to install packages inside an active Conda environment (apt-get, in contrast, installs packages system-wide rather than into the environment). At times, when Conda fails, pip may succeed. This is because some Python packages not available in the Conda repository are available in the PyPI repository.

The following command will uninstall the package from the environment env_name, along with any packages in that environment that depend on it:

conda remove --name env_name package

The following command will deactivate the environment in Linux based systems (conda deactivate in Conda 4.4 and later):

source deactivate

At the time of creation of an environment, the command to deactivate it will also be displayed to the user. It is unnecessary to specify the name of the environment; the current environment will be automatically deactivated by this command.

The following command will delete the environment and the packages associated with it:

conda remove --name env_name --all

List of packages in an environment

To list the packages installed in a particular environment, use the following command:

conda list --name env_name

If you omit --name env_name, packages in the current environment will be listed.

List of environments

Two commands can be used to get the list of environments. These are:

conda info --envs

…or:

conda env list

The current environment is distinguished from other environments with a ‘*’ as shown in Figure 4.
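A script can pick out the current environment from this listing by looking for the ‘*’ marker. The following sketch assumes the usual layout of one environment per line, with a name, an optional ‘*’ and a path, and comment lines starting with ‘#’. The sample text is illustrative rather than captured from a real system, and paths containing spaces are not handled.

```python
# Illustrative sample of what `conda env list` prints (not real output).
SAMPLE_OUTPUT = """\
# conda environments:
#
root                  *  /home/user/miniconda3
n1                       /home/user/miniconda3/envs/n1
"""


def parse_env_list(text):
    """Return (environments, active) parsed from `conda env list` style text.

    `environments` maps each environment name to its path; `active` is the
    name of the environment marked with '*', or None if there is none.
    """
    environments = {}
    active = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split()
        if len(fields) < 2:
            continue  # unnamed environments list only a path; ignore them
        name, path = fields[0], fields[-1]
        environments[name] = path
        if "*" in fields[1:-1]:
            active = name
    return environments, active


if __name__ == "__main__":
    envs, current = parse_env_list(SAMPLE_OUTPUT)
    print("current environment:", current)
    for env_name, env_path in envs.items():
        print(env_name, "->", env_path)
```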

Copy an environment

It is possible to copy an environment from one system to another. We can export the configuration of the current environment into a .yml file using the following command:

conda env export > file_name.yml

The yml file generated can be copied to any number of systems. The following code shows the content of a yml file generated when the configuration of an environment f1, which does not have any packages installed in it, is exported to a file:

name: f1
channels:
- defaults
prefix: C:\Users\admin\Miniconda3\envs\f1

The configuration of another environment in which pip is installed looks as follows:

name: n1
channels:
- defaults
dependencies:
- pip=9.0.1=py36_1
- python=3.6.1=0
- setuptools=27.2.0=py36_1
- vs2015_runtime=14.0.25123=0
- wheel=0.29.0=py36_0
prefix: C:\Users\admin\Miniconda3\envs\n1

Packages other than pip in the dependencies list are the dependencies of pip that are automatically installed. Once the file is copied to the destination system, execute the following command to create an environment identical to the one in the source system:

conda env create -f file_name.yml

A folder named n1, which is the name of the environment as specified in the environment file, will be created in the destination system under the envs directory of the Conda installation (for example, miniconda3/envs). When you execute the above command in the destination system, make sure that the path you give to the .yml file is correct with respect to your current directory.
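Each entry in the dependencies list of an exported file, such as pip=9.0.1=py36_1, has the form name=version=build. As a rough sketch of how such entries can be taken apart in a script, the hypothetical helper parse_spec below splits an entry on ‘=’ and fills in None for any missing part.

```python
from typing import NamedTuple, Optional


class PackageSpec(NamedTuple):
    """One entry from the dependencies list of an environment file."""
    name: str
    version: Optional[str]
    build: Optional[str]


def parse_spec(spec):
    """Split 'name=version=build' into its parts; missing parts become None."""
    parts = spec.strip().split("=")
    parts += [None] * (3 - len(parts))  # pad when version or build is absent
    return PackageSpec(*parts[:3])


if __name__ == "__main__":
    for entry in ["pip=9.0.1=py36_1", "python=3.6.1=0", "numpy"]:
        print(parse_spec(entry))
```

Dropping the build part before re-creating an environment on a different operating system is one use of such a split, since build strings are often platform specific.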

YML file

YAML is a human-readable data serialisation language; Ansible, among other tools, uses it for configuration. A YML file is organised as lists and records (dictionaries) containing one or more members. Every member of a list begins with ‘- ’ (a hyphen followed by a single space). A dictionary entry is written as a ‘key: value’ pair (the colon must be followed by a space). Members of the same list or record are written with the same indentation. This much detail is enough for creating an environment file. More details about YAML syntax can be found at http://docs.ansible.com/ansible/YAMLSyntax.html.

Creating an environment from a file

Just as the configuration of an environment can be copied to a file, it is possible for users to create an environment file themselves. The Conda environment file may contain the following records: name, channels and dependencies. To create an environment using a yml file, the file should contain at least the name of the environment. Channels contain the list of paths or URLs of repositories where Conda should look for the packages to be installed. ‘defaults’ in the channel list indicates that Conda should search in the default repositories while installing packages. Users can set their own priorities for channels through the order in which they are listed. The following code shows a configuration file in which it is specified that numpy must be installed from the conda-forge channel hosted on anaconda.org:

name: e1
channels:
- conda-forge
dependencies:
- numpy

Dependencies list the packages to be installed in the environment. It is okay if we do not know the dependencies of a particular package. Conda, while creating an environment from the file, will check for the dependencies of the packages specified in the file and resolve them.
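An environment file with the three records described above is simple enough to generate from a script without a YAML library. The following is a minimal sketch; environment_yaml is a hypothetical helper, and it assumes that names, channels and package specifications are plain strings that need no YAML quoting.

```python
def environment_yaml(name, channels, dependencies):
    """Return the text of a minimal Conda environment file."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {package}" for package in dependencies]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = environment_yaml("e1", ["defaults"], ["numpy"])
    with open("e1.yml", "w") as handle:
        handle.write(text)
    print(text, end="")
```

The generated e1.yml can then be passed to conda env create with the -f option, as for an exported file.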

Configuration file

The .condarc file is not present by default; it is created the first time you run the conda config command with an option that changes a setting.

This configuration file helps advanced users to set their preferences for channels, configure proxy servers, and much more.

Thus Conda, which comes with Anaconda or Miniconda, opens up the magic world of packages for data science applications. It is a powerful alternative to many package managers as well as environment managers.

References

[1] https://conda.io

[2] http://docs.ansible.com/ansible/YAMLSyntax.html

Figure 1: IPython using apt-get
Figure 2: IPython using conda
Figure 3: IPython dependencies when installed using pip
Figure 4: List of environments
