1.2 Python IDEs
To write Python code, we need an integrated development environment (IDE). An IDE is a software application that gives programmers a comprehensive set of tools for software development, such as a code editor, build automation tools, and a debugger. An IDE can be installed locally or accessed in the cloud.
There are many good IDE options for Python, and you can select any of them for this course. Each of the options I describe below uses the .ipynb file format (the Jupyter Notebook format), which is just one of several formats in which to write Python. While these are not the only options available, they are the ones that I’ve personally used and would recommend, and that many of my friends and associates working as professional data scientists also use.
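It helps to know that an .ipynb file is simply a JSON document, which is why several different IDEs can open the same notebook. The sketch below builds a tiny notebook as a Python dictionary in the version 4 notebook layout and counts its code cells. The cell contents and the helper name count_code_cells are made up for illustration.

```python
import json

# A minimal notebook in the version 4 .ipynb layout.
# The cell contents here are invented purely for illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# My first notebook"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print('Hello, Python!')"],
        },
    ],
}


def count_code_cells(nb):
    """Return how many cells in a notebook dictionary contain code."""
    return sum(1 for cell in nb["cells"] if cell["cell_type"] == "code")


# Saving a notebook amounts to writing this dictionary out as JSON text.
notebook_text = json.dumps(notebook, indent=1)

# Reading the text back gives the same structure, whichever IDE opens it.
reloaded = json.loads(notebook_text)
print(count_code_cells(reloaded))
```

You will rarely edit this JSON by hand, since the IDE reads and writes it for you.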
-
Local installation IDEs (i.e., download and install)
-
Jupyter Notebook: the original .ipynb IDE. It appears to run in the cloud because it uses a web browser (e.g., Chrome, Edge, or Safari) as its interface. However, it actually runs on a local server on your own machine.
-
Advantages: It is faster than other local installation IDEs and gives you full control over the packages that you install. Jupyter Notebook is available for Windows, macOS, and Linux, and it is included in the Anaconda distribution, which manages packages for you.
-
Disadvantages: If your laptop is slow, then Jupyter will run slowly too. If bugs show up, then it’s up to you to figure out how to fix them. Bugs are not common, but dealing with them can be a pain.
-
-
Visual Studio Code: a free Microsoft code editor that supports many programming languages besides Python.
-
Advantages: It integrates very well with web applications, which makes it easy to combine data analytics projects and web projects in a single machine learning environment. It is a good fit for full-stack developers already working with C# or other Microsoft products, and it arguably has fewer bugs than Jupyter Notebook.
-
Disadvantages: It runs noticeably slower than Jupyter Notebook when processing large data files. I don’t know the reason for this slowdown, but I’ve confirmed it with multiple timeit tests.
-
-
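Timing comparisons like the ones mentioned above can be run with Python's built-in timeit module. This is a minimal sketch, not the exact tests behind the results described here: the function sum_of_squares and the workload sizes are stand-ins for whatever data processing you want to compare. Run the same code in each IDE and compare the printed times.

```python
import timeit


def sum_of_squares(n):
    """A stand-in workload: add up the squares of 0..n-1."""
    return sum(i * i for i in range(n))


# Run the workload 5 times per trial and repeat for 3 trials.
# The sizes are arbitrary; increase them for a more demanding test.
trials = timeit.repeat(lambda: sum_of_squares(100_000), number=5, repeat=3)

# The fastest trial is the one least affected by other programs running
# on the machine, so it is the usual number to compare.
best = min(trials)
print(f"Best of {len(trials)} trials: {best:.4f} seconds for 5 runs")
```

Because the result depends on your hardware and on what else is running, treat a single measurement as a rough guide and repeat it a few times before drawing conclusions.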
-
Cloud IDEs (i.e., use from a website)
-
Google Colab: a free Google product that allows you to write and execute Python in your browser.
-
Advantages: This option requires no installation, and there are no bugs that you have to fix. Google Colab is very fast. Google gives you a virtual machine with a 4-core CPU, a GPU, and 20 GB of memory, compared with the 2-core CPU, no GPU, and 4 GB of memory offered by Azure Notebooks. Colab even beats Jupyter Notebook in speed: it timed faster than Jupyter Notebook on my laptop with 64 GB of RAM and an Intel Core i9 processor. Many popular libraries come preinstalled in Colab, so you can import them without installing them first.
-
Disadvantages: It is difficult to import packages that aren’t already loaded in Colab. I found it nearly impossible to get the pyodbc package (for connecting to Azure SQL Server databases) to work in Colab. Accessing CSV files stored on your own computer and importing .py files (with your own functions) takes a few extra steps, and the process gets annoying. It is easier to pull CSV files from the web over HTTP.
-
-
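Pulling a CSV file from the web over HTTP takes only a few lines. The sketch below uses only the standard library. The function names parse_csv_text and read_csv_from_url are made up for this example, the sample table is invented, and the URL in the final comment is a placeholder rather than a real dataset.

```python
import csv
import io
import urllib.request


def parse_csv_text(text):
    """Turn CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_csv_from_url(url):
    """Download a CSV file and return its rows as dictionaries."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    return parse_csv_text(text)


# Offline demonstration with a small made-up table.
sample = "name,score\nAda,95\nGrace,98\n"
rows = parse_csv_text(sample)
print(rows[0]["name"], rows[0]["score"])

# With a real file, pass its web address instead (placeholder URL):
# rows = read_csv_from_url("https://example.com/data.csv")
```

Note that every value comes back as a string, so numbers need converting before you do arithmetic with them. If you use the pandas library, its read_csv function also accepts a web address directly.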
To be honest, I have a hard time deciding which IDE to use for a class like this. Feel free to use and get experience with several of them. Most of the videos in this book will use Google Colab. The only caveat to using Colab is that it requires a free Google account. If you already have a Google account, you are welcome to use it for this course. You can also follow along with a video walkthrough of Colab here: