A must-read before starting a Google CoLab project

Janeth Fernando
5 min read · Jun 17, 2021

This article provides some valuable tips for newbies who want to start a Google Colab project.

Quick facts to remember

  1. The default Google Colab runtime is a CPU. Go to the Runtime menu and change it to either a GPU or a TPU before you start training your models.
  2. A free runtime lasts for at most 12 hours, and the free RAM allocation is about 13 GB.
  3. Save your work to Google Drive as often as you can. After the timeout, you will lose the runtime along with its internal disk storage.
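
As a rough illustration of the third point, here is a minimal sketch that copies a file from the runtime's disk into a Drive folder. The helper name backup_to_drive and the example paths in the comment are hypothetical, and it assumes your Drive is already mounted.

```python
import os
import shutil


def backup_to_drive(src_path, drive_dir):
    """Copy a file from the runtime disk into a Drive folder; return the new path."""
    os.makedirs(drive_dir, exist_ok=True)  # create the target folder if needed
    return shutil.copy2(src_path, drive_dir)  # copy2 also keeps file timestamps


# Hypothetical usage after mounting Drive:
# backup_to_drive("/content/model.h5", "/content/drive/MyDrive/checkpoints")
```

Calling something like this after every epoch or every long-running step means a timeout costs you only the most recent piece of work.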

Now let’s see how we can get the maximum advantage from Google CoLab!!!

1) Mounting your Google Drive

The first time you connect your Google Drive to Google Colab, you need to grant permission. After that, you can use your Google Drive straight away from the Colab runtime. The following screenshots show how to connect your Google Drive to your Google Colab runtime.

Mount Google Drive
Give permission to connect to Google Colabs
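
If you prefer a code cell over the GUI, the mount can also be done from Python. This is a minimal sketch that assumes the google.colab helper module and the usual /content/drive mount point. The function name mount_drive is hypothetical, and the import guard is only there so the cell does not crash outside Colab.

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive in Colab; return the mount point, or None outside Colab."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        print("Not running in Colab; skipping Drive mount.")
        return None
    drive.mount(mount_point)  # asks for authorization the first time
    return mount_point


mounted_at = mount_drive()
```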

2) Working with Python 2 in Colab

You can find the default Python version that Colab currently runs with the !python --version command. Google Colab provides Python 3 Jupyter notebooks. Python 2 has been removed from the GUI, so we can no longer select our own Python environment (Python 2 or Python 3) there. However, at the time of writing, the Python 2 environment still works even though it is not visible in the GUI. You can get a Python 2 environment by creating a new ipynb notebook from the following link.

https://colab.research.google.com/notebook#create=true&language=python2
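
Keep in mind that !python --version reports the python executable found on the shell's PATH, while the notebook kernel is the interpreter that actually runs your cells. Here is a small sketch that reads the kernel's version from Python itself. The helper name python_major_minor is hypothetical, and str.format is used on purpose so the cell runs on both Python 2 and Python 3.

```python
import sys


def python_major_minor():
    """Return the running interpreter's version as a 'major.minor' string."""
    return "{}.{}".format(sys.version_info[0], sys.version_info[1])


print("Notebook kernel runs Python " + python_major_minor())
```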

3) Changing versions of existing packages

Google CoLab has several packages installed for the ease of the users. But these installed package versions might not be suitable for our projects. So, we need to change the versions of the packages.

To check the current version of the Keras package, execute the following command. You can use your package name instead of Keras.

!pip show keras

To change the current Keras version, execute the following command (-q only keeps the pip output quiet). You can follow the same pattern for any other package by replacing the package name and the version number.

!pip install -q keras==1.2.2
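
You can also look up an installed version from Python, which is handy when you want the notebook to check the version before training. This is a minimal sketch using the standard importlib.metadata module (Python 3.8 or later); the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


print("keras:", installed_version("keras"))
```

After changing a version with pip, you may need to restart the runtime before the new version is picked up by import.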

4) Running bash commands or external Python scripts in Jupyter Notebook cells

You can execute Linux commands in Jupyter Notebook cells by adding an ! in front of the command.

The cd command is the exception: use %cd instead of !cd. Each ! command runs in its own subshell, so a directory change made with !cd does not carry over to the next command.

!ls # List the directories and files inside the current directory
!pwd # View the path of the current directory
!mkdir my_folder # Make a new directory (my_folder is a placeholder name) in the current directory

You can also execute your own Python scripts from a cell.

!python3 /content/drive/MyDrive/Project/script.py
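
If you need the script's exit code or output inside your notebook code, the ! syntax is less convenient. Here is a minimal sketch of the same idea with the standard subprocess module. The helper name run_script and the demo script are made up for illustration; on Colab you would pass the path of your own script.

```python
import os
import subprocess
import sys
import tempfile


def run_script(script_path, *args):
    """Run a Python script with the current interpreter; return (exit code, stdout)."""
    result = subprocess.run(
        [sys.executable, script_path, *args],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output to str
    )
    return result.returncode, result.stdout


# Demo with a throwaway script so the cell is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo_script.py")
    with open(demo, "w") as fh:
        fh.write("print('hello from script')\n")
    code, out = run_script(demo)

print(code, out.strip())
```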

5) File Management

You can upload and download files directly from the Colab GUI. Also, when you need the exact location of a file, you can right-click on it and copy its path. The following screenshot shows where to find these options.

Upload, Download, or Copy path

6) View more system specifications of the Colab runtime

Use the following command on a cell to view more details of the CPU.

!cat /proc/cpuinfo

Use the following command on a cell to view more details of the GPU.

!nvidia-smi
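
If you want your notebook to react to the hardware it got (for example, to warn you when no GPU is attached), a quick check from Python can help. This is only a sketch using the standard library; the helper name runtime_summary is hypothetical, and finding nvidia-smi on the PATH is just a hint that an NVIDIA GPU runtime is attached, not a guarantee.

```python
import os
import platform
import shutil


def runtime_summary():
    """Collect a few basic facts about the current runtime."""
    return {
        "cpu_count": os.cpu_count(),
        "machine": platform.machine(),
        # True when the nvidia-smi tool is available on the PATH.
        "has_nvidia_smi": shutil.which("nvidia-smi") is not None,
    }


print(runtime_summary())
```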

7) Download datasets directly to Google Drive from Google Colab

You can use Google Colab to download datasets directly from external sources. This is an advantage because downloading huge datasets this way does not use your own internet bandwidth.

Follow the steps below to download datasets from Kaggle. First, download your Kaggle account API key and upload it to a Google Drive directory. Then mount your Drive in Colab and run the following code snippet. Here, /content/drive/MyDrive/kaggle is the Drive location where the API key is stored.

import os
os.environ['KAGGLE_CONFIG_DIR'] = "/content/drive/MyDrive/kaggle"
%cd /content/drive/MyDrive/kaggle

Now you can use any dataset download API command to download the dataset directly to Google Drive.

!kaggle datasets download -d saadabbass84/seekcom-data-jobs-for-melbourne
Copy API command option in Kaggle Dataset
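
Before running the download, it can save some head-scratching to confirm that the API key is really where the Kaggle CLI will look for it. This is a minimal sketch, assuming the key file keeps its default name kaggle.json; the helper name configure_kaggle is hypothetical.

```python
import os


def configure_kaggle(config_dir):
    """Point the Kaggle CLI at config_dir; return True if kaggle.json is present there."""
    os.environ["KAGGLE_CONFIG_DIR"] = config_dir
    return os.path.isfile(os.path.join(config_dir, "kaggle.json"))


if not configure_kaggle("/content/drive/MyDrive/kaggle"):
    print("kaggle.json not found; check that Drive is mounted and the key was uploaded.")
```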

If a dataset is shared through a public URL on someone else's Google Drive, you can use the gdown package to download it to your Google Drive from Google Colab. You need to take the ID of the file from the URL and use it as in the following code snippet.

!gdown "https://drive.google.com/uc?id=1LYU8jCkQCgY7lDCiushicQ4_AvI8xXyd"
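
Picking the ID out of a long share link by hand is easy to get wrong, so here is a small sketch that does it for you. It assumes the two link shapes you typically see: .../file/d/<ID>/view and ...?id=<ID>. The helper names drive_file_id and gdown_url are hypothetical, and the example share link is just the file ID from above placed in the first shape.

```python
import re
from urllib.parse import parse_qs, urlparse


def drive_file_id(url):
    """Extract the file ID from a Google Drive URL, or return None if none is found."""
    parsed = urlparse(url)
    # Shape 1: https://drive.google.com/file/d/<ID>/view
    match = re.search(r"/file/d/([^/]+)", parsed.path)
    if match:
        return match.group(1)
    # Shape 2: https://drive.google.com/uc?id=<ID>
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


def gdown_url(file_id):
    """Build the uc?id= style URL used in the gdown command."""
    return "https://drive.google.com/uc?id=" + file_id


share_link = "https://drive.google.com/file/d/1LYU8jCkQCgY7lDCiushicQ4_AvI8xXyd/view?usp=sharing"
print(gdown_url(drive_file_id(share_link)))
```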

8) Get system performance as graphs when training

When you use Colab, you only see the Jupyter Notebook. But when doing research, you might need to see how the system performs (CPU and GPU stats) while your machine learning models are training. Google Colab does not show these system performance stats directly, but you can use an external tool to check them. Follow this method to see the system performance stats as graphs.

Before starting the training process, you need to create an account in wandb. All your Colab system performance stats will be recorded in your wandb account.

Add the following code snippet to a Jupyter Notebook cell to initialize a wandb session that records the system performance stats. When the cell runs, wandb will ask you for permission. Once you provide it, you can watch all the system performance stats as continuously updated graphs.

!pip install wandb
import wandb
wandb.init()
wandb dashboard

9) Stay without getting disconnected from the runtime

If Colab detects 30 continuous minutes of inactivity, it will automatically terminate the runtime. To prevent this, we can manually add a script that automatically clicks the connect button, which stops Colab from terminating the runtime. Once this is in place, you can walk away from your machine while your model trains! Open the browser's developer tools (Ctrl+Shift+I), click on the Console tab, then paste the following code snippet into the console and execute it.

function ClickConnect() {
  console.log("Clicked on connect button");
  document.querySelector("colab-connect-button").click();
}
setInterval(ClickConnect, 60000);

10) Share the Colab Jupyter Notebook on a GitHub repo

You can save your Colab Jupyter Notebook directly to a GitHub repository.

Go to File -> Save a copy in GitHub. It will ask for GitHub authorization, and afterwards it will ask which repository the Jupyter Notebook should be added to.

You can also add a Colab badge to the README file of a GitHub repo.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/googlecolab/colabtools/blob/master/notebooks/colab-github-demo.ipynb)
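
The badge link follows a simple pattern: the GitHub user, repo, branch, and notebook path are appended to https://colab.research.google.com/github/. If you have several notebooks, a small helper can generate the Markdown for you. This is only a sketch based on the pattern in the example above; the function name colab_badge is hypothetical.

```python
def colab_badge(user, repo, branch, notebook_path):
    """Build the Markdown for an 'Open In Colab' badge pointing at a GitHub notebook."""
    target = "https://colab.research.google.com/github/{}/{}/blob/{}/{}".format(
        user, repo, branch, notebook_path
    )
    image = "https://colab.research.google.com/assets/colab-badge.svg"
    return "[![Open In Colab]({})]({})".format(image, target)


print(colab_badge("googlecolab", "colabtools", "master", "notebooks/colab-github-demo.ipynb"))
```

Paste the printed line into your README and replace the arguments with your own user, repo, branch, and notebook path.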

I hope you can now get the most out of Google Colab for your project.

Stay Safe and Happy Coding 🥂!

Janeth Fernando

Senior Site Reliability Engineer @WSO2 | Content Creator 💻