Interactive DataMapPlot Colour Options
This notebook walks through some of the colour-specific customization options available in DataMapPlot interactive plots. There are many options, so rather than covering them all, this notebook highlights the major ones and hints at the further customization they allow. To get started we’ll need to import DataMapPlot.
[1]:
import datamapplot
To demonstrate what DataMapPlot interactive plots can do we’ll need some data. The examples directory of the DataMapPlot repository contains some pre-prepared datasets for experimenting with, and we’ll grab one of those. Much like the static plotting, we need a data map (a set of 2D coordinates, one per data sample we are mapping) and at least one set of labels identifying the “topic” of each data sample, usually based on clusters in the data map. In this case we’ll use a data map derived from the CORD-19 dataset, a collection of papers and scientific articles related to COVID-19 that was curated by Allen AI.
[2]:
import numpy as np
import requests
import io
base_url = "https://github.com/TutteInstitute/datamapplot"
data_map_file = requests.get(
    f"{base_url}/raw/main/examples/CORD19-subset-data-map.npy"
)
cord19_data_map = np.load(io.BytesIO(data_map_file.content))
label_file = requests.get(
    f"{base_url}/raw/interactive/examples/CORD19-subset-cluster_labels.npy"
)
cord19_labels = np.load(io.BytesIO(label_file.content), allow_pickle=True)
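Before plotting, it can be worth confirming that the two arrays line up: one row of 2D coordinates and one label per sample. The following is a minimal sketch of such a check. Note that check_data_map is a hypothetical helper, not part of DataMapPlot, and the toy arrays are synthetic stand-ins rather than the CORD-19 data.

```python
import numpy as np


def check_data_map(data_map, labels):
    """Raise ValueError unless data_map is (n_samples, 2) with one label per row.

    Hypothetical helper for illustration; not a DataMapPlot function.
    """
    data_map = np.asarray(data_map)
    labels = np.asarray(labels)
    if data_map.ndim != 2 or data_map.shape[1] != 2:
        raise ValueError(
            f"expected an (n_samples, 2) array, got shape {data_map.shape}"
        )
    if labels.shape != (data_map.shape[0],):
        raise ValueError(
            f"expected {data_map.shape[0]} labels, got shape {labels.shape}"
        )
    return data_map.shape[0]


# Tiny synthetic stand-in for the downloaded arrays (not the CORD-19 data).
toy_map = np.array([[0.0, 0.0], [1.0, 0.5], [0.2, 0.9]])
toy_labels = np.array(["Virology", "Virology", "Vaccines"], dtype=object)
n_samples = check_data_map(toy_map, toy_labels)
```

In the notebook the same call would be check_data_map(cord19_data_map, cord19_labels), assuming the label file holds a single one-dimensional array of labels.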
Let’s start by making a basic interactive plot with DataMapPlot. This gives us an idea of what the starting point looks like, so we can better understand what the customizations we apply afterwards do for us.
[3]:
plot = datamapplot.create_interactive_plot(
    cord19_data_map,
    cord19_labels,
    initial_zoom_fraction=0.33,
)
plot
As with the static plots, interactive plots in DataMapPlot colour the text labels by default to match the associated clusters in the data map. This helps distinguish the labels and tie each one to its cluster, but it can also be distracting. We can turn it off by setting color_label_text to False.
[4]:
plot = datamapplot.create_interactive_plot(
    cord19_data_map,
    cord19_labels,
    initial_zoom_fraction=0.33,
    color_label_text=False,
)
plot
Unlike the static plots, the interactive plots support drawing cluster boundaries (using Bézier-smoothed alpha-shapes). This is particularly helpful when there are multiple layers of clusters, as the smaller fine-grained clusters can be picked out by their boundaries. Much like the labels, these boundaries are drawn in colour by default, with each colour chosen to match its cluster. To see this, let’s add the cluster boundaries (and increase the line width to make them more visible):
[5]:
plot = datamapplot.create_interactive_plot(
    cord19_data_map,
    cord19_labels,
    initial_zoom_fraction=0.33,
    cluster_boundary_polygons=True,
    cluster_boundary_line_width=8,
)
plot
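As with the labels, we can turn this effect off by setting color_cluster_boundaries to False. The cell below also sets color_label_text to False, so neither the labels nor the boundaries are coloured: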
[6]:
plot = datamapplot.create_interactive_plot(
    cord19_data_map,
    cord19_labels,
    initial_zoom_fraction=0.33,
    cluster_boundary_polygons=True,
    cluster_boundary_line_width=8,
    color_label_text=False,
    color_cluster_boundaries=False,
)
plot
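The two colour flags shown above are independent, so there are four combinations to compare. One way to keep them organised is to build the keyword arguments once and vary only the flags. This is a sketch rather than a DataMapPlot feature: colour_variants and base_kwargs are hypothetical names, and the parameter names and values are the ones used in the cells above.

```python
from itertools import product

# Keyword arguments shared by every variant; all taken from the cells above.
base_kwargs = {
    "initial_zoom_fraction": 0.33,
    "cluster_boundary_polygons": True,
    "cluster_boundary_line_width": 8,
}


def colour_variants(base):
    """Return one kwargs dict per combination of the two colour flags.

    Hypothetical helper for illustration; keys are
    (color_label_text, color_cluster_boundaries) tuples.
    """
    variants = {}
    for text, boundaries in product([True, False], repeat=2):
        variants[(text, boundaries)] = {
            **base,
            "color_label_text": text,
            "color_cluster_boundaries": boundaries,
        }
    return variants


variants = colour_variants(base_kwargs)

# In the notebook, a variant could then be rendered with, for example:
# plot = datamapplot.create_interactive_plot(
#     cord19_data_map, cord19_labels, **variants[(False, True)]
# )
```

Here variants[(False, True)] gives uncoloured label text with coloured boundaries, a combination not shown in the cells above.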