ArXiv ML Word Cloud Style
This example demonstrates the word cloud style on the ArXiv ML dataset.

100%|██████████| 500/500 [00:03<00:00, 144.97it/s]
Resetting positions to accord with alignment
import datamapplot
import numpy as np
import requests
import PIL.Image
import matplotlib.pyplot as plt
import colorcet
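# Trim surrounding whitespace whenever a figure is saved.
plt.rcParams["savefig.bbox"] = "tight"
# Load the 2D data map coordinates and the per-point cluster labels.
arxivml_data_map = np.load("arxiv_ml_data_map.npz")["arr_0"]
arxivml_labels = np.load("arxiv_ml_cluster_labels.npz", allow_pickle=True)["arr_0"]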
arxiv_logo_response = requests.get(
    "https://upload.wikimedia.org/wikipedia/commons/7/7a/ArXiv_logo_2022.png",
    stream=True,
    headers={"User-Agent": "My User Agent 1.0"},
)
arxiv_logo = np.asarray(PIL.Image.open(arxiv_logo_response.raw).convert("RGBA"))
fig, ax = datamapplot.create_plot(
    arxivml_data_map,
    arxivml_labels,
    title="ArXiv ML Landscape",
    sub_title="A data map of papers from the Machine Learning section of ArXiv",
    label_wrap_width=10,
    label_over_points=True,
    dynamic_label_size=True,
    max_font_size=36,
    min_font_size=4,
    min_font_weight=100,
    max_font_weight=1000,
    font_family="Roboto Condensed",
    cmap=colorcet.cm.CET_C2,
    logo=arxiv_logo,
    edge_bundle=True,
    add_glow=False,
    edge_bundle_keywords={"n_neighbors": 15, "color_map_nn": 15},
    darkmode=True,
    logo_width=0.1,
)
fig.savefig("plot_arxiv_ml_edge_bundle.png", bbox_inches="tight")
plt.show()
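The `dynamic_label_size`, `min_font_size`, and `max_font_size` arguments above suggest that label text is scaled by cluster size, which is what gives the plot its word cloud look. The sketch below illustrates that idea with a simple linear interpolation. It is not datamapplot's implementation, whose actual scaling rule is not shown in this example; the helper `scaled_font_sizes`, its `noise_label` parameter, and the toy labels are all hypothetical.

```python
from collections import Counter


def scaled_font_sizes(labels, min_font_size=4.0, max_font_size=36.0,
                      noise_label="Unlabelled"):
    """Map each distinct label to a font size that grows with its cluster size.

    Hypothetical helper for illustration only; not part of datamapplot.
    """
    # Count points per cluster, ignoring the (assumed) noise label.
    counts = Counter(label for label in labels if label != noise_label)
    if not counts:
        return {}
    smallest, largest = min(counts.values()), max(counts.values())
    span = largest - smallest
    sizes = {}
    for label, count in counts.items():
        # Fraction of the way from the smallest cluster to the largest one.
        fraction = (count - smallest) / span if span else 1.0
        sizes[label] = min_font_size + fraction * (max_font_size - min_font_size)
    return sizes


# Toy labels: three clusters with 5, 1, and 3 points respectively.
example_labels = ["GANs"] * 5 + ["Bandits"] * 1 + ["Graph Neural Networks"] * 3
font_sizes = scaled_font_sizes(example_labels)
```

In this toy run, the largest cluster gets the maximum font size, the smallest gets the minimum, and the rest fall in between.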
Total running time of the script: (1 minute 8.248 seconds)