It is primarily a reproduction of key plots from the original report. Additionally, it serves as an exercise in plotting with the Python library seaborn and the underlying matplotlib. Lastly, it ventures into less well-trodden territory for data science in the Python universe, as it uses the Python library polars-rs for data loading and transformation.

All the code used to transform the data and create the plots is available directly within the full-text document, and separately as well. PDF and Docx formats are available with the plotting results only.

The purpose of the original report was to collect a long list of all the nuclear explosions occurring between those years, to analyse the responsible nations, to track the types and purposes of the explosions, and to connect the rise and fall of nuclear explosion numbers to historical events throughout.

## Total numbers

::: {.callout-note}
## Nuclear devices

There are two main kinds of nuclear device: those based entirely on fission, or the splitting of heavy atomic nuclei (previously known as atomic devices) and those in which the main energy is obtained by means of fusion, or the merging of light atomic nuclei (hydrogen or thermonuclear devices). A fusion explosion must however be initiated with the help of a fission device. The strength of a fusion explosion can be practically unlimited. The explosive power of a nuclear explosion is expressed in kilotons (kt) or megatons (Mt), which correspond to 1000 and 1 million tonnes of conventional explosive (TNT), respectively. [@Bergkvist2000, 6]
:::

We begin with a table of the absolute explosion counts and total yields per country, seen in @tbl-yields.

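
The data give each explosion a lower and an upper yield bound rather than a single figure, so the table needs one number per explosion before it can sum by country. The following is a minimal plain-Python sketch of that idea, taking the midpoint of the two bounds. The `records` values and the `summarise` helper are hypothetical and made up for illustration; they are not taken from the dataset or from the polars code used in this document.

```python
from collections import defaultdict

# Hypothetical toy records: (country, yield_lower, yield_upper) in kilotons.
# These values are invented for illustration only.
records = [
    ("A", 10.0, 20.0),
    ("A", 100.0, 100.0),
    ("B", 0.0, 20.0),
]


def summarise(rows):
    """Return {country: (count, summed midpoint yield in kt)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for country, lower, upper in rows:
        totals[country][0] += 1  # one more explosion for this country
        totals[country][1] += (lower + upper) / 2  # midpoint of the yield range
    return {country: (count, total) for country, (count, total) in totals.items()}


summary = summarise(records)

# Sort by count, largest first, and show the total in kt and Mt (1 Mt = 1000 kt).
for country, (count, total_kt) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
    print(f"{country}: {count} explosions, {total_kt:.1f} kt ({total_kt / 1000:.3f} Mt)")
```

Summing midpoints is only one possible choice; where the two bounds lie far apart, the resulting totals should be read as rough estimates.
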
```{python}
# | label: tbl-yields
# | tbl-cap: "Total number and yields of explosions"

from great_tables import GT, md

# One row per country: number of explosions and summed average yield.
df_yields = (
    df.select(["country", "id_no", "yield_lower", "yield_upper"])
    # average of the lower and upper yield bound of each explosion
    .with_columns(yield_avg=pl.mean_horizontal(pl.col(["yield_lower", "yield_upper"])))
    .group_by("country")
    .agg(
        pl.col("id_no").len().alias("count"),
        pl.col("yield_avg").sum(),
    )
    # .with_columns(country=pl.col("country").cast(pl.String).str.to_titlecase())
    .sort("count", descending=True)
)

(
    GT(df_yields)
    .tab_source_note(
        source_note="Source: Author's elaboration based on Bergkvist and Ferm (2000)."
    )
    .tab_spanner(label="Totals", columns=["count", "yield_avg"])
    .tab_stub(rowname_col="country")
    .tab_stubhead(label="Country")
    .cols_label(
        count="Count",
        yield_avg="Yield in kt",
    )
    .fmt_integer(columns="count")
    .fmt_number(columns="yield_avg", decimals=1)
)
```

## Numbers over time

When investigating the nuclear explosions in the world, let us start by looking at how many explosions occurred each year in total. This hides the specific details of who was responsible and which types were involved, but it paints a much stronger picture of the overall dimension of nuclear testing, as can be seen in @fig-total.

```{python}
# | label: fig-total
# | fig-cap: "Total nuclear explosions, 1945-98"

# Number of explosions per year, all countries combined.
per_year = df.group_by(pl.col("year")).agg(pl.len()).sort("year")
with sns.axes_style("darkgrid", {"xtick.bottom": True, "ytick.left": True}):
    g = sns.barplot(data=per_year, x="year", y="len", order=range(1945, 1999), width=1)
    g.set_xlabel("Year")
    g.set_ylabel("Count")
    plt.setp(
        g.get_xticklabels(),
        rotation=90,
        ha="right",
        va="center",
        rotation_mode="anchor",
    )  # ensure rotated right-anchor
    g.set_xticks(g.get_xticks(), minor=True)  # enable minor ticks every entry
    g.set_xticks(g.get_xticks()[::2])  # enable major ticks every 2nd entry
    plt.show()
del per_year
```

As we can see, the number of explosions rises steadily towards 1957 and sharply until 1958, before dropping off in 1959. The reasons for this drop are not entirely clear, but it is very likely that the data are simply missing for this period. There is another, very steep, rise in 1962 with over 175 recorded explosions, before an even sharper drop-off the following year down to just 50 explosions. Afterwards the changes are less sharp, and the yearly counts stay between 24 and 77 explosions, with a slight downward tendency.

While these numbers show the overall proliferation of nuclear testing, let us now turn to the contributions by individual countries. A split in the number of explosions over time by country can be seen in @fig-percountry.

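
A line per country only falls to zero in quiet periods if those zero counts exist as rows in the data; otherwise the line is drawn straight from one non-zero point to the next. One way to get there is to build every combination of time point and country first and then fill in the counts. Below is a minimal plain-Python sketch of that idea. The `events` list is hypothetical toy data, and it uses years where this document's own code uses the `date` column.

```python
from collections import Counter
from itertools import product

# Hypothetical toy events as (year, country) pairs, invented for illustration.
events = [(1960, "X"), (1960, "X"), (1961, "Y")]

# Count the explosions that actually occurred per (year, country) pair.
counts = Counter(events)

years = sorted({year for year, _ in events})
countries = sorted({country for _, country in events})

# Every (year, country) combination gets a row; missing ones count as zero.
per_country = [
    (year, country, counts.get((year, country), 0))
    for year, country in product(years, countries)
]

for row in per_country:
    print(row)
```

The number of rows grows with the product of distinct time points and countries, which stays small for seven countries but is worth keeping in mind for finer groupings.
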
```{python}
# | label: fig-percountry
# | fig-cap: "Nuclear explosions by country, 1945-98"

# All date/country combinations, so that dates without explosions count as 0.
keys = df.select("date").unique().join(df.select("country").unique(), how="cross")
per_country = keys.join(
    df.group_by(["date", "country"], maintain_order=True).len(),
    on=["date", "country"],
    how="left",
    coalesce=True,
).with_columns(pl.col("len").fill_null(0))

g = sns.lineplot(data=per_country, x="date", y="len", hue="country")
g.set_xlabel("Year")
g.set_ylabel("Count")
plt.setp(
    g.get_xticklabels(), rotation=45, ha="right", rotation_mode="anchor"
)  # ensure rotated right-anchor
plt.show()
del per_country
```

Once again we can see the steep ramp-up to 1962, though it becomes clear that this was driven by both the USSR and the US. The graph also makes visible the sheer unmatched number of explosions emanating from both of these countries, with only France catching up to the US numbers and China ultimately overtaking them in the 1990s.

However, here it also becomes clearer how the UK was already responsible for some early explosions in the late 1950s and early 1960s, and how France's nuclear testing rose from the early 1960s to around 1980 before slowly decreasing in intensity.

Let us turn to a cross-cut through the explosions in @fig-groundlevel, focusing on the number of explosions that occurred underground and above ground respectively.[^aboveground]

[^aboveground]: Detonations counted as above ground are made up of atmospheric, airdrop, tower, balloon, surface, barge or ship, rocket, space and water surface detonations. Any other detonation is counted as below ground, primarily taking place in tunnels, shafts and galleries.

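
The rule in the footnote amounts to a membership test: a detonation is above ground if its type code is in a fixed set, and below ground otherwise. Here is a minimal plain-Python sketch of that rule. The set holds the type codes that this document's plotting code treats as above ground, while `is_above_ground`, the `sample` list and the `"SHAFT"` and `"TUNNEL"` spellings are hypothetical and only serve as illustration.

```python
# Type codes counted as above ground; spellings follow the abbreviations
# used in this document's plotting code.
ABOVE_GROUND_TYPES = frozenset(
    {
        "ATMOSPH",
        "AIRDROP",
        "TOWER",
        "BALLOON",
        "SURFACE",
        "BARGE",
        "ROCKET",
        "SPACE",
        "SHIP",
        "WATERSUR",
        "WATER SU",
    }
)


def is_above_ground(type_code: str) -> bool:
    """Return True if the type code is in the above-ground set."""
    return type_code in ABOVE_GROUND_TYPES


# Hypothetical sample; "SHAFT" and "TUNNEL" are assumed spellings.
sample = ["TOWER", "SHAFT", "AIRDROP", "TUNNEL", "SHAFT"]
above = sum(is_above_ground(t) for t in sample)
below = len(sample) - above
print(f"above ground: {above}, below ground: {below}")
```

Because everything outside the set falls through to below ground, an unexpected or misspelled type code is silently counted as an underground test, so the distinct type codes in the data are worth a check.
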
```{python}
# | label: fig-groundlevel
# | fig-cap: "Nuclear explosions above and below ground, 1945-98"

from polars import Boolean

# Detonation types counted as above ground; everything else is below ground.
above_cat = pl.Series(
    [
        "ATMOSPH",
        "AIRDROP",
        "TOWER",
        "BALLOON",
        "SURFACE",
        "BARGE",
        "ROCKET",
        "SPACE",
        "SHIP",
        "WATERSUR",
        "WATER SU",
    ]
)

df_groundlevel = (
    df.with_columns(
        above_ground=pl.col("type").map_elements(
            lambda x: True if x in above_cat else False, return_dtype=Boolean
        )
    )
    .group_by(pl.col("date", "country", "above_ground"))
    .agg(count=pl.len())
    .sort("date")
)

with sns.axes_style("darkgrid", {"xtick.bottom": True, "ytick.left": True}):
    # Above-ground counts are drawn upwards, below-ground counts downwards.
    for above_ground in [True, False]:
        g = sns.histplot(
            data=df_groundlevel.filter(
                pl.col("above_ground") == above_ground
            ).with_columns(
                count=pl.col("count") * (1 if above_ground else -1),
            ),
            x="date",
            weights="count",
            hue="country",
            multiple="stack",
            binwidth=365,
        )

    g.xaxis.set_major_locator(mdates.YearLocator(base=5))
    g.xaxis.set_minor_locator(mdates.YearLocator())
    plt.setp(
        g.get_xticklabels(), rotation=90, ha="right", va="top", rotation_mode="anchor"
    )
    # FIXME get dynamic range for yticks instead of hardcoding
    g.set_yticks(np.arange(-130, 140, 20))
    g.set_yticks(np.arange(-130, 140, 10), minor=True)
    plt.show()
del df_groundlevel
```

This plot paints a different picture yet again: while the overall number of explosions still rises and falls with some early sharp spikes, we can see a clear shift from above-ground to underground tests, starting with the year 1962.

## Locations

Finally, let's view a map of the world with the explosions marked.

```{python}
# | label: fig-worldmap
# | fig-cap: "World map of nuclear explosions, 1945-98"

import folium
import geopandas as gpd
from shapely.geometry import Point


def set_style() -> pl.Expr:
    return (
        pl.when(pl.col("country") == "USSR")
        .then(pl.lit({"color": "red"}, allow_object=True))
        .otherwise(pl.lit({"color": "blue"}, allow_object=True))
    )


# NOTE: geom is not used further in this cell
geom = [Point(xy) for xy in zip(df["longitude"], df["latitude"])]
# df_pd = df.with_columns(style=set_style()).to_pandas().set_index("date")
df_pd = df.with_columns().to_pandas().set_index("date")
gdf = gpd.GeoDataFrame(
    df_pd,
    crs="EPSG:4326",
    geometry=gpd.points_from_xy(x=df_pd["longitude"], y=df_pd["latitude"]),
)
del df_pd

country_colors = {
    "USA": "darkblue",
    "USSR": "darkred",
    "FRANCE": "pink",
    "UK": "black",
    "CHINA": "purple",
    "INDIA": "orange",
    "PAKIST": "green",
}

m = folium.Map(tiles="cartodb positron")
# One toggleable layer per country.
for country in country_colors.keys():
    fg = folium.FeatureGroup(name=country, show=True).add_to(m)
    folium.GeoJson(
        gdf[gdf["country"].str.contains(country)],
        name="Nuclear Explosions",
        marker=folium.Circle(radius=3, fill_opacity=0.4),
        # colour by country, radius scaled by body magnitude (at least 1.0)
        style_function=lambda x: {
            "color": country_colors[x["properties"]["country"]],
            "radius": (
                x["properties"]["magnitude_body"]
                if x["properties"]["magnitude_body"] > 0
                else 1.0
            )
            * 10,
        },
        tooltip=folium.GeoJsonTooltip(fields=["year", "country", "type"]),
        highlight_function=lambda x: {"fillOpacity": 0.8},
        popup=folium.GeoJsonPopup(
            fields=[
                "year",
                "country",
                "region",
                "source",
                "latitude",
                "longitude",
                "magnitude_body",
                "magnitude_surface",
                "depth",
                "yield_lower",
                "yield_upper",
                "purpose",
                "name",
                "type",
            ]
        ),
    ).add_to(fg)
folium.LayerControl().add_to(m)
m
```

That is all for now, though there are undoubtedly more explorations to undertake.