Add meta observations
parent
bff3cb22fa
commit
910e72c0e2
1 changed files with 255 additions and 5 deletions
260
meta.md
|
@ -1,7 +1,27 @@
|
||||||
This page documents some meta observations about my time recreating the nuclear explosions in this post,
|
---
|
||||||
mostly some little tips to work well with python polars and seaborn, or little tricks to integrate them and geopandas visualizations.
|
title: Reproducible blog posts
|
||||||
|
subtitle: "Moving from Quarto manuscript to Astro markdown post output"
|
||||||
|
description: "Moving from Quarto manuscript to Astro markdown post output"
|
||||||
|
pubDate: "2024-07-08T10:06:38"
|
||||||
|
weight: 10
|
||||||
|
tags:
|
||||||
|
- python
|
||||||
|
- astro
|
||||||
|
---
|
||||||
|
|
||||||
## From a lat/long polars dataframe to geopandas
|
This page documents some meta observations about my time recreating the nuclear
|
||||||
|
explosions in the last post.
|
||||||
|
|
||||||
|
It goes over some little tips to work well with python polars and seaborn,
|
||||||
|
as well as little tricks to integrate them and geopandas visualizations.
|
||||||
|
|
||||||
|
Finally, I make some observations on the actual process of transferring this
|
||||||
|
produced output into a blog post on my website, written in the Astro static
|
||||||
|
site builder framework.
|
||||||
|
|
||||||
|
## Data modelling and visualization
|
||||||
|
|
||||||
|
### From a lat/long polars dataframe to geopandas
|
||||||
|
|
||||||
Going from a polars frame to one we can use for GIS operations with geopandas is fairly simple:
|
Going from a polars frame to one we can use for GIS operations with geopandas is fairly simple:
|
||||||
We first move from a polars to an indexed pandas frame; in this case I have indexed on the date of each explosion.
|
We first move from a polars to an indexed pandas frame; in this case I have indexed on the date of each explosion.
|
||||||
|
@ -20,7 +40,7 @@ gdf = gpd.GeoDataFrame(
|
||||||
del df_pd
|
del df_pd
|
||||||
```
|
```
|
||||||
|
|
||||||
## Keeping the same seaborn color palette for the same categories
|
### Keeping the same seaborn color palette for the same categories
|
||||||
|
|
||||||
For the analysis, I have multiple plots which distinguish between the different countries undertaking nuclear detonations.
|
For the analysis, I have multiple plots which distinguish between the different countries undertaking nuclear detonations.
|
||||||
The country category thus appears repeatedly, and with static values (i.e. it will always contain 'US', 'USSR', 'China', 'France' and so on).
|
The country category thus appears repeatedly, and with static values (i.e. it will always contain 'US', 'USSR', 'China', 'France' and so on).
|
||||||
|
@ -101,7 +121,7 @@ folium.GeoJson(
|
||||||
).add_to(map)
|
).add_to(map)
|
||||||
```
|
```
|
||||||
|
|
||||||
## Using dictionary keys to create folium map layers
|
### Using dictionary keys to create folium map layers
|
||||||
|
|
||||||
As a bonus we can even use our color category keys to create different layers on the folium map which can be turned on and off individually.
|
As a bonus we can even use our color category keys to create different layers on the folium map which can be turned on and off individually.
|
||||||
Thus we can decide which country's detonations we want to visualize.
|
Thus we can decide which country's detonations we want to visualize.
|
||||||
|
@ -134,6 +154,235 @@ for country in country_colors.keys():
|
||||||
folium.LayerControl().add_to(m)
|
folium.LayerControl().add_to(m)
|
||||||
```
|
```
|
||||||

## Output wrangling

### Multiple project profiles

During development and analysis I have only had a single project which then in turn targeted two formats:
`html` for previews and dynamic elements and `pdf` (in truth the new `typst`) for checking static elements.
The following `_quarto.yml` file describes a full working project:

```yml
author: Marty Oehme
csl: https://www.zotero.org/styles/apa

project:
  type: default
  output-dir: output
  render:
    - index.qmd
    - meta.md

format:
  html:
    code-fold: true
    toc: true
    echo: true
  typst:
    toc: true
    echo: false
    citeproc: true
  docx:
    toc: true
    echo: false
```

This works well for single-'target' deployments which may arrive in multiple formats but are fundamentally the same.
What happens, however, if we target something completely different (in my case this Astro blog)
which may not even reside in the same directory?

We can create what quarto calls 'project profiles', simply by creating additional `_quarto-myprofile.yml` files in the project root.
They will grab all the yaml data from the original `_quarto.yml` file and then add to and overwrite it with their own file's data to create the overall profile.

So if we have the following two files:

```yml
# _quarto.yml
author: Marty Oehme
csl: https://www.zotero.org/styles/apa
```

```yml
# _quarto-local.yml
project:
  type: default
  output-dir: output
  render:
    - index.qmd
    - meta.md

format:
  html:
    code-fold: true
    toc: true
    echo: true
  typst:
    toc: true
    echo: false
    citeproc: true
  docx:
    toc: true
    echo: false
```

We have essentially recreated the above project, only as a project 'profile' to be invoked as `quarto render --profile local`.
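
The way a profile file is layered over the base file can be pictured as a recursive dictionary merge.
The following is only a rough sketch of that idea in plain Python, not Quarto's actual implementation:
the `merge_profile` helper and the inlined dictionaries are hypothetical stand-ins for the parsed yaml files,
and Quarto's real merging rules (for lists in particular) may well differ.

```python
from copy import deepcopy


def merge_profile(base: dict, profile: dict) -> dict:
    """Return a new dict with `profile` layered over `base` (hypothetical helper)."""
    merged = deepcopy(base)
    for key, value in profile.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested mappings are merged key by key.
            merged[key] = merge_profile(merged[key], value)
        else:
            # Everything else is added or overwritten by the profile.
            merged[key] = deepcopy(value)
    return merged


# Stand-ins for the parsed _quarto.yml and (a shortened) _quarto-local.yml.
base = {"author": "Marty Oehme", "csl": "https://www.zotero.org/styles/apa"}
local = {
    "project": {"type": "default", "output-dir": "output"},
    "format": {"html": {"toc": True, "echo": True}},
}

config = merge_profile(base, local)
print(config["author"])  # value kept from the base file
print(config["project"]["output-dir"])  # value added by the profile
```

The base dictionary is left untouched, so the same base can be combined with any number of profiles.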

Now, however, we can add a second `_quarto-remote.yml` profile:

```yml
# _quarto-remote.yml
project:
  type: default
  output-dir: some/remote/directory/maybe/even/over/nfs/or/sshfs
  render:
    - index.qmd

format:
  hugo-md:
    preserve-yaml: true
    code-fold: true
    keep-ipynb: true
    wrap: none
  typst:
    toc: true
    echo: false
    citeproc: true
```

If we invoke this profile with `quarto render --profile remote` we output to a different directory altogether,
and have completely different render targets than in the local profile
(in this case the same `typst` format and the new `hugo-md` format, while not rendering to `html` or `docx` at all).

This way we can separate different deployments beyond just carrying different formats by actually extending
and overwriting all kinds of project options.[^projtypes]

[^projtypes]: It would for example even be conceivable to have one project profile targeting a locally output `book` project type while a second targets the deployment of a remote `website` type from the same source material.

If we don't invoke a profile we don't have explicit render or format targets and do not set an output dir.
However, we also have a way to set a 'default' project profile (for which we don't have to enter the option each time).

We can do so by slightly extending the base `_quarto.yml` file.

```yml
# _quarto.yml
author: Marty Oehme
csl: https://www.zotero.org/styles/apa

profile:
  group:
    - [local, remote]
```

The two profiles are now in a 'profile group', of which only one can ever be active and of which the
first one in the list will automatically be applied when invoking `quarto render` without any additional options.
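
The selection rule for such a group can be sketched in a few lines of Python.
This only illustrates the behaviour described here and is not Quarto code:
`active_profile` is a made-up helper, and the list mirrors the `profile.group` entry of the base file.

```python
from typing import Sequence


def active_profile(group: Sequence[str], requested: Sequence[str] = ()) -> str:
    """Pick the single active profile of one profile group (hypothetical helper)."""
    for name in requested:
        if name in group:
            # An explicitly requested member of the group wins.
            return name
    # No member of the group was requested: the first entry is the default.
    return group[0]


group = ["local", "remote"]
print(active_profile(group))  # plain `quarto render`
print(active_profile(group, ["remote"]))  # `quarto render --profile remote`
```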

This is how I have been doing it for the nuclear analysis: have a local (in my case I simply called it 'default')
profile which renders the current project to a local working directory using most of the usual quarto output,
such as html preview, and static outputs to double-check how everything is displayed.

Then, I added another profile on top which I called 'blog' and which outputs its renders directly into the
correct post directory of my blog.
There are, however, some remaining issues, detailed below.

### Static content in an Astro blog page

One issue arises in that quarto has its own way of stowing external fragments (like the PNGs of
visualizations) and this often does not automatically work with static site generators which
expect static files like this to reside in the 'static' (or 'public') directory instead of
next to the manuscript.

I have overcome this issue with a post-render script which runs after the main quarto processing is done,
by adding to the relevant `_quarto.yml`:

```yml
project:
  type: default
  output-dir: /path/to/my/blog/post/2024-07-02-directory
  render:
    - index.qmd
  post-render:
    - tools/move-static-to-blog.py /path/to/my/blog/static/dir
```

This way, we can first create all the necessary outputs in the normal quarto output directory and
afterwards have a script which takes the resulting static files and copies them to the
correct place in the blog's public directory.

I am not a huge fan of the amount of hard-coding this approach requires but it does seem like the
easiest way to just be able to hit render and have working results.

The following is one example of how to use python to move the required files to a specific static directory:

```python
#!/usr/bin/env python3
import os
import shutil
import sys
from pathlib import Path

# Safeguards to only move when necessary:
# Quarto sets QUARTO_PROJECT_RENDER_ALL only for a full project render.
if not os.getenv("QUARTO_PROJECT_RENDER_ALL"):
    sys.exit(0)
q_output_dir = os.getenv("QUARTO_PROJECT_OUTPUT_DIR")
if not q_output_dir:
    print("ERROR: Quarto did not provide an output dir (QUARTO_PROJECT_OUTPUT_DIR).")
    sys.exit(1)
# The Quarto output directory is the source we collect files from.
src = Path(q_output_dir)

args = sys.argv

files: list[Path] = []
# Get the correct dest and files from args
if len(args) < 2:
    print("Static output file directory for blog post-render is required.")
    sys.exit(1)
else:
    dest = Path(args[1])
    if len(args) > 2:
        for f in args[2:]:
            files.append(Path(f))

# Move safeguards
if not dest.is_dir():
    print(f"ERROR: Static output directory {dest} *does not exist*.")
    sys.exit(1)

# Without explicit files, collect everything below '*_files' directories
# in the Quarto output dir. Note: Path.walk() requires Python 3.12+.
if not files:
    for dirname in os.listdir(src):
        if dirname.endswith("_files"):
            dirpath = src.joinpath(dirname)
            for root, loc_dirs, loc_files in dirpath.walk():
                for file in loc_files:
                    files.append(root.joinpath(file))

# Copy everything flat into the root of the static directory
for f in files:
    shutil.copy(f, dest)
```

It simply requires the target directory as the first argument and uses the `QUARTO_PROJECT_OUTPUT_DIR` env var
(which Quarto supplies to any post-render script) as the source.
Then it copies either all files that have been mentioned as additional arguments (safer) or
all files that it finds in directories ending in '_files' (more dangerous).

Now all additional files reside in the root of the static file dir.
If you instead want to 'rebuild' the same structure in the static dir as in the source dir for your assets,
you will have to adjust the script to move between the root-relative file paths in the two folders
(and automatically generate new directories if necessary).
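
One way to make that adjustment is to compute each file's path relative to the source root and recreate it below the destination.
The sketch below is an assumption about how this could look rather than part of the script above;
`copy_preserving_structure` and its parameter names are hypothetical.

```python
import shutil
from pathlib import Path


def copy_preserving_structure(
    files: list[Path], src_root: Path, dest_root: Path
) -> list[Path]:
    """Copy files below dest_root, keeping their paths relative to src_root."""
    copied = []
    for f in files:
        # e.g. src_root/index_files/figure/a.png -> index_files/figure/a.png
        # (raises ValueError if a file does not live below src_root)
        relative = f.relative_to(src_root)
        target = dest_root / relative
        # Create missing intermediate directories before copying.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(f, target)
        copied.append(target)
    return copied
```

In the script above, this would replace the final copy loop, called with the Quarto output directory as `src_root` and `dest` as `dest_root`.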

This should take care of placing static files in the right places.

### Dynamic content in an Astro blog page

Getting the folium/leaflet map to work in a static site generator like [Astro](https://astro.build) was a bit of a pain.
Essentially, the concept of getting it to work is the same as for static content above:

We save the folium-produced html output as a static file and place that in the static file directory of the blog.
Then we integrate it into the page with an iframe html element.
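
As a rough illustration of the second step, the snippet below builds such an iframe element as a string.
It is a sketch under assumptions: the helper name `iframe_for`, the `/maps/explosions.html` path and the default height are invented for the example.
The first step, writing the html file itself (for instance with folium's `Map.save()`), is left out to keep the sketch free of dependencies.

```python
from html import escape


def iframe_for(src: str, title: str, height: int = 500) -> str:
    """Build an iframe element pointing at a statically served html file."""
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'title="{escape(title, quote=True)}" '
        f'width="100%" height="{height}" loading="lazy"></iframe>'
    )


# The path is given relative to the site root, where the static directory is served.
print(iframe_for("/maps/explosions.html", "Map of nuclear explosions"))
```

The resulting element can be pasted into the markdown source of the post as raw html.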

However, some issues arise in producing the static html file in the first place.
<!-- TODO: Expand on which issues:
Displaying image elements.
Manually saving to html.
Doing both conditionally. -->

## Remaining issues
|
## Remaining issues
|
||||||
|
|
||||||
While working with polars is wonderful and seaborn takes a lot of the stress of creating half-way nicely formatted plots out of mind while first creating them,
|
While working with polars is wonderful and seaborn takes a lot of the stress of creating half-way nicely formatted plots out of mind while first creating them,
|
||||||
|
@ -155,3 +404,4 @@ This project made use the fantastic python library [great tables]() which indeed
|
||||||
However, it primarily targets the html format.
|
However, it primarily targets the html format.
|
||||||
Getting this format into shape for quarto to then translate it into the pandoc AST and ultimately whatever format is not pretty.
|
Getting this format into shape for quarto to then translate it into the pandoc AST and ultimately whatever format is not pretty.
|
||||||
For example LaTeX routinely just crashes instead of rendering the table correctly into a PDF file.
|
For example LaTeX routinely just crashes instead of rendering the table correctly into a PDF file.
|
||||||
|
|
||||||
|
|