Add meta observations
parent
bff3cb22fa
commit
910e72c0e2
1 changed files with 255 additions and 5 deletions
260
meta.md
@ -1,7 +1,27 @@
This page documents some meta observations about my time recreating the nuclear explosions in this post,
mostly some little tips to work well with python polars and seaborn, or little tricks to integrate them and geopandas visualizations.
---
title: Reproducible blog posts
subtitle: "Moving from Quarto manuscript to Astro markdown post output"
description: "Moving from Quarto manuscript to Astro markdown post output"
pubDate: "2024-07-08T10:06:38"
weight: 10
tags:
- python
- astro
---

## From a lat/long polars dataframe to geopandas
This page documents some meta observations about my time recreating the nuclear
explosions in the last post.

It goes over some little tips to work well with python polars and seaborn,
as well as little tricks to integrate them and geopandas visualizations.

Finally, I make some observations on the actual process of transferring this
produced output into a blog post on my website, written in the Astro static
site builder framework.

## Data modelling and visualization

### From a lat/long polars dataframe to geopandas

To go from a polars frame to one we can use for GIS operations with geopandas is fairly simple:
We first move from a polars to an indexed pandas frame, in this case I have indexed on the date of each explosion.

@ -20,7 +40,7 @@ gdf = gpd.GeoDataFrame(
del df_pd
```

## Keeping the same seaborn color palette for the same categories
### Keeping the same seaborn color palette for the same categories

For the analysis, I have multiple plots which distinguish between the different countries undertaking nuclear detonations.
The country category thus appears repeatedly, and with static values (i.e. it will always contain 'US', 'USSR', 'China', 'France' and so on).

@ -101,7 +121,7 @@ folium.GeoJson(
).add_to(map)
```

## Using dictionary keys to create folium map layers
### Using dictionary keys to create folium map layers

As a bonus we can even use our color category keys to create different layers on the folium map which can be turned on and off individually.
Thus we can decide which country's detonations we want to visualize.

@ -134,6 +154,235 @@ for country in country_colors.keys():
folium.LayerControl().add_to(m)
```

## Output wrangling

### Multiple project profiles

During development and analysis I have only had a single project which then in turn targeted two formats:
`html` for previews and dynamic elements and `pdf` (in truth the new `typst`) for checking static elements.
The following `_quarto.yml` file describes a full working project:

```yml
author: Marty Oehme
csl: https://www.zotero.org/styles/apa

project:
  type: default
  output-dir: output
  render:
    - index.qmd
    - meta.md

format:
  html:
    code-fold: true
    toc: true
    echo: true
  typst:
    toc: true
    echo: false
    citeproc: true
  docx:
    toc: true
    echo: false
```

This works well for single-'target' deployments which may arrive in multiple formats but are fundamentally the same.
What happens, however, if we target something completely different (in my case this Astro blog)
which may not even reside in the same directory?

We can create what quarto calls 'project profiles', simply by creating additional `_quarto-myprofile.yml` files in the project root.
They will grab all the yaml data from the original `_quarto.yml` file and then add to and overwrite it with their own file's data to create the overall profile.

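As a rough mental model of this 'add and overwrite' behaviour, the following sketch overlays one nested dictionary on another. The function name `merge_profile` and the sample data are hypothetical, and Quarto's real merging rules (for example for lists) may well differ, so treat it as an illustration only:

```python
from copy import deepcopy


def merge_profile(base: dict, profile: dict) -> dict:
    """Return base overlaid with profile (hypothetical helper, not Quarto's code).

    Nested mappings are merged key by key, any other value from the
    profile replaces the value from the base.
    """
    merged = deepcopy(base)
    for key, value in profile.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_profile(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Made-up stand-ins for a base file and a profile file
base = {"author": "Marty Oehme", "format": {"html": {"toc": True}}}
profile = {"project": {"output-dir": "output"}, "format": {"html": {"toc": False}}}
effective = merge_profile(base, profile)
```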

So if we have the following two files:

```yml
# _quarto.yml
author: Marty Oehme
csl: https://www.zotero.org/styles/apa
```

```yml
# _quarto-local.yml
project:
  type: default
  output-dir: output
  render:
    - index.qmd
    - meta.md

format:
  html:
    code-fold: true
    toc: true
    echo: true
  typst:
    toc: true
    echo: false
    citeproc: true
  docx:
    toc: true
    echo: false
```

We have essentially recreated the above project, only as a project 'profile' to be invoked as `quarto render --profile local`.

Now, however, we can add a second `_quarto-remote.yml` profile:

```yml
# _quarto-remote.yml
project:
  type: default
  output-dir: some/remote/directory/maybe/even/over/nfs/or/sshfs
  render:
    - index.qmd

format:
  hugo-md:
    preserve-yaml: true
    code-fold: true
    keep-ipynb: true
    wrap: none
  typst:
    toc: true
    echo: false
    citeproc: true
```

If we invoke this profile with `quarto render --profile remote` we output to a different directory altogether,
and have completely different render targets than in the local profile
(in this case the same `typst` format and the new `hugo-md` format, while not rendering to `docx` at all).

This way we can separate different deployments beyond just carrying different formats by actually extending
and overwriting all kinds of project options.[^projtypes]

[^projtypes]: It would for example even be conceivable to have one project profile targeting a locally output `book` project type while a second targets the deployment of a remote `website` type from the same source material.

If we don't invoke the profile we don't have explicit render or format targets and do not set an output dir.
However, we also have a way to set a 'default' project profile (for which we don't have to enter the option each time).

We can do so by slightly extending the base `_quarto.yml` file.

```yml
# _quarto.yml
author: Marty Oehme
csl: https://www.zotero.org/styles/apa

profile:
  group:
    - [local, remote]
```

The two profiles are now in a 'profile group', of which only one can ever be active and of which the
first one in the list will automatically be applied when invoking `quarto render` without any additional options.


This is how I have been doing it for the nuclear analysis: have a local (in my case I simply called it 'default')
profile which renders the current project to a local working directory using most of the usual quarto output,
such as html preview, and static outputs to double-check how everything is displayed.

Then, I added another profile on top which I called 'blog' and which outputs its renders directly into the
correct post directory of my blog.
There are, however, some remaining issues, detailed below.

### Static content in an Astro blog page

One issue arises in that quarto has its own way of stowing external fragments (like the PNGs of
visualizations) and this often does not automatically work with static site generators which
expect static files like this to reside in the 'static' (or 'public') directory instead of
next to the manuscript.

I have overcome this issue with a 'post-script' which runs after the main quarto processing is done,
by adding to the relevant `_quarto.yml`:

```yml
project:
  type: default
  output-dir: /path/to/my/blog/post/2024-07-02-directory
  render:
    - index.qmd
  post-render:
    - tools/move-static-to-blog.py /path/to/my/blog/static/dir
```

This way, we can first create all the necessary outputs in the normal quarto output directory and
afterwards have a script which takes the resulting static files and instead moves them to the
correct place in the blog's public directory.

I am not a huge fan of the amount of hard-coding this approach requires but it does seem like the
easiest way to just be able to hit render and have working results.

The following is one example of how to use python to move the required files to a specific static directory:

```python
#!/usr/bin/env python3
import os
import shutil
import sys
from pathlib import Path

# Safeguards to only move when necessary:
# only act on full project renders and only with a valid Quarto output dir
if not os.getenv("QUARTO_PROJECT_RENDER_ALL"):
    sys.exit(0)
q_output_dir = os.getenv("QUARTO_PROJECT_OUTPUT_DIR")
if not q_output_dir or not Path(q_output_dir).is_dir():
    print(f"ERROR: Output dir {q_output_dir} given by Quarto *does not exist*.")
    sys.exit(1)
source = Path(q_output_dir)

args = sys.argv

files: list[Path] = []
# Get the correct dest and files from args
if len(args) < 2:
    print("Static output file directory for blog post-render is required.")
    sys.exit(1)
else:
    dest = Path(args[1])
    if len(args) > 2:
        for f in args[2:]:
            files.append(Path(f))

# Move safeguards
if not dest.is_dir():
    print(f"ERROR: Static output directory {dest} *does not exist*.")
    sys.exit(1)

# Without explicit files, collect everything below the '*_files' directories
# of the Quarto output dir (Path.walk requires Python 3.12+)
if not files:
    for dirpath in source.iterdir():
        if dirpath.is_dir() and dirpath.name.endswith("_files"):
            for root, loc_dirs, loc_files in dirpath.walk():
                for file in loc_files:
                    files.append(root.joinpath(file))

for f in files:
    shutil.copy(f, dest)
```

It simply requires the target directory as the first argument and uses the `QUARTO_PROJECT_OUTPUT_DIR` env var
(which Quarto supplies to any post-render script) as the source.
Then it copies either all files that have been mentioned as additional arguments (safer) or
all files that it finds in directories ending in '_files' (more dangerous).

Now all additional files reside in the root of the static file dir.
If you instead want to 'rebuild' the same structure in the static dir as in the source dir for your assets,
you will have to adjust the script to move between the root-relative file paths in the two folders
(and automatically generate new directories if necessary).

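One possible shape for such an adjustment is sketched below. The helper `copy_preserving_structure` is hypothetical and assumes that every file lies somewhere below the source root:

```python
import shutil
from pathlib import Path


def copy_preserving_structure(
    source_root: Path, dest_root: Path, files: list[Path]
) -> list[Path]:
    """Copy files to dest_root, keeping their path relative to source_root."""
    copied = []
    for file in files:
        # e.g. out/index_files/fig.png -> static/index_files/fig.png
        target = dest_root / file.relative_to(source_root)
        # Create missing intermediate directories on the fly
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(file, target)
        copied.append(target)
    return copied
```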

This should take care of placing static files in the right places.

### Dynamic content in an Astro blog page

Getting the folium/leaflet map to work in a static site generator like [Astro](https://astro.build) was a bit of a pain.
Essentially, the concept of getting it to work is the same as for static content above:

We save the folium-produced html output as a static file and place that in the static file directory of the blog.
Then we integrate it into the page with an iframe html element.

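A minimal sketch of these two steps could look as follows. The helper name, the `url_prefix` parameter and the iframe dimensions are assumptions for illustration; the only thing it relies on is that the map object offers a `save(outfile)` method, as a `folium.Map` does:

```python
from html import escape
from pathlib import Path


def save_map_with_iframe(
    map_obj, static_dir: Path, filename: str, url_prefix: str = "/"
) -> str:
    """Save a map as standalone html and return an iframe snippet for it.

    Assumes static_dir is served at url_prefix by the site generator.
    """
    # Write the standalone html document into the static directory
    map_obj.save(str(static_dir / filename))
    src = escape(f"{url_prefix}{filename}", quote=True)
    # Width and height are arbitrary example values
    return f'<iframe src="{src}" width="100%" height="500"></iframe>'
```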

However, some issues arise in producing the static html file in the first place.
<!-- TODO: Expand on which issues:
Displaying image elements.
Manually saving to html.
Doing both conditionally. -->

## Remaining issues

While working with polars is wonderful and seaborn takes a lot of the stress of creating half-way nicely formatted plots out of mind while first creating them,

@ -155,3 +404,4 @@ This project made use the fantastic python library [great tables]() which indeed
However, it primarily targets the html format.
Getting this format into shape for quarto to translate it into the pandoc AST, and ultimately into whatever output format is targeted, is not pretty.
For example LaTeX routinely just crashes instead of rendering the table correctly into a PDF file.
