fix(script): Split validity figures in two

Since the docx file otherwise did not contain correct representations
of the discussion section's validity robustness figures, I split them
into two separate figures (no sub-figures) instead. Now they behave
like any other figure.
This commit is contained in:
Marty Oehme 2024-02-28 10:22:21 +01:00
parent 1f61cf8098
commit 6e94f60ea6
Signed by: Marty
GPG key ID: EDBF2ED917B2EF6A


@ -1285,12 +1285,11 @@ Using the validity ranking separated into internal and external validity for eac
it is possible to identify the general make-up of the overall sample, it is possible to identify the general make-up of the overall sample,
the relationship between both dimensions and the distribution of studies within. the relationship between both dimensions and the distribution of studies within.
As can be seen in @fig-validity-relation, the relationship between the internal dimension and the external dimension of validity for the study pool follows a normal distribution. @fig-validity-relation shows the relation between each study's validity on the internal dimension and the external dimension,
with experimental studies additionally distinguished.
Generally, studies that have a lower internal validity, between 2.0 and 3.5, rank higher on their external validity, Generally, studies that have a lower internal validity, between 2.0 and 3.5, rank higher on their external validity,
while studies with higher internal validity in turn do not reach as high on the external validity ranking. while studies with higher internal validity in turn do not reach as high on the external validity ranking.
::: {layout-ncol=2 .column-body-outset}
```{python} ```{python}
#| label: fig-validity-relation #| label: fig-validity-relation
#| fig-cap: "Relation between internal and external validity" #| fig-cap: "Relation between internal and external validity"
@ -1318,23 +1317,6 @@ sns.swarmplot(
) )
``` ```
```{python}
#| label: fig-validity-distribution
#| fig-cap: "Distribution of internal validities"
sns.displot(
data=validities,
x="external_validity", hue="internal_validity",
kind="kde",
multiple="fill", clip=(0, None),
palette="ch:rot=-0.5,hue=1.5,light=0.9",
bw_adjust=.65, cut=0,
warn_singular = False
)
```
:::
Studies with an internal validity ranking of 3.0 (primarily made up of difference-in-difference approaches) and an internal ranking of 5.0 (randomized control trials) have the same tight clustering around an external validity between 4.0 (national) and 5.0 (census-based), and 2.0 (local) and 3.0 (subnational), respectively. Studies with an internal validity ranking of 3.0 (primarily made up of difference-in-difference approaches) and an internal ranking of 5.0 (randomized control trials) have the same tight clustering around an external validity between 4.0 (national) and 5.0 (census-based), and 2.0 (local) and 3.0 (subnational), respectively.
This clearly shows the expected overall relationship of studies with high internal validity generally ranking lower on their external validity. This clearly shows the expected overall relationship of studies with high internal validity generally ranking lower on their external validity.
@ -1351,10 +1333,25 @@ It additionally shows that studies with low internal validity make up the domina
while locally or non-representative samples are almost solely made up of internally highly valid (ranking 4.0 or above) analyses, while locally or non-representative samples are almost solely made up of internally highly valid (ranking 4.0 or above) analyses,
again with the exception of @Thoresen2021 already mentioned. again with the exception of @Thoresen2021 already mentioned.
```{python}
#| label: fig-validity-distribution
#| fig-cap: "Distribution of internal validities"
sns.displot(
data=validities,
x="external_validity", hue="internal_validity",
kind="kde",
multiple="fill", clip=(0, None),
palette="ch:rot=-0.5,hue=1.5,light=0.9",
bw_adjust=.65, cut=0,
warn_singular = False
)
```
Looking at the data per region, census-based studies are primarily spread between Latin America and the Caribbean, as well as Europe and Central Asia. Looking at the data per region, census-based studies are primarily spread between Latin America and the Caribbean, as well as Europe and Central Asia.
Meanwhile, studies using nationally, subnationally or non-representative data tend to have a larger focus on North America, as well as East Asia and the Pacific. Meanwhile, studies using nationally, subnationally or non-representative data tend to have a larger focus on North America, as well as East Asia and the Pacific.
A slight trend towards studies focusing on evidence-based research in developing countries is visible, though with an overall rising output, as seen in @fig-publications-per-year, A slight trend towards studies focusing on evidence-based research in developing countries is visible,
and the possibly a reliance on more recent datasets, this would be expected. though with an overall rising output as could be seen in @fig-publications-per-year, and the possibly a reliance on more recent datasets, this would be expected.
### Inequality types analysed ### Inequality types analysed