feat(script): Add validity discussion

Marty Oehme 2024-02-21 11:31:39 +01:00
parent cc9749a011
commit 4901baa5e5
Signed by: Marty
GPG key ID: EDBF2ED917B2EF6A

@ -1263,21 +1263,83 @@ One reason for such a differentiation could be a larger amount of gray literatur
which may be utilising less established terms than the majority of captured literature for policy implementations.
Another reason could be the actual implementation of different policy programmes which are then equally not captured by existing term clusters.
### Internal and external validity
Using the validity ranking separated into internal and external validity for each study,
it is possible to identify the general make-up of the overall sample,
the relationship between both dimensions and the distribution of studies within.
As can be seen in @fig-validity-relation, the relationship between the internal dimension and the external dimension of validity for the study pool follows a normal distribution.
Generally, studies that have a lower internal validity, between 2.0 and 3.5, rank higher on their external validity,
while studies with higher internal validity in turn do not reach as high on the external validity ranking.
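The inverse pattern described above can be checked numerically by comparing the median external ranking within each internal ranking. The following is a minimal sketch in plain Python, not the project's `validity` module: the `toy_rankings` values and the helper `median_external_by_internal` are hypothetical and invented purely for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (internal_validity, external_validity) pairs, one per study.
# These values are invented for illustration and are NOT the review's data.
toy_rankings = [
    (2.0, 4.0), (2.0, 5.0), (2.0, 3.0),
    (3.0, 4.0), (3.0, 5.0), (3.0, 4.0),
    (4.0, 3.0), (4.0, 4.0), (4.0, 2.0),
    (5.0, 2.0), (5.0, 3.0), (5.0, 2.0),
]


def median_external_by_internal(pairs):
    """Return {internal ranking: median external ranking}, sorted by internal ranking."""
    grouped = defaultdict(list)
    for internal, external in pairs:
        grouped[internal].append(external)
    return {internal: median(grouped[internal]) for internal in sorted(grouped)}


medians = median_external_by_internal(toy_rankings)
for internal, external in medians.items():
    print(f"internal {internal}: median external {external}")
```

If the medians fall as the internal ranking rises, the sample shows the trade-off discussed here; with the real study pool, the pairs would presumably come from the `internal_validity` and `external_validity` columns instead.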
::: {layout-ncol=2 .column-body-outset}
```{python}
#| label: fig-validity-relation
#| fig-cap: "Relation between internal and external validity"
import matplotlib.pyplot as plt
import seaborn as sns

from src.model import validity

validities = validity.calculate(by_intervention)
# Short display identifier: author text before the first comma, plus the year
validities["identifier"] = validities["author"].str.replace(r',.*$', '', regex=True) + " (" + validities["year"].astype(str) + ")"
# Keep only (quasi-)experimental designs; copy so the column assignment below is safe
validities = validities.loc[validities["design"].isin(["quasi-experimental", "experimental"])].copy()
#validities["external_validity"] = validities["external_validity"].astype('category')
validities["internal_validity"] = validities["internal_validity"].astype('category')
plt.figure().set_figheight(5)
sns.violinplot(
data=validities,
x="internal_validity", y="external_validity", hue="design",
cut=0, bw_method="scott",
orient="x"
)
sns.swarmplot(
data=validities,
x="internal_validity", y="external_validity", legend=False,
color="darkmagenta",
s=4
)
# Create a stacked histplot using Seaborn
#sns.scatterplot(data=validities, x='external_validity', y='internal_validity', hue='intervention')
```
```{python}
#| label: fig-validity-distribution
#| fig-cap: "Distribution of internal validities"
sns.displot(
data=validities,
x="external_validity", hue="internal_validity",
kind="kde",
multiple="fill", clip=(0, None),
palette="ch:rot=-0.5,hue=1.5,light=0.9",
bw_adjust=.65, cut=0,
warn_singular = False
)
```
:::
Studies with an internal validity ranking of 3.0 (primarily difference-in-difference approaches) and of 5.0 (randomized control trials) show similarly tight clustering: the former around an external validity between 4.0 (national) and 5.0 (census-based), the latter between 2.0 (local) and 3.0 (subnational).
This clearly shows the expected overall relationship of studies with high internal validity generally ranking lower on their external validity.
The situation is less clear-cut with the internal rankings of 2.0 (primarily ordinary least squares) and 4.0 (primarily instrumental variable),
which show a larger external validity spread.
For 2.0-ranked studies, most use nationally representative data,
while a considerable number make use of census-based data and others in turn are only subnationally representative.
Studies ranked 4.0 internally have a higher heterogeneity with the significant outlier of @Thoresen2021,
which had the limitation of its underlying data being non-representative.
Looking at the overall density of studies along their external validity dimension,
@fig-validity-distribution reiterates this overall relationship with internal validity.
It additionally shows that studies with low internal validity make up the dominant number of nationally representative analyses and the slight majority of census-based analyses,
while locally or non-representative samples are almost solely made up of internally highly valid (ranking 4.0 or above) analyses,
again with the exception of @Thoresen2021 already mentioned.
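The composition read off the filled density plot has a simple discrete analogue: for each external ranking, the share of studies belonging to each internal ranking. The sketch below illustrates this in plain Python. The `toy_rankings` values and the helper `internal_shares_by_external` are hypothetical and not part of the project code or its data.

```python
from collections import Counter

# Hypothetical (internal_validity, external_validity) pairs; invented values,
# NOT the review's data.
toy_rankings = [
    (2.0, 4.0), (2.0, 4.0), (2.0, 5.0), (3.0, 4.0),
    (3.0, 5.0), (4.0, 3.0), (5.0, 2.0), (5.0, 2.0),
]


def internal_shares_by_external(pairs):
    """For each external ranking, return the share of studies per internal ranking."""
    counts = Counter(pairs)  # studies per (internal, external) cell
    totals = Counter(external for _, external in pairs)  # studies per external ranking
    shares = {}
    for (internal, external), n in counts.items():
        shares.setdefault(external, {})[internal] = n / totals[external]
    return shares


shares = internal_shares_by_external(toy_rankings)
for external in sorted(shares):
    print(external, shares[external])
```

Within each external ranking the shares sum to one, which corresponds to what `multiple="fill"` does for the smoothed densities in the plot.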
Looking at the data per region, census-based studies are primarily spread between Latin America and the Caribbean, as well as Europe and Central Asia.
Meanwhile, studies using nationally, subnationally or non-representative data tend to have a larger focus on North America, as well as East Asia and the Pacific.
A slight trend towards studies focusing on evidence-based research in developing countries is visible, though with an overall rising output, as seen in @fig-publications-per-year,
and possibly a reliance on more recent datasets, this would be expected.
### Inequality types analysed
Policy interventions undertaken either with the explicit aim of reducing one or multiple inequalities, or analysed under the lens of such an aim implicitly, appear in a wide array of variations to their approach and primary targeted inequality, as was highlighted in the previous section.