fix(script): Use new var names for working paper

Marty Oehme 2024-07-15 19:56:36 +02:00
parent 4e4f75ff7a
commit 4012ea55f0
Signed by: Marty
GPG key ID: EDBF2ED917B2EF6A

@@ -21,7 +21,12 @@ crossref:
 latex-list-of-description: Appendix Table
 ---
-{{< include 01-codechunks/_prep-data.qmd >}}
+```{python}
+#| label: prep-data
+#| echo: false
+#| output: false
+{{< include 01-codechunks/_prep-data.py >}}
+```
 <!-- pagebreak to separate from TOC -->
 {{< pagebreak >}}
@@ -281,8 +286,6 @@ To identify potential studies and create an initial sample, relevant terms for t
 Identified terms comprising the world of work can be found in @tbl-wow-terms,
 with the search query requiring a term from the general column and one other column.
-::: {#tbl-wow-terms}
 ```{python}
 #| label: tbl-wow-terms
 #| tbl-cap: World of work term cluster
@@ -290,10 +293,6 @@ terms_wow = pd.read_csv("02-data/supplementary/terms_wow.csv")
 md(tabulate(terms_wow.fillna(""), showindex=False, headers="keys", tablefmt="grid"))
 ```
-World of work term cluster
-:::
 The world of work cluster, like the inequality and policy intervention clusters below, is made up of a general signifier (such as "work", "inequality" or "intervention") which has to be labelled in a study to form part of the sample,
 as well as any additional terms looking into one or multiple specific dimensions or categories of these signifiers (such as "domestic" work, "gender" inequality, "maternity leave" intervention).
 At least one general term and at least one additional term have to be mentioned by a study to be identified for the initial sample pool.
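The selection rule stated in this hunk's context lines (at least one general term plus at least one term from any other column) can be sketched in a few lines of Python. This is only an illustration: `matches_cluster`, the column names, and the sample terms are assumptions made up for the example, not the contents of `terms_wow.csv` or the actual database query.

```python
def matches_cluster(text, cluster):
    """Return True if text mentions a general term and a term from any other column."""
    lowered = text.lower()
    # At least one term from the 'general' column must appear.
    general_hit = any(term in lowered for term in cluster["general"])
    # At least one term from any of the remaining columns must appear.
    specific_hit = any(
        term in lowered
        for column, terms in cluster.items()
        if column != "general"
        for term in terms
    )
    return general_hit and specific_hit


# Hypothetical cluster for illustration only; not the real term list.
example_cluster = {
    "general": ["work", "labour"],
    "paid": ["wage", "salary"],
    "unpaid": ["domestic", "care"],
}

both_terms = matches_cluster("Domestic work in rural areas", example_cluster)
general_only = matches_cluster("The future of work", example_cluster)
specific_only = matches_cluster("Wage trends", example_cluster)
```

The sketch uses plain substring matching, so "work" also matches "working"; a real query would likely rely on the database's own search syntax instead.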
@@ -303,21 +302,15 @@ Where terms have been identified from previous reviews outside the introduced IL
 their sources have been included in the table.
 For the database query, a single term from the general category is required to be included in addition to one term from *any* of the remaining categories.
-::: {#tbl-intervention-terms}
 ```{python}
 #| label: tbl-intervention-terms
-#| tbl-cap: Intervention term cluster
+#| tbl-cap: Policy intervention term cluster
 terms_policy = pd.read_csv("02-data/supplementary/terms_policy.csv")
 # different headers to include 'social norms'
 headers = ["General", "Institutional", "Structural", "Agency & social norms"]
 md(tabulate(terms_policy.fillna(""), showindex=False, headers=headers, tablefmt="grid"))
 ```
-Policy intervention term cluster
-:::
 Lastly, the inequality cluster is once again made up of a general term describing inequality which has to form part of the query results, as well as at least one term describing a specific vertical or horizontal inequality,
 as seen in @tbl-inequality-terms.
@@ -346,7 +339,7 @@ with a focus on the narrowing criteria specified in @tbl-inclusion-criteria.
 ::: {#tbl-inclusion-criteria}
 ```{python}
-#| label: tbl-inclusion-criteria
+#| label: inclusion-criteria
 inclusion_criteria = pd.read_csv("02-data/supplementary/inclusion-criteria.tsv", sep="\t")
 md(tabulate(inclusion_criteria, showindex=False, headers="keys", tablefmt="grid"))
@@ -407,7 +400,7 @@ as can be seen in @fig-publications-per-year.
 #| fig-cap: Publications per year
 df_study_years = (
-bib_df.groupby(["author", "year", "title"])
+df.groupby(["author", "year", "title"])
 .first()
 .reset_index()
 .drop_duplicates()
@@ -442,8 +435,8 @@ First, in general, citation counts are slightly decreasing --- as should general
 ```{python}
 #| label: fig-citations-per-year-avg
 #| fig-cap: Average citations per year
-bib_df["zot_cited"] = bib_df["zot_cited"].dropna().astype("int")
-grpd = bib_df.groupby(["year"], as_index=False)["zot_cited"].mean()
+df["zot_cited"] = df["zot_cited"].dropna().astype("int")
+grpd = df.groupby(["year"], as_index=False)["zot_cited"].mean()
 fig, ax = plt.subplots()
 ax.bar(grpd["year"], grpd["zot_cited"])
 sns.regplot(x=grpd["year"], y=grpd["zot_cited"], ax=ax)
@@ -484,7 +477,7 @@ analysing the main findings per policy area, as well as underscore individual st
 #| fig-cap: Available studies by primary type of intervention
 by_intervention = (
-bib_df
+df
 .fillna("")
 .groupby(["author", "year", "title", "design", "method", "representativeness", "citation"])
 .agg(
@@ -1112,7 +1105,7 @@ The authors suggest the primary channel is the newly increased bargaining power
 #| label: prep-inequalities-crosstabs
 # dataframe containing each intervention inequality pair
 df_inequality = (
-bib_df[["region", "intervention", "inequality"]]
+df[["region", "intervention", "inequality"]]
 .assign(
 Intervention = lambda _df: (_df["intervention"]
 .str.replace(r"\(.+\)", "", regex=True)
@@ -1149,7 +1142,7 @@ in which fewer studies have been identified.
 #| fig-cap: Studies by regions analysed
 by_region = (
-bib_df[["region"]]
+df[["region"]]
 .assign(
 region = lambda _df: (_df["region"]
 .str.replace(r" ?; ?", ";", regex=True)
@@ -1278,8 +1271,8 @@ many studies use income measurements and changes in income or income inequality
 ```{python}
 #| label: inequality-targeting-implicit-explicit
-targeting_majority = bib_df["targeting"].value_counts().index.tolist()[0]
-targeting_minority = bib_df["targeting"].value_counts().index.tolist()[-1]
+targeting_majority = df["targeting"].value_counts().index.tolist()[0]
+targeting_minority = df["targeting"].value_counts().index.tolist()[-1]
 ```
 Often, however, income inequality is not the primary inequality being targeted, but used to measure the effects on other inequalities by seeing how the effects of respective inequality and income intersect, as will be discussed in the following section.
@@ -1291,7 +1284,7 @@ with only a minority of studies looking at policies with an `{python} targeting_
 #| fig-cap: Types of inequality analysed
 by_inequality = (
-bib_df[["inequality"]]
+df[["inequality"]]
 .assign(
 inequality = lambda _df: (_df["inequality"]
 .str.replace(r"\(.+\)", "", regex=True)
@@ -1385,7 +1378,7 @@ with the other regions trailing further behind in output.
 #| fig-cap: Regional distribution of studies analysing gender inequalities
 by_region_and_inequality = (
-bib_df[["inequality", "region"]]
+df[["inequality", "region"]]
 .assign(
 region = lambda _df: (_df["region"]
 .str.replace(r" ?; ?", ";", regex=True)
@@ -1755,7 +1748,7 @@ Internal validity ranking. Adapted from @Maitrot2017.
 ```{python}
 #| label: apptbl-extraction-matrix
 #| tbl-cap: Extraction matrix
-bib_df
+df
 ```
 {{< pagebreak >}}
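
Since this commit renames `bib_df` to `df` throughout the document, a scan for leftover references to the old name could serve as a follow-up check. The sketch below is a minimal illustration under stated assumptions: `find_stale_names` is a hypothetical helper that is not part of this repository, and the sample text is invented for the example.

```python
import re


def find_stale_names(text, old_name="bib_df"):
    """Return (line_number, line) pairs where old_name occurs as a whole identifier."""
    # Lookarounds reject matches inside longer identifiers such as 'bib_df_backup'.
    pattern = re.compile(
        rf"(?<![A-Za-z0-9_]){re.escape(old_name)}(?![A-Za-z0-9_])"
    )
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


# Invented sample text; in practice this would be the contents of the .qmd file.
sample = "\n".join([
    'df_study_years = df.groupby(["author", "year", "title"])',
    'by_region = bib_df[["region"]]',
    "bib_df_backup = None",
])
stale = find_stale_names(sample)
```

An empty result would suggest the rename is complete for the scanned text; it says nothing about whether `df` itself is defined by the included prep-data chunk.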