fix(script): Use new var names for working paper
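
The change renames the working dataframe from `bib_df` to `df` throughout the document. One way to confirm that no stale references remain is sketched below. The helper name `find_stale_names` and the sample text are hypothetical and are not part of the repository.

```python
import re


def find_stale_names(text: str, old_name: str = "bib_df") -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that still mention old_name as a whole word."""
    # \b keeps the match from firing on longer identifiers such as my_bib_df2
    pattern = re.compile(rf"\b{re.escape(old_name)}\b")
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


# Invented two-line sample: the first line is already renamed, the second is not
sample = 'grpd = df.groupby(["year"])\nby_region = bib_df[["region"]]\n'
for number, line in find_stale_names(sample):
    print(f"{number}: {line}")
```

In practice the text passed in would be the contents of `scoping_review.qmd` and the included code chunk files.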

2024-07-15 19:56:36 +02:00
commit 4012ea55f0
parent 4e4f75ff7a
1 changed file with 19 additions and 26 deletions
--- a/scoping_review.qmd
+++ b/scoping_review.qmd
@@ -21,7 +21,12 @@ crossref:
      latex-list-of-description: Appendix Table
 ---
-{{< include 01-codechunks/_prep-data.qmd >}}
+```{python}
+#| label: prep-data
+#| echo: false
+#| output: false
+{{< include 01-codechunks/_prep-data.py >}}
+```
 <!-- pagebreak to separate from TOC -->
 {{< pagebreak >}}
@@ -281,8 +286,6 @@ To identify potential studies and create an initial sample, relevant terms for t
 Identified terms comprising the world of work can be found in @tbl-wow-terms,
 with the search query requiring a term from the general column and one other column.
 ::: {#tbl-wow-terms}
 ```{python}
 #| label: tbl-wow-terms
 #| tbl-cap: World of work term cluster
@@ -290,10 +293,6 @@ terms_wow = pd.read_csv("02-data/supplementary/terms_wow.csv")
 md(tabulate(terms_wow.fillna(""), showindex=False, headers="keys", tablefmt="grid"))
 ```
 World of work term cluster
 :::
 The world of work cluster, like the inequality and policy intervention clusters below, is made up of a general signifier (such as "work", "inequality" or "intervention") which has to be labelled in a study to form part of the sample,
 as well as any additional terms looking into one or multiple specific dimensions or categories of these signifiers (such as "domestic" work, "gender" inequality, "maternity leave" intervention).
 At least one general term and at least one additional term have to be mentioned by a study to be identified for the initial sample pool.
@@ -303,21 +302,15 @@ Where terms have been identified from previous reviews outside the introduced IL
 their sources have been included in the table.
 For the database query, a single term from the general category is required to be included in addition to one term from *any* of the remaining categories.
 ::: {#tbl-intervention-terms}
 ```{python}
 #| label: tbl-intervention-terms
-#| tbl-cap: Intervention term cluster
+#| tbl-cap: Policy intervention term cluster
 terms_policy = pd.read_csv("02-data/supplementary/terms_policy.csv")
 # different headers to include 'social norms'
 headers = ["General", "Institutional", "Structural", "Agency & social norms"]
 md(tabulate(terms_policy.fillna(""), showindex=False, headers=headers, tablefmt="grid"))
 ```
 Policy intervention term cluster
 :::
 Lastly, the inequality cluster is once again made up of a general term describing inequality which has to form part of the query results, as well as at least one term describing a specific vertical or horizontal inequality,
 as seen in @tbl-inequality-terms.
@@ -346,7 +339,7 @@ with a focus on the narrowing criteria specified in @tbl-inclusion-criteria.
 ::: {#tbl-inclusion-criteria}
 ```{python}
-#| label: tbl-inclusion-criteria
+#| label: inclusion-criteria
 inclusion_criteria = pd.read_csv("02-data/supplementary/inclusion-criteria.tsv", sep="\t")
 md(tabulate(inclusion_criteria, showindex=False, headers="keys", tablefmt="grid"))
@@ -407,7 +400,7 @@ as can be seen in @fig-publications-per-year.
 #| fig-cap: Publications per year
 df_study_years = (
-    bib_df.groupby(["author", "year", "title"])
+    df.groupby(["author", "year", "title"])
    .first()
    .reset_index()
    .drop_duplicates()
@@ -442,8 +435,8 @@ First, in general, citation counts are slightly decreasing --- as should general
 ```{python}
 #| label: fig-citations-per-year-avg
 #| fig-cap: Average citations per year
-bib_df["zot_cited"] = bib_df["zot_cited"].dropna().astype("int")
+df["zot_cited"] = df["zot_cited"].dropna().astype("int")
-grpd = bib_df.groupby(["year"], as_index=False)["zot_cited"].mean()
+grpd = df.groupby(["year"], as_index=False)["zot_cited"].mean()
 fig, ax = plt.subplots()
 ax.bar(grpd["year"], grpd["zot_cited"])
 sns.regplot(x=grpd["year"], y=grpd["zot_cited"], ax=ax)
@@ -484,7 +477,7 @@ analysing the main findings per policy area, as well as underscore individual st
 #| fig-cap: Available studies by primary type of intervention
 by_intervention = (
-    bib_df
+    df
    .fillna("")
    .groupby(["author", "year", "title", "design", "method", "representativeness", "citation"])
    .agg(
@@ -1112,7 +1105,7 @@ The authors suggest the primary channel is the newly increased bargaining power
 #| label: prep-inequalities-crosstabs
 # dataframe containing each intervention inequality pair
 df_inequality = (
-    bib_df[["region", "intervention", "inequality"]]
+    df[["region", "intervention", "inequality"]]
    .assign(
        Intervention = lambda _df: (_df["intervention"]
            .str.replace(r"\(.+\)", "", regex=True)
@@ -1149,7 +1142,7 @@ in which fewer studies have been identified.
 #| fig-cap: Studies by regions analysed
 by_region = (
-    bib_df[["region"]]
+    df[["region"]]
    .assign(
        region = lambda _df: (_df["region"]
            .str.replace(r" ?; ?", ";", regex=True)
@@ -1278,8 +1271,8 @@ many studies use income measurements and changes in income or income inequality
 ```{python}
 #| label: inequality-targeting-implicit-explicit
-targeting_majority = bib_df["targeting"].value_counts().index.tolist()[0]
+targeting_majority = df["targeting"].value_counts().index.tolist()[0]
-targeting_minority = bib_df["targeting"].value_counts().index.tolist()[-1]
+targeting_minority = df["targeting"].value_counts().index.tolist()[-1]
 ```
 Often, however, income inequality is not the primary inequality being targeted, but used to measure the effects on other inequalities by seeing how the effects of respective inequality and income intersect, as will be discussed in the following section.
@@ -1291,7 +1284,7 @@ with only a minority of studies looking at policies with an `{python} targeting_
 #| fig-cap: Types of inequality analysed
 by_inequality = (
-    bib_df[["inequality"]]
+    df[["inequality"]]
    .assign(
        inequality = lambda _df: (_df["inequality"]
            .str.replace(r"\(.+\)", "", regex=True)
@@ -1385,7 +1378,7 @@ with the other regions trailing further behind in output.
 #| fig-cap: Regional distribution of studies analysing gender inequalities
 by_region_and_inequality = (
-    bib_df[["inequality", "region"]]
+    df[["inequality", "region"]]
    .assign(
        region = lambda _df: (_df["region"]
            .str.replace(r" ?; ?", ";", regex=True)
@@ -1755,7 +1748,7 @@ Internal validity ranking. Adapted from @Maitrot2017.
 ```{python}
 #| label: apptbl-extraction-matrix
 #| tbl-cap: Extraction matrix
-bib_df
+df
 ```
 {{< pagebreak >}}
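
Review note on the `zot_cited` line in the `fig-citations-per-year-avg` chunk: the renamed statement keeps the pattern `df["zot_cited"] = df["zot_cited"].dropna().astype("int")`. Assigning the shortened Series back aligns it on the index, so the dropped rows return as `NaN` and the column ends up as float rather than int. The sketch below illustrates this on invented data, assuming the column holds numeric strings with some missing values (the real column contents are not shown in this diff), and shows one possible alternative using pandas' nullable `Int64` dtype.

```python
import pandas as pd

# Invented sample; the real data comes from the prep-data chunk
df = pd.DataFrame({"year": [2019, 2020, 2021], "zot_cited": ["12", None, "3"]})

# Pattern from the diff: drop missing values, cast, assign back.
# The assignment re-aligns on the index, so row 1 becomes NaN again.
realigned = df["zot_cited"].dropna().astype("int")
as_in_diff = df.assign(zot_cited=realigned)
print(as_in_diff["zot_cited"].dtype)

# Possible alternative (an assumption, not the author's method):
# coerce to numbers first, then use the nullable integer dtype.
nullable = df.assign(
    zot_cited=pd.to_numeric(df["zot_cited"], errors="coerce").astype("Int64")
)
print(nullable["zot_cited"].dtype)
```

The yearly mean used for the plot skips missing values in both variants, so the figure itself should be unaffected; the difference only matters where an integer column is expected downstream.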