feat(script): Add term cluster tables

Marty Oehme 2023-10-12 12:25:11 +02:00
parent bf350a18db
commit 8724c182fd
Signed by: Marty
GPG key ID: EDBF2ED917B2EF6A


@ -295,12 +295,195 @@ The search protocol then follows a three-staged process of execution: identifica
First, in identification, the above categorizations are combined through Boolean operators to conduct a search through the database repository Web of Science.
The search itself is conducted with English-language search queries only.
<!-- TODO will we be using gray lit? -->
Relevant results are then complemented through the adoption of a 'snowballing' technique, which analyses an array of published reviews for their reference lists to find cross-references of potentially missing literature.
To identify potential studies and create an initial sample, relevant terms for the clusters of world of work, inequality and policy interventions have been extracted from the existing reviews as well as the ILO definitions.
Identified terms comprising the world of work can be found in @tbl-wow-terms,
with the search query requiring a term from each column.
```{python}
#| label: tbl-wow-terms
#| tbl-cap: World of work term cluster
wow_terms_cluster = {
"General": pd.Series([
"work",
"labour",
"production of goods",
"provision of services",
"own-use",
"use by others",
"of working age",
"for pay",
"for profit",
"remuneration",
"market transactions"
]),
"Forms of work": pd.Series([
"own-use",
"employment",
"unpaid trainee",
"volunteer",
"other work activities",
"wage-employed",
"self-employed",
"formal work",
"informal work",
"domestic work",
"care work",
"unpaid work",
]),
"Labour market outcomes": pd.Series([
"employment outcomes",
"labour rights",
"equality of opportunity",
"equality of outcome",
"labour force participation [@Pinto2021]",
"labour force exit [@Silvaggi2020]",
"job quality [@Finlay2021]",
"career advancement [@Finlay2021]",
"hours worked [@Finlay2021]",
"wage",
"salary",
"return to work [@Silvaggi2020]",
])
}
df = pd.DataFrame(wow_terms_cluster)
md(tabulate(df.fillna(""), headers=wow_terms_cluster.keys(), showindex=False, tablefmt="grid"))
```
The world of work cluster, like the inequality and policy intervention clusters below, is made up of a general signifier (such as "work", "inequality" or "intervention"), which a study has to mention to form part of the sample,
as well as additional terms addressing one or more specific dimensions or categories of that signifier (such as "domestic" work, "gender" inequality or a "maternity leave" intervention).
A study has to mention at least one general term and at least one additional term to be identified for the initial sample pool.
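The requirement of one general and one additional term can be sketched as a Boolean query string. The following is a minimal illustration only: the helper names `strip_citation`, `or_group` and `cluster_query` are hypothetical, the term lists are abbreviated from the table above, and database-specific field tags are deliberately left out.

```python
import re


def strip_citation(term):
    """Drop a trailing Pandoc citation such as ' [@Pinto2021]' from a term."""
    return re.sub(r"\s*\[@[^\]]*\]", "", term).strip()


def or_group(terms):
    """Join terms with OR, quoting each so multi-word phrases stay intact."""
    return "(" + " OR ".join(f'"{strip_citation(t)}"' for t in terms) + ")"


def cluster_query(general, specific):
    """Require at least one general AND at least one specific term."""
    return f"({or_group(general)} AND {or_group(specific)})"


# Abbreviated term lists taken from the world of work cluster (illustrative)
general_terms = ["work", "labour"]
specific_terms = ["domestic work", "labour force participation [@Pinto2021]"]

wow_query = cluster_query(general_terms, specific_terms)
print(wow_query)
```

Stripping the citation markers before building the query keeps the source attributions in the rendered table while preventing them from leaking into the search string.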
For the policy intervention cluster, a variety of terms have been identified from the ILO policy areas and guidelines as well as from existing reviews, as can be seen in @tbl-intervention-terms.
Where terms have been identified from previous reviews outside the introduced ILO policy guidelines,
their source has been included in the table.
For the database query, a single term from the general category is required to be included in addition to one term from *any* of the remaining categories.
```{python}
#| label: tbl-intervention-terms
#| tbl-cap: Policy intervention term cluster
policy_terms_cluster = {
"General" : pd.Series([
"intervention",
"policy",
"participation",
"targeting/targeted",
"distributive",
"redistributive",
]),
"Institutional" : pd.Series([
"support for childcare [@Perez2022]",
"labour rights",
"minimum wage",
"collective bargaining",
"business sustainability promotion",
"work-life balance promotion",
"equal pay for work of equal value",
"removal of (discriminatory) law",
"law reformation",
"guaranteed income [@Perez2022]",
"universal basic income [@Perez2022]",
"provision of living wage [@Perez2022]",
"maternity leave [@Chang2021]",
]),
"Structural" : pd.Series([
"cash benefits",
"services in kind",
"green transition",
"infrastructure",
"digital infrastructure",
"quality of education",
"public service improvement",
"lowering of gender segregation",
"price stability intervention",
"extended social protection scheme",
"comprehensive social protection",
"sustainable social protection",
"supported employment [@Lettieri2017]",
"vocational rehabilitation [@Silvaggi2020; @Lettieri2017]",
]),
"Agency" : pd.Series([
"credit programs [@Perez2022]",
"career guidance",
"vocational guidance [@Nevala2015]",
"vocational counselling [@Nevala2015]",
"counteracting of stereotypes",
"commuting subsidies [@Perez2022]",
"housing mobility programs [@Perez2022]",
"encouraging re-situation/migration [@Perez2022]",
"encouraging self-advocacy [@Nevala2015]",
"cognitive behavioural therapy [@Lettieri2017]",
"computer-assisted therapy [@Lettieri2017]",
"work organization [@Nevala2015]",
"special transportation [@Nevala2015]",
])
}
# different headers to include 'social norms'
headers = ["General", "Institutional", "Structural", "Agency & social norms"]
df = pd.DataFrame(policy_terms_cluster)
md(tabulate(df.fillna(""), headers=headers, showindex=False, tablefmt="grid"))
```
Lastly, the inequality cluster is once again made up of a general term describing inequality which has to form part of the query results, as well as at least one term describing a specific vertical or horizontal inequality,
as seen in @tbl-inequality-terms.
```{python}
#| label: tbl-inequality-terms
#| tbl-cap: Inequality term cluster
inequality_terms_cluster = {
"General": pd.Series([
"inequality",
"barrier",
"advantaged",
"disadvantaged",
"discriminated",
"disparity",
"horizontal inequality",
"vertical inequality",
]),
"Vertical": pd.Series([
"income",
"Palma ratio [@DFI2023]",
"Gini coefficient [@DFI2023]",
"class [@Kalasa2021]",
"fertility [@Kalasa2021]",
"bottom percentile",
"top percentile"
]),
"Horizontal": pd.Series([
"identity",
"demographic",
"gender",
"colour",
"beliefs",
"racial",
"ethnic",
"migrant",
"spatial",
"rural",
"urban",
"mega-cities",
"small cities",
"peripheral cities",
"age",
"nationality",
"ethnicity",
"health status",
"disability",
"characteristics",
])
}
df = pd.DataFrame(inequality_terms_cluster)
md(tabulate(df.fillna(""), headers=inequality_terms_cluster.keys(), showindex=False, tablefmt="grid"))
```
Within each cluster, a general term and a category-specific term will both be required, combined through an intersection merge (Boolean 'AND').
The three clusters are in turn combined with each other through a further intersection merge, so that a result has to match all three.
The resulting sample pool will thus include a term and specific dimension of inequality and of policy intervention within the world of work.
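How the three clusters nest into a single query can be sketched as follows. This is an assumption-laden illustration rather than the query actually submitted: the helpers `any_of` and `all_of` are hypothetical, and each cluster is reduced to a handful of terms from the tables above.

```python
def any_of(terms):
    """Union of terms (Boolean OR), with each term quoted as a phrase."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def all_of(parts):
    """Intersection of sub-queries (Boolean AND)."""
    return "(" + " AND ".join(parts) + ")"


# Abbreviated clusters; the full term lists are given in the tables above
clusters = {
    "world of work": {
        "general": ["work", "labour"],
        "specific": ["care work", "wage"],
    },
    "inequality": {
        "general": ["inequality", "disparity"],
        "specific": ["gender", "income"],
    },
    "policy intervention": {
        "general": ["intervention", "policy"],
        "specific": ["minimum wage", "maternity leave"],
    },
}

# Inner AND: general and specific term within a cluster.
# Outer AND: all three clusters have to match.
full_query = all_of(
    all_of([any_of(c["general"]), any_of(c["specific"])])
    for c in clusters.values()
)
print(full_query)
```

Keeping the inner and outer intersections as separate steps mirrors the two-level requirement: a study missing either the general or the specific term of any single cluster drops out of the sample pool.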
Second, in screening, duplicate results are removed and the resulting literature sample is sorted according to a variety of exclusion characteristics: language, title, abstract, full text and supersession of the literature by newer publications.
Properties in these characteristics are used to assess an individual study on its suitability for further review.
@ -322,17 +505,16 @@ An overview of the respective criteria used for inclusion or exclusion can be fo
: Study inclusion and exclusion scoping criteria {#tbl-inclusion-criteria}
To facilitate the screening process, a system of keywords in the Zotero reference manager is used to tag individual studies in the sample with their reason for exclusion, such as excluded::language, excluded::title, excluded::abstract, and excluded::superseded.
This keyword-based system is equally used to further categorize the sample studies that do not fall into exclusion criteria, based on primary country of analysis, world region, as well as income level classification.
To that end, the prefixes country::, region:: and income:: are used to distinguish between the respective characteristics, such as region::LAC for Latin America and the Caribbean or region::SSA for Sub-Saharan Africa, as well as, for example, income::low-middle, income::upper-middle or income::high.
These two delineations follow the ILO categorizations on world regions and the country income classifications based on World Bank income groupings [@ILO2022].
Similarly, if a specific type of inequality, or a specific intervention, represents the focus of a study, these will be reflected in the same keyword system, through for example inequality::income or inequality::gender.
The complete process of identification and screening is undertaken with the help of the Zotero reference manager, ultimately leaving only publications which are relevant for final full-text review and analysis.
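The keyword system described above lends itself to simple programmatic filtering once the tags are exported from the reference manager. The sketch below assumes the tags arrive as plain strings per study; the study keys, the example tag values and the helpers `parse_tags` and `is_excluded` are hypothetical.

```python
from collections import defaultdict


def parse_tags(tags):
    """Group 'namespace::value' tags by namespace; skip untagged keywords."""
    grouped = defaultdict(list)
    for tag in tags:
        namespace, separator, value = tag.partition("::")
        if separator and value:
            grouped[namespace].append(value)
    return dict(grouped)


def is_excluded(tags):
    """A study is excluded as soon as it carries any excluded:: tag."""
    return "excluded" in parse_tags(tags)


# Hypothetical sample: keys and tag values are illustrative only
studies = {
    "study-a": ["region::LAC", "income::upper-middle", "inequality::gender"],
    "study-b": ["excluded::language"],
    "study-c": ["region::SSA", "income::low-middle", "excluded::abstract"],
}

retained = [key for key, tags in studies.items() if not is_excluded(tags)]
print(retained)
print(parse_tags(studies["study-a"]))
```

Because the reason for exclusion is stored as the tag value, the same parsing step can also be used to count how many studies were dropped at each screening stage.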
Last, for extraction, the full texts of the remaining studies are screened; irrelevant studies are excluded with excluded::full-text as explained above, and relevant studies are then ingested into the final sample pool.
All relevant data concerning both their major findings and statistical significance are then extracted from the individual studies into a collective results matrix.
The results to be identified in the matrix include a study's: i) key outcome measures (dependent variables), ii) main findings, iii) main policy interventions (independent variables), iv) study design and sample size, v) dataset and methods of evaluation, vi) direction of relation and level of representativeness, vii) level of statistical significance, viii) main limitations.
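One possible layout for such a results matrix is a record per study with one field per extracted item. The sketch below is an assumption about structure only: the class `StudyResult`, its field names and the additional `study` identifier column are hypothetical, and the single row contains placeholders rather than extracted data.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class StudyResult:
    study: str                             # citation key (hypothetical extra column)
    outcome_measures: str                  # i) dependent variables
    main_findings: str                     # ii)
    policy_interventions: str              # iii) independent variables
    design_and_sample_size: str            # iv)
    dataset_and_methods: str               # v)
    direction_and_representativeness: str  # vi)
    significance_level: str                # vii)
    limitations: str                       # viii)


columns = [field.name for field in fields(StudyResult)]

# Placeholder row only; no real study data is represented here
rows = [StudyResult(**{name: "placeholder" for name in columns})]

# Write the matrix as CSV into an in-memory buffer
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns)
writer.writeheader()
for row in rows:
    writer.writerow(asdict(row))

matrix_csv = buffer.getvalue()
print(matrix_csv)
```

Fixing the columns in one place keeps every study comparable along the same eight dimensions and makes missing entries visible as empty cells.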
## Description of results