feat(script): Switch term tables to data driven design

Moved the term tables to the data directory as CSV files and read them
into the main script from there, so that the CSV files function as the single source of truth.
Marty Oehme 2023-12-06 17:35:37 +01:00
parent 92a1162dce
commit 6020d122b6
Signed by: Marty
GPG key ID: EDBF2ED917B2EF6A
5 changed files with 59 additions and 156 deletions


@ -0,0 +1,21 @@
General,Vertical,Horizontal
inequality,income,identity
barrier,Palma ratio [@DFI2023],demographic
advantaged,Gini coefficient [@DFI2023],gender
disadvantaged,Log deviation,colour
discriminated,Theil,beliefs
disparity,Atkinson,racial
horizontal inequality,class [@Kalasa2021],ethnic
vertical inequality,fertility [@Kalasa2021],migrant
,bottom percentile,spatial
,top percentile,rural
,,urban
,,mega-cities
,,small cities
,,peripheral cities
,,age
,,nationality
,,ethnicity
,,health status
,,disability
,,characteristics


@ -0,0 +1,16 @@
General,Institutional,Structural,Agency
intervention,support for childcare [@Perez2022],cash benefits,credit programs [@Perez2022]
policy,labour rights,services in kind,career guidance
participation,minimum wage,green transition,vocational guidance [@Nevala2015]
targeting/targeted,collective bargaining,infrastructure,vocational counselling [@Nevala2015]
distributive,business sustainability promotion,digital infrastructure,counteracting of stereotypes
redistributive,work-life balance promotion,quality of education,commuting subsidies [@Perez2022]
,equal pay for work of equal value,public service improvement,housing mobility programs [@Perez2022]
,removal of (discriminatory) law,lowering of gender segregation,encouraging re-situation/migration [@Perez2022]
,law reformation,price stability intervention,encouraging self-advocacy [@Nevala2015]
,social dialogue,extended social protection scheme,cognitive behavioural therapy [@Lettieri2017]
,guaranteed income [@Perez2022],comprehensive social protection,computer-assisted therapy [@Lettieri2017]
,universal basic income [@Perez2022],sustainable social protection,work organization [@Nevala2015]
,provision of living wage [@Perez2022],supported employment [@Lettieri2017],special transportation [@Nevala2015]
,maternity leave [@Chang2021],"vocational rehabilitation [@Silvaggi2020, @Lettieri2017]",collective action
,,unionization,


@ -0,0 +1,13 @@
General,Forms of work,Labour market outcomes
work,own-use,employment outcomes
labour,employment,labour rights
production of goods,unpaid trainee,equality of opportunity
provision of services,volunteer,equality of outcome
own-use,other work activities,labour force participation [@Pinto2021]
use by others,wage-employed,labour force exit [@Silvaggi2020]
of working age,self-employed,job quality [@Finlay2021]
for pay,formal work,career advancement [@Finlay2021]
for profit,informal work,hours worked [@Finlay2021]
remuneration,domestic work,wage
market transactions,care work,salary
,unpaid work,return to work [@Silvaggi2020]


@ -457,6 +457,9 @@ to extraction metadata sheet.
## Search Term clusters
These lists were used to create the data-driven term cluster files in the supplementary data directory.
They are kept here as historical documentation but should not be used for up-to-date term changes; edit the CSV files instead.
### World-of-work cluster
- ILO:


@ -323,52 +323,8 @@ with the search query requiring a term from the general column and one other col
```{python}
#| label: tbl-wow-terms
#| tbl-cap: World of work term cluster
wow_terms_cluster = {
"General": pd.Series([
"work",
"labour",
"production of goods",
"provision of services",
"own-use",
"use by others",
"of working age",
"for pay",
"for profit",
"remuneration",
"market transactions"
]),
"Forms of work": pd.Series([
"own-use",
"employment",
"unpaid trainee",
"volunteer",
"other work activities",
"wage-employed",
"self-employed",
"formal work",
"informal work",
"domestic work",
"care work",
"unpaid work",
]),
"Labour market outcomes": pd.Series([
"employment outcomes",
"labour rights",
"equality of opportunity",
"equality of outcome",
"labour force participation [@Pinto2021]",
"labour force exit [@Silvaggi2020]",
"job quality [@Finlay2021]",
"career advancement [@Finlay2021]",
"hours worked [@Finlay2021]",
"wage",
"salary",
"return to work [@Silvaggi2020]",
])
}
df = pd.DataFrame(wow_terms_cluster)
md(tabulate(df.fillna(""), headers=[wow_terms_cluster.keys()], showindex=False, tablefmt="grid"))
terms_wow = pd.read_csv("02-data/supplementary/terms_wow.csv")
md(tabulate(terms_wow.fillna(""), showindex=False, headers="keys", tablefmt="grid"))
```
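The term tables have columns of unequal length, and several cells carry citation keys such as `[@Perez2022]`. As a rough sketch of how such a file could be turned into plain search terms, the following uses only the standard library `csv` module instead of pandas. The helper name `load_term_columns`, the inlined `SAMPLE` excerpt and the citation-stripping step are illustrative assumptions and not part of this commit.

```python
import csv
import io
import re

# Small excerpt of the policy term table, inlined so the example is self-contained.
SAMPLE = '''General,Institutional,Structural,Agency
intervention,support for childcare [@Perez2022],cash benefits,credit programs [@Perez2022]
policy,labour rights,services in kind,career guidance
,maternity leave [@Chang2021],"vocational rehabilitation [@Silvaggi2020, @Lettieri2017]",collective action
,,unionization,
'''

# Matches a pandoc-style citation group such as " [@Key2020, @Other2017]".
CITATION = re.compile(r"\s*\[@[^\]]*\]")


def load_term_columns(handle):
    """Return {column name: [terms]}, skipping empty cells and removing citations."""
    reader = csv.DictReader(handle)
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, cell in row.items():
            term = CITATION.sub("", cell or "").strip()
            if term:  # padding cells of shorter columns are empty
                columns[name].append(term)
    return columns


columns = load_term_columns(io.StringIO(SAMPLE))
print(columns["Structural"])
```

The quoted field containing a comma is handled by the CSV parser itself, which is one reason to parse the files rather than split lines by hand.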
The world of work cluster, like the inequality and policy intervention clusters below, is made up of a general signifier (such as "work", "inequality" or "intervention") which has to be labelled in a study to form part of the sample,
@ -383,69 +339,10 @@ For the database query, a single term from the general category is required to b
```{python}
#| label: tbl-intervention-terms
#| tbl-cap: Policy intervention term cluster
policy_terms_cluster = {
"General" : pd.Series([
"intervention",
"policy",
"participation",
"targeting/targeted",
"distributive",
"redistributive",
]),
"Institutional" : pd.Series([
"support for childcare [@Perez2022]",
"labour rights",
"minimum wage",
"collective bargaining",
"business sustainability promotion",
"work-life balance promotion",
"equal pay for work of equal value",
"removal of (discriminatory) law",
"law reformation",
"social dialogue",
"guaranteed income [@Perez2022]",
"universal basic income [@Perez2022]",
"provision of living wage [@Perez2022]",
"maternity leave [@Chang2021]",
]),
"Structural" : pd.Series([
"cash benefits",
"services in kind",
"green transition",
"infrastructure",
"digital infrastructure",
"quality of education",
"public service improvement",
"lowering of gender segregation",
"price stability intervention",
"extended social protection scheme",
"comprehensive social protection",
"sustainable social protection",
"supported employment [@Lettieri2017]",
"vocational rehabilitation [@Silvaggi2020, @Lettieri2017]",
"unionization",
]),
"Agency" : pd.Series([
"credit programs [@Perez2022]",
"career guidance",
"vocational guidance [@Nevala2015]",
"vocational counselling [@Nevala2015]",
"counteracting of stereotypes",
"commuting subsidies [@Perez2022]",
"housing mobility programs [@Perez2022]",
"encouraging re-situation/migration [@Perez2022]",
"encouraging self-advocacy [@Nevala2015]",
"cognitive behavioural therapy [@Lettieri2017]",
"computer-assisted therapy [@Lettieri2017]",
"work organization [@Nevala2015]",
"special transportation [@Nevala2015]",
"collective action",
])
}
terms_policy = pd.read_csv("02-data/supplementary/terms_policy.csv")
# different headers to include 'social norms'
headers = ["General", "Institutional", "Structural", "Agency & social norms"]
df = pd.DataFrame(policy_terms_cluster)
md(tabulate(df.fillna(""), headers=headers, showindex=False, tablefmt="grid"))
md(tabulate(terms_policy.fillna(""), showindex=False, headers=headers, tablefmt="grid"))
```
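The query rule for a single cluster (one term from the general column plus one term from any other column) can be sketched as a Boolean clause. This is only an illustration: the function names, the quoting style and the `AND`/`OR` syntax are assumptions, and the actual database query syntax may differ.

```python
def quote(term):
    """Wrap a term in double quotes so multi-word terms stay together."""
    return f'"{term}"'


def cluster_clause(columns, general="General"):
    """Build '((general terms OR-ed) AND (all other terms OR-ed))' for one cluster."""
    general_terms = columns[general]
    specific_terms = [
        term
        for name, terms in columns.items()
        if name != general
        for term in terms
    ]
    general_part = " OR ".join(quote(t) for t in general_terms)
    specific_part = " OR ".join(quote(t) for t in specific_terms)
    return f"(({general_part}) AND ({specific_part}))"


# Hypothetical, heavily shortened version of the policy intervention cluster.
policy = {
    "General": ["intervention", "policy"],
    "Institutional": ["minimum wage"],
    "Agency": ["career guidance"],
}
clause = cluster_clause(policy)
print(clause)
```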
Lastly, the inequality cluster is once again made up of a general term describing inequality which has to form part of the query results, as well as at least one term describing a specific vertical or horizontal inequality,
@ -454,55 +351,8 @@ as seen in @tbl-inequality-terms.
```{python}
#| label: tbl-inequality-terms
#| tbl-cap: Inequality term cluster
inequality_terms_cluster = {
"General": pd.Series([
"inequality",
"barrier",
"advantaged",
"disadvantaged",
"discriminated",
"disparity",
"horizontal inequality",
"vertical inequality",
]),
"Vertical": pd.Series([
"income",
"Palma ratio [@DFI2023]",
"Gini coefficient [@DFI2023]",
"Log deviation",
"Theil",
"Atkinson",
"class [@Kalasa2021]",
"fertility [@Kalasa2021]",
"bottom percentile",
"top percentile"
]),
"Horizontal": pd.Series([
"identity",
"demographic",
"gender",
"colour",
"beliefs",
"racial",
"ethnic",
"migrant",
"spatial",
"rural",
"urban",
"mega-cities",
"small cities",
"peripheral cities",
"age",
"nationality",
"ethnicity",
"health status",
"disability",
"characteristics",
])
}
df = pd.DataFrame(inequality_terms_cluster)
md(tabulate(df.fillna(""), headers=inequality_terms_cluster.keys(), showindex=False, tablefmt="grid"))
terms_inequality = pd.read_csv("02-data/supplementary/terms_inequality.csv")
md(tabulate(terms_inequality.fillna(""), showindex=False, headers="keys", tablefmt="grid"))
```
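Combining the three clusters with a Boolean 'AND' amounts to intersecting the sets of records that each cluster matches. A minimal sketch of that idea follows; the record identifiers and the `intersect_hits` helper are hypothetical and only serve to illustrate the set operation.

```python
def intersect_hits(hits_per_cluster):
    """Return the records that were matched by every cluster."""
    sets = [set(hits) for hits in hits_per_cluster.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical record identifiers returned for each cluster's clause.
hits = {
    "world of work": {"rec1", "rec2", "rec3"},
    "policy intervention": {"rec2", "rec3", "rec4"},
    "inequality": {"rec2", "rec3", "rec5"},
}
sample = intersect_hits(hits)
print(sorted(sample))
```

Only records present in all three sets remain, which mirrors requiring a match in every cluster.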
A general term as well as a category-specific term from each cluster will be required, using an intersection merge (Boolean 'AND'),