Overview

This vignette describes the first step of a birth rate forecast that follows the methodology of the FSO (Federal Statistical Office): preparing the model input data. The starting point is past birth and population data. Three input data sets have to be prepared:

  • TFR (total fertility rate) per year
  • MAB (mean age of the mother at birth) per year
  • fertility rate per age year

If the final forecast should distinguish by nationality and/or spatial unit, the three input data sets must also be differentiated according to these variables.

Required data

You need birth and population data from the past to prepare the input data. If you don’t have such data at hand, you can run the functions with example data. Regardless of whether you bring your own data or use the example data, the data frames must contain the following variables:

birth data
Description: The historical number of births, aggregated per demographic unit and year.
Variables required:
  • year: e.g. 2010 to 2023 for the FSO’s birth data
  • spatial_unit: one (e.g. canton) or several (e.g. municipalities) spatial units
  • nat: nationality; suggestion: ‘ch’ (Swiss) and ‘int’ (international) as in the {propop} package, but other categories or only one level are also possible
  • age: the age of the mother at birth
  • births: the number of births

population data
Description: Historical population records of females in a pre-defined age range (‘fertile age’, often 15 to 49), aggregated per spatial unit, year, age, and nationality.
Variables required:
  • year: e.g. 2010 to 2023, as for the FSO’s birth data
  • spatial_unit: one (e.g. canton) or several (e.g. municipalities) spatial units
  • nat: nationality; suggestion: ‘ch’ (Swiss) and ‘int’ (international) as in the {propop} package, but other categories or only one level are also possible
  • age: the age of the females
  • n_pop: the number of people
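To make the required structure concrete, here is a minimal sketch of two data frames with the variables listed above. All values are made up for illustration, and the helper has_required_cols() is hypothetical (it is not part of propopbirth).

```r
# Toy birth data with the required columns (all values are made up).
toy_births <- data.frame(
  year = c(2022, 2022, 2023, 2023),
  spatial_unit = "Aarau",
  nat = c("ch", "int", "ch", "int"),
  age = 30,
  births = c(12, 7, 10, 8)
)

# Toy population data with the required columns (all values are made up).
toy_pop <- data.frame(
  year = c(2022, 2022, 2023, 2023),
  spatial_unit = "Aarau",
  nat = c("ch", "int", "ch", "int"),
  age = 30,
  n_pop = c(110, 60, 105, 64)
)

# Hypothetical helper: TRUE if all required columns are present.
has_required_cols <- function(data, required) {
  all(required %in% names(data))
}

births_ok <- has_required_cols(
  toy_births, c("year", "spatial_unit", "nat", "age", "births")
)
pop_ok <- has_required_cols(
  toy_pop, c("year", "spatial_unit", "nat", "age", "n_pop")
)
```

A check like this can be run on your own data before passing it to the package functions.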

Births

Currently, the FSO does not publish historical birth data as open data. However, we have received permission to use and publish the data in propopbirth. In the propopbirth package, the original FSO data is preprocessed: column names and factor levels are adapted, and the municipality numbers are replaced by municipality names.

The data for three selected municipalities looks like this:

# load package data
data("fso_birth", package = "propopbirth")

fso_birth |>
  dplyr::filter(spatial_unit %in% c("Aarau", "Frauenfeld", "Stadt Zürich")) |>
  DT::datatable(options = list(pageLength = 5))

Population

Historical population records from the FSO can be obtained with the function propopbirth::get_population_data() by defining the time span, the spatial units, and further arguments:

fso_pop <- propopbirth::get_population_data(
  number_fso = "px-x-0102010000_101",
  year_first = 2010,
  year_last = 2023,
  age_fert_min = 15,
  age_fert_max = 49,
  spatial_code = c("4001", "4566", "0261"),
  spatial_unit = c("Aarau", "Frauenfeld", "Stadt Zürich"),
  binational = TRUE
)

The data for three selected municipalities looks like this:

fso_pop |>
  DT::datatable(options = list(pageLength = 5))

Create input data

First, the mean annual population is calculated for females of ‘fertile age’ (usually 15 to 49). For each group (e.g. spatial unit, age, nationality) and year, the mean population is the average of the population at the beginning and at the end of the year.
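The mean annual population for a single group can be sketched in base R as follows. The values are made up, and the sketch assumes that n_pop is an end-of-year count, so that the population at the start of year t equals the population at the end of year t - 1. This illustrates the idea; it is not the package's internal code.

```r
# Toy end-of-year population for one group (values are made up).
pop <- data.frame(
  year = 2020:2023,
  n_pop = c(100, 104, 110, 108)
)

# Assumption: population at the start of year t = population at the end
# of year t - 1. The first year has no preceding value.
pop$n_start <- c(NA, head(pop$n_pop, -1))

# Mean annual population: average of start-of-year and end-of-year values.
pop$n_mean <- (pop$n_start + pop$n_pop) / 2

# Drop the first year, for which no mean can be computed.
pop_mean <- pop[!is.na(pop$n_mean), c("year", "n_mean")]
# n_mean: 102 (2021), 107 (2022), 109 (2023)
```

Under this assumption, the population series needs to start one year before the first year of interest. This may be why the example population data starts in 2010 while create_input_data() is called with year_first = 2011.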

Second, the births per year and group (e.g. spatial unit, age, nationality) are divided by the mean population (number of women of this group). This gives the age-specific fertility rate per year and group.
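As a rough illustration of this division, the following base R sketch computes age-specific fertility rates for one group and one year. The values and the column name n_mean (mean population) are assumptions made for this example.

```r
# Toy births and mean female population by age (values are made up).
births <- data.frame(age = c(29, 30, 31), births = c(8, 12, 9))
pop_mean <- data.frame(age = c(29, 30, 31), n_mean = c(100, 120, 90))

# Join births and population by age, then divide.
rates <- merge(births, pop_mean, by = "age")
rates$fer <- rates$births / rates$n_mean
# fer: 0.08, 0.10, 0.10
```

With real data, the join would also use year, spatial unit, and nationality as keys.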

The TFR (total fertility rate) is calculated as the sum of the age-specific fertility rate over age per year and group (spatial unit, nationality).
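The summation over age can be sketched with stats::aggregate(). The rates below are made up and cover only three ages, so the resulting values are much smaller than a real TFR over the full fertile age range.

```r
# Toy age-specific fertility rates for two nationalities (values are made up).
rates <- data.frame(
  nat = rep(c("ch", "int"), each = 3),
  age = rep(c(29, 30, 31), times = 2),
  fer = c(0.08, 0.10, 0.10, 0.09, 0.11, 0.07)
)

# TFR: sum of the age-specific fertility rates over age, per group.
tfr <- aggregate(fer ~ nat, data = rates, FUN = sum)
names(tfr)[names(tfr) == "fer"] <- "tfr"
# tfr: 0.28 (ch), 0.27 (int)
```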

The MAB (mean age of the mother at birth) is also computed based on the age-specific fertility rate per year. For this calculation a weighted average over age is used.
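A minimal sketch of such a weighted average, using the age-specific fertility rates as weights. The values are made up. Whether the package shifts the age (for example by adding 0.5 to the age in completed years) is not stated here, so the sketch uses the age as given.

```r
# Toy age-specific fertility rates for one group and year (values are made up).
rates <- data.frame(age = c(29, 30, 31), fer = c(0.08, 0.10, 0.10))

# MAB: average age, weighted by the age-specific fertility rate.
mab <- sum(rates$age * rates$fer) / sum(rates$fer)

# The same result with the base R helper for weighted means.
mab_check <- weighted.mean(rates$age, w = rates$fer)
```

Because the weights are rates rather than birth counts, the result does not depend on the age structure of the female population.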

The parameter fert_hist_years determines how many years are used to calculate an average age-specific fertility rate. The FSO usually uses only one year for its cantonal calculations. However, age-specific fertility rates can vary substantially from year to year, especially for small spatial units. Therefore, it makes sense to take the average over multiple years. For this computation the births and population are averaged over the years; then the ratio between births and population is calculated.
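The order of operations matters: averaging births and population first and then taking the ratio is generally not the same as averaging the yearly rates. The sketch below shows this for one age and group with made-up values. It assumes that the most recent fert_hist_years years are used, which is not confirmed here.

```r
# Toy history for one age and group (values are made up).
past <- data.frame(
  year = 2021:2023,
  births = c(3, 0, 6),
  n_mean = c(40, 50, 60)
)

fert_hist_years <- 3

# Assumption: the most recent fert_hist_years years are used.
recent <- tail(past[order(past$year), ], fert_hist_years)

# Approach described above: average births and population, then divide.
fer_avg <- mean(recent$births) / mean(recent$n_mean)  # 3 / 50 = 0.06

# For comparison only: mean of the yearly rates, which gives another value.
mean_of_ratios <- mean(recent$births / recent$n_mean)
```

The ratio of averages gives more weight to years with a larger population, which damps the influence of years with very few women in a group.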

input <- create_input_data(
  population = fso_pop,
  births = fso_birth |> 
    dplyr::filter(spatial_unit %in% c("Aarau", "Frauenfeld", "Stadt Zürich")),
  year_first = 2011,
  year_last = 2023,
  age_fert_min = 15,
  age_fert_max = 49,
  fert_hist_years = 3,
  binational = TRUE
) 

TFR

library(ggplot2)

# TFR over time: one line per nationality, one panel per spatial unit
ggplot(input$tfr) +
  geom_line(aes(x = year, y = tfr, color = nat), linewidth = 0.7) +
  scale_color_manual(values = c("#ffa81f", "#A05388")) +
  labs(color = "Nationality", y = "TFR") +
  facet_wrap(~ spatial_unit) +
  theme_bw()

The plot shows the historical development of the total fertility rate (TFR) over time, by spatial unit and nationality.

input$tfr |> 
  DT::datatable(options = list(pageLength = 5))

MAB

ggplot(input$mab) +
  geom_line(aes(x = year, y = mab, color = nat), linewidth = 0.7) +
  scale_color_manual(values = c("#ffa81f", "#A05388")) +
  labs(color = "Nationality", y = "MAB") +
  facet_wrap(~ spatial_unit) +
  theme_bw()

The plot shows the historical development of the mean age of the mother at birth (MAB) over time, by spatial unit and nationality.

input$mab |> 
  DT::datatable(options = list(pageLength = 5))

Age-specific fertility rate

ggplot(input$fer) +
  geom_line(aes(x = age, y = fer, color = nat), linewidth = 0.7) +
  scale_color_manual(values = c("#ffa81f", "#A05388")) +
  labs(color = "Nationality", y = "Fertility rate") +
  facet_wrap(~ spatial_unit) +
  theme_bw()

The plot shows the historical age-specific fertility rate by spatial unit and nationality, averaged over the selected number of years (fert_hist_years).

input$fer |> 
  DT::datatable(options = list(pageLength = 5))