A full CSV representation of the Voidlinux popcorn data from https://popcorn.voidlinux.org

Project Popcorn Voidlinux

Contains the data gathered by the https://popcorn.voidlinux.org statistics collection for all available dates (2025-09-24 as of today), in an easier-to-work-with CSV form.

Data can be cleaned and processed with the available code. Any action can be started with just, using the provided justfile.

Dataset structure

  • All inputs (i.e. building blocks from other sources) are located in input/.
  • All custom code is located in src/.
  • All final output data is located in output/.

Output data structure

Files

Represents information about the individual JSON files available in the raw dataset.

Contained in files.csv, 4 columns:

  • date: the date a specific file is relevant for
  • filename: the full filename as it exists in the input/ directory
  • mtime: the last modification time of the file on the system
  • filesize: the size of the file, in bytes
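
As a minimal sketch, files.csv could be loaded with the Python standard library as shown here. It assumes the header uses exactly the four column names listed above, in that order, and that filesize is a plain integer. The helper name read_files_csv and the sample row are hypothetical and not part of the code in src/.

```python
import csv
import io

# Column names taken from the list above; their order in the real file is an assumption.
FILES_COLUMNS = ["date", "filename", "mtime", "filesize"]


def read_files_csv(stream):
    """Return the rows of a files.csv text stream as dicts (hypothetical helper)."""
    reader = csv.DictReader(stream)
    if reader.fieldnames != FILES_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    for row in reader:
        row["filesize"] = int(row["filesize"])  # size in bytes
        rows.append(row)
    return rows


# Made-up sample data, only to show the expected shape.
sample = io.StringIO(
    "date,filename,mtime,filesize\n"
    "2025-09-24,example.json,2025-09-25T00:00:00,2048\n"
)
files = read_files_csv(sample)
```

For the real data, the same function would take an open file handle, for example the result of open("output/files.csv", newline="").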

Kernels

Represents information about the kernel versions represented in the raw dataset.

Contained in kernels.csv, 3 columns:

  • date: the date a specific file is relevant for
  • kernel: the full kernel name that is available in the raw data, including major version, minor version and suffix
  • downloads: the number of times the kernel has been seen on the observation date
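
One way to use these columns is to rank kernels for a single observation date. The sketch below assumes the header matches the column names above and that downloads is an integer. The function name top_kernels and the kernel names in the sample are placeholders.

```python
import csv
import io
from collections import Counter


def top_kernels(stream, date, n=3):
    """Return the n most-seen kernels on the given date as (kernel, downloads) pairs."""
    totals = Counter()
    for row in csv.DictReader(stream):
        if row["date"] == date:
            totals[row["kernel"]] += int(row["downloads"])
    return totals.most_common(n)


# Made-up sample data; real kernel names may look different.
sample = io.StringIO(
    "date,kernel,downloads\n"
    "2025-09-24,6.12.1_1,40\n"
    "2025-09-24,6.6.2_1,25\n"
    "2025-09-23,6.12.1_1,38\n"
)
ranking = top_kernels(sample, "2025-09-24")
```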

Packages

Represents information about the package versions represented in the raw dataset.

Contained in packages.csv, 4 columns:

  • date: the date a specific file is relevant for
  • package: the full package name as it is available in the raw data
  • version: the full package version as it is available in the raw data
  • count: the number of times the package and version combination has been seen on the observation date
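
Because each row holds one package and version combination, a per-package figure needs the counts summed across versions. The following is a rough sketch of that aggregation, assuming the header matches the column names above. The function name package_totals and the sample packages are hypothetical.

```python
import csv
import io
from collections import defaultdict


def package_totals(stream, date):
    """Sum count over all versions of each package for one observation date."""
    totals = defaultdict(int)
    for row in csv.DictReader(stream):
        if row["date"] == date:
            totals[row["package"]] += int(row["count"])
    return dict(totals)


# Made-up sample data; package names and versions are placeholders.
sample = io.StringIO(
    "date,package,version,count\n"
    "2025-09-24,foo,1.0_1,10\n"
    "2025-09-24,foo,1.1_1,5\n"
    "2025-09-24,bar,2.0_1,7\n"
    "2025-09-23,foo,1.0_1,9\n"
)
totals = package_totals(sample, "2025-09-24")
```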

Unique installs

Represents information about the unique system installations represented in the raw dataset.

Contained in unique_installs.csv, 2 columns:

  • date: the date a specific file is relevant for
  • unique: the number of unique installations counted on the observation date
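
As an illustration of working with this file, the sketch below computes the change in unique installations between consecutive observation dates. It assumes dates are written in ISO format (YYYY-MM-DD, as elsewhere in this README) and that unique is an integer. The function name install_changes is hypothetical.

```python
import csv
import io
from datetime import date


def install_changes(stream):
    """Return (date, change) pairs between consecutive observation dates."""
    # Sort by date so the input rows do not need to be in order.
    counts = sorted(
        (date.fromisoformat(row["date"]), int(row["unique"]))
        for row in csv.DictReader(stream)
    )
    return [
        (later_day, later - earlier)
        for (_, earlier), (later_day, later) in zip(counts, counts[1:])
    ]


# Made-up sample data, deliberately out of order.
sample = io.StringIO(
    "date,unique\n"
    "2025-09-23,100\n"
    "2025-09-22,90\n"
    "2025-09-24,104\n"
)
changes = install_changes(sample)
```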