Project Popcorn Voidlinux
Contains the gathered data of the https://popcorn.voidlinux.org statistics collection for all available dates (2025-09-24 as of today) in an easier-to-work-with CSV form.
Data can be cleaned and processed with the available code.
Any action can easily be started using the just command runner with the available justfile.
Dataset structure
- All inputs (i.e. building blocks from other sources) are located in input/.
- All custom code is located in src/.
- All final output data is located in output/.
Output data structure
Files
Represents information about the individual JSON files available in the raw dataset.
Contained in files.csv, 4 columns:
- date: the date a specific file is relevant for
- filename: the full filename as it exists in the input/ directory
- mtime: the last modification time of the file on the system
- filesize: the size of the file, in bytes
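As a minimal sketch of how this table could be read, the snippet below summarizes it with Python's standard csv module. The column names come from the list above; the path output/files.csv, the ISO date format, and the sample rows (including the mtime format) are assumptions made for illustration, and summarize_files is a hypothetical helper, not part of the repository code.

```python
import csv
import io


def summarize_files(csv_file):
    """Return (file count, total size in bytes, earliest date, latest date)."""
    rows = list(csv.DictReader(csv_file))
    if not rows:
        return 0, 0, None, None
    total_bytes = sum(int(row["filesize"]) for row in rows)
    # Assumption: dates are ISO formatted (YYYY-MM-DD), so they sort as strings.
    dates = sorted(row["date"] for row in rows)
    return len(rows), total_bytes, dates[0], dates[-1]


# Made-up sample rows, for illustration only. Real use would presumably be:
#   with open("output/files.csv", newline="") as f: summarize_files(f)
sample = io.StringIO(
    "date,filename,mtime,filesize\n"
    "2025-01-01,2025-01-01.json,2025-01-02T00:00:00,1000\n"
    "2025-01-02,2025-01-02.json,2025-01-03T00:00:00,1500\n"
)
print(summarize_files(sample))  # (2, 2500, '2025-01-01', '2025-01-02')
```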
Kernels
Represents information about the kernel versions represented in the raw dataset.
Contained in kernels.csv, 3 columns:
- date: the date a specific file is relevant for
- kernel: the full kernel name that is available in the raw data, including major version, minor version and suffix
- downloads: the number of times the kernel has been seen on the observation date
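Because the kernel column holds the full name, grouping by release series requires splitting it first. The sketch below is one possible way to do that. It assumes names shaped like "6.12.41_1" (major, minor, then everything else as suffix); the exact format in the raw data may differ, and both the sample rows and the helper downloads_per_series are hypothetical.

```python
import csv
import io
import re
from collections import Counter

# Assumed shape: "<major>.<minor><suffix>", e.g. "6.12.41_1".
KERNEL_RE = re.compile(r"^(\d+)\.(\d+)(.*)$")


def downloads_per_series(csv_file, date):
    """Sum downloads per "major.minor" series for one observation date."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        if row["date"] != date:
            continue
        match = KERNEL_RE.match(row["kernel"])
        # Names that do not fit the assumed shape are grouped as "unknown".
        series = f"{match.group(1)}.{match.group(2)}" if match else "unknown"
        totals[series] += int(row["downloads"])
    return totals


# Made-up sample rows, for illustration only.
sample = io.StringIO(
    "date,kernel,downloads\n"
    "2025-01-01,6.12.41_1,10\n"
    "2025-01-01,6.12.40_1,5\n"
    "2025-01-01,6.6.100_1,3\n"
    "2025-01-02,6.12.41_1,7\n"
)
print(dict(downloads_per_series(sample, "2025-01-01")))  # {'6.12': 15, '6.6': 3}
```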
Packages
Represents information about the package versions represented in the raw dataset.
Contained in packages.csv, 4 columns:
- date: the date a specific file is relevant for
- package: the full package name as it is available in the raw data
- version: the full package version as it is available in the raw data
- count: the number of times the package and version combination has been seen on the observation date
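Since each row is one package and version combination, a per-package total needs the counts summed across versions. The following is a rough sketch of that aggregation, assuming each reporting system lists a given package under only one version; top_packages and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def top_packages(csv_file, date, n=3):
    """Return the n most common packages on a date, summed over versions."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        if row["date"] == date:
            totals[row["package"]] += int(row["count"])
    return totals.most_common(n)


# Made-up sample rows, for illustration only.
sample = io.StringIO(
    "date,package,version,count\n"
    "2025-01-01,bash,5.2_1,4\n"
    "2025-01-01,bash,5.3_1,6\n"
    "2025-01-01,vim,9.1_1,7\n"
    "2025-01-02,vim,9.1_1,9\n"
)
print(top_packages(sample, "2025-01-01", n=2))  # [('bash', 10), ('vim', 7)]
```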
Unique installs
Represents information about the unique system installations represented in the raw dataset.
Contained in unique_installs.csv, 2 columns:
- date: the date a specific file is relevant for
- unique: the number of unique installations counted on the observation date
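With one row per date, this table lends itself to simple trend calculations. As an illustration, the sketch below computes the change in unique installations between consecutive observation dates. It assumes ISO formatted dates and integer counts; daily_changes and the sample rows are hypothetical.

```python
import csv
import io


def daily_changes(csv_file):
    """Return (date, change versus the previous observation) pairs."""
    # Assumption: dates are ISO formatted, so string sorting is chronological.
    rows = sorted(csv.DictReader(csv_file), key=lambda row: row["date"])
    counts = [(row["date"], int(row["unique"])) for row in rows]
    return [
        (date, count - previous)
        for (_, previous), (date, count) in zip(counts, counts[1:])
    ]


# Made-up sample rows, deliberately unsorted, for illustration only.
sample = io.StringIO(
    "date,unique\n"
    "2025-01-02,110\n"
    "2025-01-01,100\n"
    "2025-01-03,105\n"
)
print(daily_changes(sample))  # [('2025-01-02', 10), ('2025-01-03', -5)]
```

Note that consecutive rows are compared as they appear after sorting, so a gap in the observation dates would show up as a single larger step.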