# Project Popcorn Voidlinux

Contains the gathered data of the statistics collection for all available dates (2025-09-24 as of today) in an easier-to-work-with CSV form.

Data can be cleaned and processed with the available code. Any action can be started using [`just`](https://github.com/casey/just) with the available `justfile`.

## Dataset structure

- All inputs (i.e. building blocks from other sources) are located in `input/`.
- All custom code is located in `code/`.
- All final output data is located in `output/`.

## Output data structure

### Files

Represents information about the individual JSON files available in the raw dataset.

Contained in `files.csv`, 4 columns:

- `date`: the date a specific file is relevant for
- `filename`: the full filename as it exists in the `input/` directory
- `mtime`: the last modification time of the file on the system
- `filesize`: the size of the file, in bytes

### Kernels

Represents information about the kernel versions represented in the raw dataset.

Contained in `kernels.csv`, 3 columns:

- `date`: the date a specific file is relevant for
- `kernel`: the full kernel name as it is available in the raw data, including major version, minor version and suffix
- `downloads`: the number of times the kernel has been seen on the observation date

### Packages

Represents information about the package versions represented in the raw dataset.

Contained in `packages.csv`, 4 columns:

- `date`: the date a specific file is relevant for
- `package`: the full package name as it is available in the raw data
- `version`: the full package version as it is available in the raw data
- `count`: the number of times the package and version combination has been seen on the observation date

### Unique installs

Represents information about the unique system installations represented in the raw dataset.

Contained in `unique_installs.csv`, 2 columns:

- `date`: the date a specific file is relevant for
- `unique`: the number of unique installations counted on the observation date
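
### Example: reading the output data

As a rough illustration of how the column layout above might be used, the following Python sketch sums the `count` column of `packages.csv` over all versions of a package, per observation date. It is not part of the repository's `code/` directory. The function names `read_rows` and `installs_per_package` are made up for this example, and the sketch assumes that the CSV files start with a header row matching the column names listed above and that `count` holds plain integers.

```python
import csv
from collections import defaultdict
from pathlib import Path


def read_rows(path):
    """Yield each CSV row as a dict keyed by the header names.

    Assumption: the file starts with a header row that matches the
    column names documented above.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        yield from csv.DictReader(handle)


def installs_per_package(rows):
    """Sum `count` over all versions of a package, per observation date."""
    totals = defaultdict(int)
    for row in rows:
        # Assumption: `count` is stored as a plain integer.
        totals[(row["date"], row["package"])] += int(row["count"])
    return dict(totals)


if __name__ == "__main__":
    path = Path("output") / "packages.csv"
    if path.exists():
        totals = installs_per_package(read_rows(path))
        # Print the first few (date, package) totals in sorted order.
        for (date, package), count in sorted(totals.items())[:10]:
            print(date, package, count)
```

The same `read_rows` helper should work for `files.csv`, `kernels.csv` and `unique_installs.csv`, provided they follow the same header convention.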