From 12e1666959e10eed432aa2712d00043b5e4cf492 Mon Sep 17 00:00:00 2001
From: Marty Oehme
Date: Thu, 20 Nov 2025 15:43:10 +0100
Subject: [PATCH] Update README with roadmap

---
 README.md | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/README.md b/README.md
index 94b570e..8049430 100644
--- a/README.md
+++ b/README.md
@@ -5,3 +5,34 @@
 
 - All inputs (i.e. building blocks from other sources) are located in `inputs/`.
 - All custom code is located in `code/`.
+
+## Questions
+
+Some interesting questions to pose
+
+1. Long-term growth
+   How many unique machines download packages per day, and is the growth linear, exponential, or flattening?
+
+2. Weekly rhythm
+   Does the number of unique downloaders follow a weekly cycle (week-day peaks vs. weekend dips)?
+
+3. Kernel lag
+   On average, how many days elapse between a new kernel being published upstream and the first time it appears in the logs?
+   *(Group kernels by major.minor, compute min(date) per kernel, compare with its official release date.)*
+
+4. Kernel longevity
+   Which kernel versions have the longest total lifespan (first → last appearance) and which ones disappear fastest?
+
+5. Top packages
+   Which five packages have the highest median daily download count across the whole period?
+
+6. Version stickiness
+   For packages with ≥10 versions, what fraction of users stay on the older version at least one week after a newer version becomes
+   available?
+
+7. Big-bang updates
+   Are there days when the total number of package downloads is >3σ above the 30-day rolling mean (indicating a bulk-update campaign)?
+
+8. File-size vs. activity
+   Is there a correlation between the size of the daily JSON snapshot and the number of unique downloaders?
+   *(Large files might mirror repository-wide rebuilds.)*
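
Question 7 in the patch above (days more than 3σ above the 30-day rolling mean) could be approached roughly as follows. This is only a sketch under stated assumptions: the function name `flag_bulk_update_days` and the `daily_totals` list (one total download count per day, in date order) are hypothetical and do not come from the repository. It also assumes the rolling window covers the 30 days *before* the day being tested, so that a spike does not inflate its own baseline, and it uses the sample standard deviation.

```python
from statistics import mean, stdev


def flag_bulk_update_days(daily_totals, window=30, sigmas=3.0):
    """Return indices of days whose total exceeds the trailing mean
    by more than `sigmas` standard deviations.

    `daily_totals` is a hypothetical list of per-day download totals,
    ordered by date. The first `window` days are never flagged because
    they have no complete baseline.
    """
    flagged = []
    for i in range(window, len(daily_totals)):
        # Baseline: the previous `window` days, current day excluded (assumption).
        baseline = daily_totals[i - window:i]
        mu = mean(baseline)
        sd = stdev(baseline)
        # Skip perfectly flat baselines, where any increase would count as a spike.
        if sd > 0 and daily_totals[i] > mu + sigmas * sd:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Synthetic data: a mild weekly pattern with one artificial spike on day 35.
    totals = [100 + (day % 7) for day in range(40)]
    totals[35] = 500
    print(flag_bulk_update_days(totals))
```

Excluding the current day from the window is a design choice rather than something the README specifies; including it would make large spikes slightly harder to detect, because each spike would raise both the mean and the standard deviation it is compared against.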