Add kernel longevity section
This commit is contained in:
parent
6005f140f1
commit
2f6a7c9af6
2 changed files with 111 additions and 9 deletions
16
README.md
16
README.md
|
|
@ -10,29 +10,29 @@
|
||||||
|
|
||||||
Some interesting questions to pose
|
Some interesting questions to pose
|
||||||
|
|
||||||
1. Long-term growth
|
1. [ ] Long-term growth
|
||||||
How many unique machines download packages per day, and is the growth linear, exponential, or flattening?
|
How many unique machines download packages per day, and is the growth linear, exponential, or flattening?
|
||||||
|
|
||||||
2. Weekly rhythm
|
2. [x] Weekly rhythm
|
||||||
Does the number of unique downloaders follow a weekly cycle (week-day peaks vs. weekend dips)?
|
Does the number of unique downloaders follow a weekly cycle (week-day peaks vs. weekend dips)?
|
||||||
|
|
||||||
3. Kernel lag
|
3. [ ] Kernel lag
|
||||||
On average, how many days elapse between a new kernel being published upstream and the first time it appears in the logs?
|
On average, how many days elapse between a new kernel being published upstream and the first time it appears in the logs?
|
||||||
*(Group kernels by major.minor, compute min(date) per kernel, compare with its official release date.)*
|
*(Group kernels by major.minor, compute min(date) per kernel, compare with its official release date.)*
|
||||||
|
|
||||||
4. Kernel longevity
|
4. [x] Kernel longevity
|
||||||
Which kernel versions have the longest total lifespan (first → last appearance) and which ones disappear fastest?
|
Which kernel versions have the longest total lifespan (first → last appearance) and which ones disappear fastest?
|
||||||
|
|
||||||
5. Top packages
|
5. [ ] Top packages
|
||||||
Which five packages have the highest median daily download count across the whole period?
|
Which five packages have the highest median daily download count across the whole period?
|
||||||
|
|
||||||
6. Version stickiness
|
6. [ ] Version stickiness
|
||||||
For packages with ≥10 versions, what fraction of users stay on the older version at least one week after a newer version becomes
|
For packages with ≥10 versions, what fraction of users stay on the older version at least one week after a newer version becomes
|
||||||
available?
|
available?
|
||||||
|
|
||||||
7. Big-bang updates
|
7. [ ] Big-bang updates
|
||||||
Are there days when the total number of package downloads is >3σ above the 30-day rolling mean (indicating a bulk-update campaign)?
|
Are there days when the total number of package downloads is >3σ above the 30-day rolling mean (indicating a bulk-update campaign)?
|
||||||
|
|
||||||
8. File-size vs. activity
|
8. [ ] File-size vs. activity
|
||||||
Is there a correlation between the size of the daily JSON snapshot and the number of unique downloaders?
|
Is there a correlation between the size of the daily JSON snapshot and the number of unique downloaders?
|
||||||
*(Large files might mirror repository-wide rebuilds.)*
|
*(Large files might mirror repository-wide rebuilds.)*
|
||||||
|
|
|
||||||
104
index.md
104
index.md
|
|
@ -220,7 +220,7 @@ Indeed, there is very little variation between the week days (Mon-Fri, 1-5) and
|
||||||
In fact, the only day on which repository interactions rise a little seems to be Tuesday,
|
In fact, the only day on which repository interactions rise a little seems to be Tuesday,
|
||||||
which is surprising.
|
which is surprising.
|
||||||
|
|
||||||
Well, corroborate this with my own statistics!
|
Well, let's corroborate this with my own statistics!
|
||||||
I use [`atuin`](https://atuin.sh/) to track my shell history,
|
I use [`atuin`](https://atuin.sh/) to track my shell history,
|
||||||
which can be queried with `atuin history list`.
|
which can be queried with `atuin history list`.
|
||||||
|
|
||||||
|
|
@ -312,4 +312,106 @@ Curiously, I can also glean from the list above that I have indeed _never_ updat
|
||||||
|
|
||||||
## Kernel longevity
|
## Kernel longevity
|
||||||
|
|
||||||
|
Another question that I find quite interesting is this:
|
||||||
|
How long were the various kernel versions in use?
|
||||||
|
Or, more precisely: which versions have the longest 'life-spans' in the repository, and which the shortest?
|
||||||
|
|
||||||
|
But first, let's investigate the overall download numbers per kernel.
|
||||||
|
|
||||||
|
For this we'll use the `kernels.csv` file, so let's take a look.
|
||||||
|
|
||||||
|
| date | kernel | downloads |
|
||||||
|
| --- | --- | --- |
|
||||||
|
| 2025-11-20 | 6.17.7_1 | 6 |
|
||||||
|
| 2025-11-20 | 6.17.8-tkg-bore-alderlake_1 | 1 |
|
||||||
|
| 2025-11-20 | 6.17.8-tkg-bore-zen_1 | 1 |
|
||||||
|
| 2025-11-20 | 6.17.8_1 | 12 |
|
||||||
|
| 2025-11-20 | 6.6.111_1 | 1 |
|
||||||
|
| 2025-11-20 | 6.6.116_1 | 3 |
|
||||||
|
| 2025-11-20 | 6.6.65_1 | 1 |
|
||||||
|
| 2025-11-20 | 6.6.87.2-microsoft-standard-WSL2 | 1 |
|
||||||
|
|
||||||
|
This file is almost perfectly usable as-is, but I am only interested in the actual kernel versions,
|
||||||
|
so the first three dot-separated version components (e.g. `6.17.7`).
|
||||||
|
I don't care about the void-internal release version (the `_1`),
|
||||||
|
nor the weird custom-compiled kernels people are using (e.g. `tkg-bore-alderlake_1`).
|
||||||
|
But since I also don't want to straight drop them from the data,
|
||||||
|
we'll do a little regex string substitution:
|
||||||
|
|
||||||
|
```nu
|
||||||
|
mkdir outputs
|
||||||
|
open input/popcorn/output/kernels.csv |
|
||||||
|
update kernel { str replace --regex '^(\d+\.\d+\.\d+).*' "$1" } |
|
||||||
|
group-by --to-table kernel |
|
||||||
|
save outputs/kernels_standardized.json
|
||||||
|
```
|
||||||
|
|
||||||
|
Here we strip everything that is not part of the version by replacing the whole `kernel` string with just the captured version.
|
||||||
|
This process takes a while for the more than 57,000 lines contained in the file,
|
||||||
|
so I am saving an intermediate output version that I'll use for the next steps.
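
To see what the substitution does in isolation, here is a minimal sketch that runs it on three kernel strings from the table above. The command name `normalize-kernel` is made up for this illustration, and the pattern escapes the dots so that they only match literal dots.

```nu
# Hypothetical helper: reduce a kernel string to its major.minor.patch part.
def normalize-kernel [] {
    str replace --regex '^(\d+\.\d+\.\d+).*' '$1'
}

[
    "6.17.7_1"
    "6.17.8-tkg-bore-alderlake_1"
    "6.6.87.2-microsoft-standard-WSL2"
] | normalize-kernel
# yields the list [6.17.7, 6.17.8, 6.6.87]
```

Note that the WSL2 kernel loses its fourth component (`.2`) along with the suffix, so it ends up counted as plain `6.6.87`.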
|
||||||
|
|
||||||
|
We'll start by summing up the absolute numbers of kernel uses per version,
|
||||||
|
of which we can keep the top 5:
|
||||||
|
|
||||||
|
```nu
|
||||||
|
open outputs/kernels_standardized.json | update items { $in.downloads | math sum } | sort-by items | last 5
|
||||||
|
```
|
||||||
|
|
||||||
|
This shows us that:
|
||||||
|
|
||||||
|
| kernel | items |
|
||||||
|
| --- | --- |
|
||||||
|
| 6.1.31 | 1340 |
|
||||||
|
| 5.8.18 | 1674 |
|
||||||
|
| 6.12.41 | 1744 |
|
||||||
|
| 5.13.19 | 2500 |
|
||||||
|
| 6.3.13 | 2624 |
|
||||||
|
|
||||||
|
The kernel that was run the most in terms of _absolute numbers_ was kernel version 6.3.13,
|
||||||
|
with 5.13.19 coming up relatively closely behind.
|
||||||
|
The other kernels trail further behind, with the next one (6.12.41) having roughly 750 fewer uses than 5.13.19.
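
To sanity-check the aggregation on a small scale, the following sketch runs the same group-and-sum steps on three invented rows shaped like `kernels.csv`. The sample values are made up purely for illustration and are not taken from the real data.

```nu
# Toy rows in the same shape as kernels.csv (values invented for illustration)
let sample = [
    { date: "2025-11-19", kernel: "6.17.8", downloads: 5 }
    { date: "2025-11-20", kernel: "6.17.8", downloads: 12 }
    { date: "2025-11-20", kernel: "6.6.116", downloads: 3 }
]

# Group rows per kernel, then replace each group with the sum of its downloads
let totals = (
    $sample
    | group-by --to-table kernel
    | update items { $in.downloads | math sum }
    | sort-by items
)

$totals | get items
# yields the list [3, 17]
```

Since `sort-by` sorts in ascending order, the most-downloaded kernels end up at the bottom, which is why the real pipeline keeps the `last` rows.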
|
||||||
|
|
||||||
|
But I originally wanted to know about the _longest lived_ kernel in these data,
|
||||||
|
so how do we extract that?
|
||||||
|
|
||||||
|
We'll take the grouped `json` file and do a similar aggregation as up above,
|
||||||
|
except creating a new column for the first (`math min`) and last (`math max`) appearance of each kernel version.
|
||||||
|
Then we can take those two and,
|
||||||
|
since they are of type `datetime`,
|
||||||
|
simply subtract one from the other to get the total `duration` that the respective kernel appeared in the data.
|
||||||
|
|
||||||
|
```nu
|
||||||
|
open outputs/kernels_standardized.json |
|
||||||
|
insert first { $in.items.date | math min } |
|
||||||
|
insert last { $in.items.date | math max } |
|
||||||
|
reject items |
|
||||||
|
into datetime first last |
|
||||||
|
insert delta {$in.last - $in.first } |
|
||||||
|
sort-by delta |
|
||||||
|
last 10
|
||||||
|
```
|
||||||
|
|
||||||
|
By sorting on the delta value and keeping the last ones we have essentially filtered for the 'longest'-lived kernel versions,
|
||||||
|
leaving us with the following:
|
||||||
|
|
||||||
|
| kernel | first | last | delta |
|
||||||
|
| --- | --- | --- | --- |
|
||||||
|
| 6.1.6 | Mon, 16 Jan 2023 00:00:00 +0100 (2 years ago) | Sat, 5 Apr 2025 00:00:00 +0200 (7 months ago) | 115wk 4day 23hr |
|
||||||
|
| 4.19.59 | Wed, 17 Jul 2019 00:00:00 +0200 (6 years ago) | Fri, 28 Jan 2022 00:00:00 +0100 (3 years ago) | 132wk 2day 1hr |
|
||||||
|
| 5.10.9 | Fri, 22 Jan 2021 00:00:00 +0100 (4 years ago) | Tue, 12 Sep 2023 00:00:00 +0200 (2 years ago) | 137wk 3day 23hr |
|
||||||
|
| 5.19.14 | Thu, 13 Oct 2022 00:00:00 +0200 (3 years ago) | Tue, 5 Aug 2025 00:00:00 +0200 (3 months ago) | 146wk 5day |
|
||||||
|
| 5.15.36 | Fri, 29 Apr 2022 00:00:00 +0200 (3 years ago) | Mon, 3 Mar 2025 00:00:00 +0100 (8 months ago) | 148wk 3day 1hr |
|
||||||
|
| 5.13.8 | Fri, 6 Aug 2021 00:00:00 +0200 (4 years ago) | Fri, 2 Aug 2024 00:00:00 +0200 (a year ago) | 156wk |
|
||||||
|
| 5.13.10 | Sat, 14 Aug 2021 00:00:00 +0200 (4 years ago) | Sun, 15 Sep 2024 00:00:00 +0200 (a year ago) | 161wk 1day |
|
||||||
|
| 5.12.13 | Sat, 26 Jun 2021 00:00:00 +0200 (4 years ago) | Mon, 23 Sep 2024 00:00:00 +0200 (a year ago) | 169wk 2day |
|
||||||
|
| 5.11.22 | Fri, 21 May 2021 00:00:00 +0200 (4 years ago) | Sun, 22 Sep 2024 00:00:00 +0200 (a year ago) | 174wk 2day |
|
||||||
|
| 5.2.13 | Sat, 7 Sep 2019 00:00:00 +0200 (6 years ago) | Sun, 7 Sep 2025 00:00:00 +0200 (2 months ago) | 313wk 1day |
|
||||||
|
|
||||||
|
We can see that kernel 5.x versions in particular were long-lived,
|
||||||
|
with version 5.2.13 being in use for 6 _years_ to the day.
|
||||||
|
The exact nature of the time frame (September 7 to September 7) makes me think this may be some sort of automated installation.
|
||||||
|
|
||||||
|
Without skipping ahead too much, this makes sense to me looking at the wider picture,
|
||||||
|
as the `popcorn` statistics gathering was introduced in the middle of kernel 4's existence,
|
||||||
|
and we are not yet anywhere near the end of the kernel 6 life-span,
|
||||||
|
so version 5 probably had the most opportunity to have long-running installations.
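
The deltas in the table can be reproduced by hand with plain date subtraction. The sketch below uses a made-up helper named `lifespan` and writes the UTC offsets out explicitly, as they appear in the table. It also suggests where the odd `23hr` and `1hr` remainders come from: they seem to be daylight-saving artifacts, showing up whenever the first and last appearance fall on different sides of a clock change.

```nu
# Hypothetical helper: time between first and last appearance of a kernel.
def lifespan [first_seen: string, last_seen: string] {
    ($last_seen | into datetime) - ($first_seen | into datetime)
}

# 5.2.13: both dates fall in summer time (+02:00), so the span is whole days
lifespan "2019-09-07T00:00:00+02:00" "2025-09-07T00:00:00+02:00"
# yields 313wk 1day

# 6.1.6: first seen in winter time (+01:00), last seen in summer time (+02:00)
lifespan "2023-01-16T00:00:00+01:00" "2025-04-05T00:00:00+02:00"
# yields 115wk 4day 23hr
```

If whole-day spans are preferred, one option would be to parse all dates in a fixed timezone such as UTC before subtracting.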
|
||||||
|
|
|
||||||