Add kernel text
parent
9687eb662b
commit
9e3726402d
1 changed files with 113 additions and 38 deletions
113
popcorn.qmd
|
|
@ -109,6 +109,7 @@ or there was some kind of issue on the backend so the stats for those days are
|
|||
lost.
|
||||
|
||||
<!-- TODO: is this still true? -->
|
||||
|
||||
We take a look at the missing days
|
||||
among other things at the end of this article.
|
||||
|
||||
|
|
@ -161,7 +162,8 @@ something happened to data collection or everybody collectively decided to
|
|||
leave their PC offline just for that day, but the numbers are back to normal
|
||||
the day after.[^independence-day]
|
||||
|
||||
[^independence-day]: I suppose one interpretation would be people taking their
|
||||
[^independence-day]:
|
||||
I suppose one interpretation would be people taking their
|
||||
4th of July celebrations very seriously, and thus not being present in the
|
||||
statistics for the day after. However, I am not sure if this would reflect so
|
||||
strongly in data collection, and it additionally pre-supposes the data
|
||||
|
|
@ -209,9 +211,14 @@ may signify new users checking out Void Linux and downloading a large variety
|
|||
of packages in the process.
|
||||
|
||||
<!-- TODO: still accurate? -->
|
||||
|
||||
For a breakdown of the absolute numbers of packages on systems by weekday and
|
||||
month of the year instead of over time, see the Appendix below.
|
||||
|
||||
An interesting trend is visible toward the end of the timeline window, with a
|
||||
rapid decline in package numbers per user. It is too early to clearly see
|
||||
if this is just variability or an actual trend in the data.
|
||||
|
||||
Beyond pure installation numbers, let's take a look at the actual top-installed
|
||||
packages on users' systems.
|
||||
|
||||
|
|
@ -226,7 +233,8 @@ The top packages are unsurprisingly
|
|||
the `base-system` and `xtools` packages, followed by `wget`, `htop` and
|
||||
`rsync`.[^popcorn-removal]
|
||||
|
||||
[^popcorn-removal]: I have removed the PopCorn package itself from the data.
|
||||
[^popcorn-removal]:
|
||||
I have removed the PopCorn package itself from the data.
|
||||
Funnily enough, since _everybody_ who is represented in the data has to have
|
||||
PopCorn installed or the data wouldn't be collected in the first place, if we
|
||||
extrapolate from the collected data naively this means more people have PopCorn
|
||||
|
|
@ -278,7 +286,8 @@ On the Y-axis we see the amount of packages while on the X-axis we see the amoun
|
|||
What this means is that we see _how often_ packages tend to be installed,
|
||||
and where the majority of packages is grouped.[^density-approximation]
|
||||
|
||||
[^density-approximation]: In the package density count above, since we are
|
||||
[^density-approximation]:
|
||||
In the package density count above, since we are
|
||||
accumulating over the absolute numbers of all installations of all users, the
|
||||
overall high numbers are really _high_, i.e. above 150,000. Since we are
|
||||
sorting the package counts into a finite number of bins to make visualizing it
|
||||
|
|
@ -318,21 +327,72 @@ packages between eleven and 20 installations, and
|
|||
`python f"{get_num(twenty_thirty):,}"` packages between 21 and 30 installations.
|
||||
`python f"{get_num(thirty_plus):,}"` packages have over 30 installations.
|
||||
|
||||
For now, these are the explorations I have done for the package data collected.
|
||||
I think it is interesting to see, especially the evolution of package installations over time,
|
||||
and per user,
|
||||
as well as getting a glimpse of the most used packages in the sample.
|
||||
|
||||
But there are yet more things to explore in the statistics overall.
|
||||
|
||||
## Kernel Analysis
|
||||
|
||||
Beyond package numbers, the data also encapsulate information about the Linux
|
||||
kernels used by Void Linux users.
|
||||
The files report the exact kernel version users are running, including the major version,
|
||||
minor versions, and any suffixes as well.
|
||||
|
||||
For example, there are many reports containing the `4.19.0-9-amd64` kernel, or
|
||||
some containing the `6.1.53-1-lts` kernel, or `6.11.2-asahi-6.11.2-1_4`. These
|
||||
are 'extraordinary' kernels in my opinion, and they do not follow clear naming
|
||||
patterns. For the purposes of the following visualizations any such suffixes
|
||||
have been cut off, looking only at the versioning of the main kernels
|
||||
themselves.
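
As a rough illustration of the suffix stripping described above, here is a minimal sketch. The function name `split_kernel_version` and the regular expression are assumptions made for this example, not the code used in `notebooks.popcorn`.

```python
import re
from typing import Optional

# Leading "major.minor[.patch]" part of a kernel release string. Everything
# after it (e.g. "-9-amd64" or "-1-lts") is treated as a suffix and dropped.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?")


def split_kernel_version(release: str) -> Optional[tuple[int, int, int]]:
    """Return (major, minor, patch), or None if the string does not parse."""
    match = _VERSION_RE.match(release)
    if match is None:
        return None
    major, minor, patch = match.groups()
    # A missing patch level (e.g. "6.11") is treated as 0.
    return int(major), int(minor), int(patch or 0)


# The three example strings mentioned in the text.
for release in ("4.19.0-9-amd64", "6.1.53-1-lts", "6.11.2-asahi-6.11.2-1_4"):
    print(release, "->", split_kernel_version(release))
```

Only the first element of the returned tuple is needed for the major-version plots. Keeping minor and patch levels around costs nothing and leaves room for finer-grained breakdowns.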
|
||||
|
||||
Let's start by looking at the prevalence of the different major versions.
|
||||
|
||||
```{python}
|
||||
from notebooks.popcorn import plt_kernel_versions
|
||||
pplot(plt_kernel_versions)
|
||||
```
|
||||
|
||||
When looking at the kernel versions used, we see a very strong jump between major kernel version
|
||||
4 and major kernel version 5.
|
||||
This is an accumulation of the three major versions used during the collected timeline,
|
||||
over the _whole_ time as absolute numbers.
|
||||
|
||||
For this analysis we had to exclude {kernel_df_v99.select(pl.len()).item()} rows which were
|
||||
apparently from the future, as they were running variations of major kernel version 99. In all
|
||||
likelihood there is a custom kernel version out there which reports its own major version as 99.
|
||||
The strange version starts appearing on {kernel_df_v99.select("date").row(0)[0]} and shows up
|
||||
all the way until {kernel_df_v99.select("date").row(-1)[0]}.
|
||||
When looking at the kernel versions used, we see a very strong jump between major kernel version
|
||||
4 and major kernel version 5, with version 4 being significantly less prevalent in the data.
|
||||
|
||||
Of course, this makes sense from a release standpoint: kernel version 5.0 was
|
||||
released in March 2019, just under a year after the start of data collection.[^kernel-releases]
|
||||
Additionally, as we established above, this was also the time of the fewest
|
||||
unique data reports, so the absolute amount of kernel 4 reports is even
|
||||
smaller.
|
||||
|
||||
[^kernel-releases]:
|
||||
Data collection began in May 2018.
|
||||
All information on the kernel release timelines is taken
|
||||
from the nicely comprehensive _Linux Kernel Version History_ Wikipedia page:
|
||||
<https://en.wikipedia.org/wiki/Linux_kernel_version_history>.
|
||||
|
||||
Kernel version 5 still accounts for the largest share of reported kernel versions,
|
||||
but just barely. This makes sense since major version 6.0 was released in October 2022.
|
||||
It has thus been roughly three and a half years of version 5 being the latest kernel,
|
||||
and almost exactly three years of version 6 being the latest kernel.
|
||||
|
||||
Again, we have to keep the curve of unique installations in mind for absolute numbers like these:
|
||||
Kernel 5 was released right as the massive increase in unique Void Linux installation reports happened,
|
||||
and kernel 6 right after the report slump happened.
|
||||
This, in all likelihood, accounts for the slight imbalance between the numbers,
|
||||
and will shift over the coming months.
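
One way to take the changing number of reports out of the picture is to look at each major version's share of reports per period instead of absolute counts. The following is only a sketch of that idea, not something done in the analysis above; the function name, the `(week, major)` input shape and the sample data are invented for illustration.

```python
from collections import Counter, defaultdict


def weekly_shares(reports):
    """Map each week to {major_version: share of that week's reports}.

    `reports` is an iterable of (week, major_version) pairs, one per report.
    """
    counts = defaultdict(Counter)
    for week, major in reports:
        counts[week][major] += 1

    shares = {}
    for week, per_major in counts.items():
        total = sum(per_major.values())
        shares[week] = {major: n / total for major, n in per_major.items()}
    return shares


# Made-up sample: three reports in one week, one in the next.
sample = [("2022-W40", 5), ("2022-W40", 5), ("2022-W40", 6), ("2022-W41", 6)]
for week, per_major in sorted(weekly_shares(sample).items()):
    print(week, per_major)
```

Because every week sums to 1, a week with few reports weighs as much as a week with many, which is exactly the property that absolute counts lack.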
|
||||
|
||||
Just like with kernel suffixes, for this analysis we also had to exclude
|
||||
{kernel_df_v99.select(pl.len()).item()} rows which were apparently from the
|
||||
future --- as they were running variations of major kernel version 99. In all
|
||||
likelihood there is a custom compiled kernel version out there which reports its own
|
||||
major version as 99. The strange version starts appearing on
|
||||
{kernel_df_v99.select("date").row(0)[0]} and shows up all the way until
|
||||
{kernel_df_v99.select("date").row(-1)[0]}.
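
The exclusion described above can be sketched in plain Python as follows. This is a simplified stand-in for the actual dataframe code: the `(date, major)` row shape, the function name `split_outliers`, the `max_major` cutoff and the sample rows are all assumptions made for this example.

```python
from datetime import date


def split_outliers(rows, max_major=6):
    """Split (date, major_version) rows into plausible and implausible ones.

    Rows whose major version exceeds `max_major` are returned separately,
    sorted by date, so the first and last sighting can be read off directly.
    """
    kept = [row for row in rows if row[1] <= max_major]
    excluded = sorted(row for row in rows if row[1] > max_major)
    return kept, excluded


# Made-up sample rows; the dates are not taken from the real data.
rows = [
    (date(2021, 3, 1), 5),
    (date(2021, 3, 2), 99),
    (date(2021, 3, 5), 99),
    (date(2021, 3, 6), 6),
]
kept, excluded = split_outliers(rows)
print("excluded rows:", len(excluded))
print("first seen:", excluded[0][0], "last seen:", excluded[-1][0])
```

A fixed cutoff is the simplest possible rule and has to be raised by hand once a new major kernel version is released.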
|
||||
|
||||
Let's turn to the actual adoption of kernels over time in the next visualization.
|
||||
|
||||
```{python}
|
||||
from notebooks.popcorn import plt_kernel_timeline
|
||||
|
|
@ -356,16 +416,31 @@ last_kernel5: date = weekly_kernel_df.filter(pl.col("major_ver") == "5")[-1][
|
|||
].item()
|
||||
```
|
||||
|
||||
A timeline analysis of the kernels used to report daily downloads shows that people generally
|
||||
adopt new major kernel versions at roughly the same time. This change is especially stark between
|
||||
major kernel versions 5 and 6, which seem to have traded places in usage almost overnight.
|
||||
A timeline analysis of the prevalent kernels in the data shows that new major
|
||||
kernel versions are adopted relatively rapidly and with the majority of switches
|
||||
occurring at roughly the same time.
|
||||
|
||||
The first time that major version 5 of the kernel shows up is on {first_kernel5}. From here, it
|
||||
took a long time for the last of the version 4 kernels to disappear, coinciding with the big
|
||||
switch between major version 5 and 6. The last time a major version 4 is seen is on
|
||||
{last_kernel4}, while the last major version 5 kernels still pop up.
|
||||
It would seem, then, that the people still running kernel version 4 used the opportunity of
|
||||
everybody switching to the stable version of 6 to also upgrade their machines.
|
||||
This change is especially stark between major kernel versions 5 and 6, which
|
||||
seem to have traded places in usage almost overnight. A reasonable speculation
|
||||
for this rapid switch is that the `linux` kernel meta-package was pointed at
|
||||
the new version at that time, so each update pulled the new kernel.
|
||||
|
||||
The first time that major version 5 of the kernel shows up is on
|
||||
{first_kernel5}. From here, it took a long time for the last of the version 4
|
||||
kernels to disappear. Interestingly, this roughly coincides with the big switch
|
||||
between major version 5 and 6. The last time a major version 4 is seen is on
|
||||
{last_kernel4}, while the last major version 5 kernels still pop up. It would
|
||||
seem, then, that the people still running kernel version 4 used the opportunity
|
||||
of everybody switching to the stable version of 6 to also upgrade their
|
||||
machines.
|
||||
|
||||
If we cautiously extrapolate a little from the data we have, it would seem
|
||||
reasonable that the last remnants of kernel version 5 may be disappearing
|
||||
around May or June 2026. A lot of course depends on the upstream kernel release
|
||||
windows and the stability of the releases themselves. But barring any major
|
||||
upheavals in the kernel releases (of a magnitude like the removal of
|
||||
[bcachefs](https://en.wikipedia.org/wiki/Bcachefs)) or major stability issues,
|
||||
this seems a reasonable assumption to me.
|
||||
|
||||
## Appendix: Odds and Ends
|
||||
|
||||
|
|
|
|||