diff --git a/popcorn.qmd b/popcorn.qmd index dc62b95..c8c1832 100644 --- a/popcorn.qmd +++ b/popcorn.qmd @@ -130,37 +130,50 @@ up again, if at a more mellow pace. Now that we have an idea of how the overall reported sizes in the distribution have changed over time, let's focus on the actual package statistics. -The popcorn files contain two main pieces of information: the number of -installs per package (e.g. how many people have `rsync` installed) and the -number of unique installs (i.e. how many people provide their statistics). We -will look at both of these in turn. +The popcorn files contain two pieces of information we're interested in: the +number of installs per package (e.g. how many people have `rsync` installed) +and the number of unique installs (i.e. how many people provide their +statistics). We will look at both of these in turn. ```{python} from notebooks.popcorn import plt_weekly_packages pplot(plt_weekly_packages) ``` -The number of packages overall strongly rises until early 2021, -when it stagnates a little before rising more slowly again afterwards. -The pattern strongly mirrors the curve we saw before for the daily filesize. +The number of packages installed overall strongly rises until early 2021, when +it stagnates a little before rising more slowly again afterwards. The pattern +strongly mirrors the curve we saw before for the daily filesize. There is a +curious dip visible in the data in early 2021 which seems to say fewer packages +were installed during most of 2022 compared to 2021. -Turning to the daily unique uploads, we can see a similar pattern, though even -more strongly pronounced. +The graph above traces the _absolute_ number of package installations for each +week during the data collection period. That means, a simple sum of the number +of all currently installed packages for each day. 
+ +So to figure out one possible reason for the dip, let's turn to the daily +unique uploads, where we can see a similar pattern, though even more strongly +pronounced. ```{python} from notebooks.popcorn import plt_unique_installs pplot(plt_unique_installs) ``` -Unique installations rise sharply until early 2020. Then they not just stagnate -but shrink for the next three years. It is only early 2023 when the numbers -recover and begin rising again slowly. +This graph similarly shows the _absolute_ number, this time of unique Void +Linux installations counted for each day. Of course, these are only the +_reported_ installations (since we don't know about unreported ones), as can be seen +in the overall small number of between 100 and 120 installations. -We also have one day on 05 July 2024 which has significantly fewer unique -uploads (36 only) than all the other days around it. I have no clue if -something happened to data collection or everybody collectively decided to -leave their PC offline just for that day, but the numbers are back to normal -the day after.[^independence-day] +Unique installations rise sharply until early 2020. Then they not just stagnate +but shrink for the next three years. It is only from early 2023 onwards that +the numbers recover and begin rising again slowly. + +We also have one day on 05 July 2024 which has _significantly_ fewer unique +uploads (36 only) than all the other days around it. It reflects in the graph +as a single week dipping down in 2024, but would look more egregious on a +daily accumulation. 
I have no clue if something happened to data collection or +everybody collectively decided to leave their PC offline just for that day, but +the numbers are back to normal the day after.[^independence-day] [^independence-day]: I suppose one interpretation would be people taking their @@ -176,17 +189,23 @@ statistics the absolute number of package installations will be somewhat reduced as a result, unless for some reason the remaining people all of a sudden start having many more packages installed. -Let's check that out next, by actually looking at the installed packages _per -user_ for each day. +So this could be one reason for the dip in reported package ownership. The +decrease in daily reports maps relatively cleanly onto the dip in absolute +packages, and makes sense from a conceptual standpoint: fewer reports mean +fewer overall reported packages. + +Next, let's verify that hunch by actually looking at the installed packages +_per user_ for each day. ```{python} from notebooks.popcorn import plt_pkg_relative pplot(plt_pkg_relative) ``` -Combining both stats to look at the installed packages at a more individual -level per user, we see this confirmed. There is no similarly strong dip for the -relative package ownership as there was for the absolute package numbers. +Combining both previous stats to look at the installed packages at a more +individual level per user, I think we see our hunch confirmed. There is no +similarly strong dip for the relative package ownership as there was for the +absolute package numbers. Indeed, with the exception of a small more rapid increase in individual package ownership in 2019, we see a much more stable increase in per-user packages than @@ -210,17 +229,30 @@ couple users having a much larger package ownership than everybody else. This may signify new users checking out Void Linux and downloading a large variety of packages in the process. 
- +Perhaps a similar pattern is visible in the higher number of packages per user +in 2019. With even fewer unique daily reports (between 20 and 60 for the year), +single users' package count differences reflect much more drastically on this +graph. So, one possibility for the rapid decrease followed by a more linear +increase is the 'balancing' of package ownerships across the (wider) reported +community. -For a breakdown of the absolute numbers of packages on systems by weekday and -month of the year instead of over time, see the Appendix below. +Want to know how many packages you currently have installed? Find out with a +quick `xbps-query -m | wc -l` to count all your explicitly installed packages. +I currently have 234, so I'm below average for this cohort (indeed, I am much +more of an average 2021 Void Linux kid, it appears). + + +For an additional breakdown of the absolute numbers of packages on systems by +weekday and month of the year instead of over time, have a look at the Appendix +below. An interesting trend is visible toward the end of the timeline window, with a -rapid decline in package numbers per user. It is too early for to clearly see -if this is just variability or an actual trend in the data. +rapid decline in package numbers per user starting in early 2025. It is too +early to clearly see if this is just variability or an actual trend in the +data, but it is very interesting to see. -Beyond pure installation numbers, let's take a look at the actual top-installed -packages on users' systems. +Beyond pure installation numbers, let's also take a look at the actual +packages which take the top-installed spots on users' systems.