Add conclusion and data sourcing

Marty Oehme 2025-10-08 21:15:22 +02:00
parent afa2f6ffb2
commit 77a3a65c4e
Signed by: Marty
GPG key ID: 4E535BC19C61886E

@ -451,6 +451,17 @@ upheavals in the kernel releases (of a magnitude like the removal of
[bcachefs](https://en.wikipedia.org/wiki/Bcachefs)) or major stability issues,
this seems a reasonable assumption to me.
## Conclusion
That's it for the main look at the packages and kernel versions in use in the
Void Linux community, currently and in the past.
There are of course more observations to be made.
One that still interests me is the development of the dominant packages over time ---
were the top packages relatively static or did they evolve from others?
<!-- Another interesting analysis that comes to mind is the... [TODO:] -->
## Appendix: Odds and Ends
The above graphics are the main ones that I think could be useful, entertaining, or somewhere in between.
@ -606,29 +617,32 @@ is not called out in the main article. As it is --- an interesting fact, and,
were this a more rigorous investigation, perhaps worthy of taking into account
as biasing the result, but for our purposes not too bad.
## Outline
### The code and the data
- intro
- filesize
- unique installations reported from
- packages -> perhaps find new subcategories
- global
- relative (pkg/unique)
- top packages
- rare packages?
- install distribution
- packages per time unit (find clever title, e.g. 'accumulated packages')
- per year?
- weekday
- month of year (combine with weekday?)
- kernels
- overall kernel version installations
- kernels over time
All the data used in the previous sections originally comes from
<https://popcorn.voidlinux.org>. The collected data, in csv form, is available
from [this repository](https://git.martyoeh.me/datasci/ds-voidlinux-popcorn).
- misc
- missing days
- moved days
If you want to take a closer look at the functions creating the plots and
tables above, they are all available in [this repository]() in the
`notebooks/popcorn.py` file. It is the first project I have mostly written
using [marimo](https://github.com/marimo-team/marimo) instead of Jupyter, and I
have to say I really enjoy its workflow.
- things we can't see (limitations)
- packages on offer in the repositories
- this could shed light on the bumps of users and relative package ownership
To figure out which function creates which plot, look up the function name
in the relevant cell imported into the `popcorn.qmd` Quarto file in the
repository root, then search for it in the marimo notebook file.
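
If you would rather not search by hand, the lookup can be sketched in a few
lines of Python. This is only a rough illustration, assuming the notebook is a
plain Python file that the standard `ast` module can parse; the helper
`find_function` and the name `plot_kernels` are made up for the example and are
not taken from the repository.

```python
import ast
from typing import Optional


def find_function(source: str, name: str) -> Optional[int]:
    """Return the 1-based line number where `name` is defined, or None."""
    tree = ast.parse(source)
    # ast.walk also visits nested definitions, e.g. helpers defined inside a cell.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            return node.lineno
    return None


# Hypothetical stand-in for the notebook content; the real file will differ.
example_source = """
def load_data():
    return []

def plot_kernels():
    return None
"""

print(find_function(example_source, "plot_kernels"))
```

To try it on the real notebook, read the file first, for example with
`pathlib.Path("notebooks/popcorn.py").read_text()`, and pass in the function
name you found in `popcorn.qmd`.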
Feel free to use any files or parts of this analysis for your own purposes. All
my own content, including this analysis, is released under
[CC BY-NC 4.0](http://creativecommons.org/licenses/by-nc/4.0/).
If you have any ideas for further analysis, don't hesitate to let me know.
If you spot any errors or other issues, please let me know as well.
I assume most people reading this are already familiar with Void Linux and
using it, but if not and any of this piques your interest, feel free to
take [Void Linux](https://voidlinux.org/) for a spin. Don't forget to install
and enable
[PopCorn](https://github.com/void-linux/void-packages/tree/master/srcpkgs/PopCorn)
so that you too can contribute to the future statistics above ;-)