Overhaul text cells
parent 08737e1baa
commit 094aa34758
1 changed files with 46 additions and 30 deletions
76
popcorn.py
@@ -75,28 +75,26 @@ def _():
 r"""
 ## Daily statistics file size
 
-The simplest operation we can do is look at the overall file size for each
-of the daily statistics files over time. The files consist of a long list
-of packages which have been downloaded from the repositories that day,
-along with the number of downloads. It also consists of the same list
-separated by specifically downloaded versions of packages, so if somebody
-downloads v0.9.1 and somebody else downloads v0.9.3 this would count both
-downloads separately.
+The simplest operation we can do is look at the overall file size for each of the daily
+statistics files over time. The files consist of a long list of packages which have been checked
+from the repositories that day, along with the number of package instances. It also consists of
+the same list separated by specifically installed versions of packages, so if somebody has
+v0.9.1 and somebody else v0.9.3 instead this would count both packages separately.
 
-Another count is the number of different Kernels that have been used to
-download (or downloaded?) from the repositories.
+Another count is the number of different Kernels that have been used on that day, with their
+exact kernel name including major version, minor version and any suffix.
 
-These are the major things that will lead to size increases in the file,
-but not just for an increased amount of downloads --- we will get to those shortly.
+These are the major things that will lead to size increases in the file, but not just for an
+increased amount of absolute users, packages or uploads --- we will get to those shortly.
 
-No, an increase in file size here mainly suggests an increase in the
-'breadth' of files on offer in the repository, whether that be a wider
-variety of program versions or more different packages that people are
-interested in.
+No, an increase in file size here mainly suggests an increase in the 'breadth' of files on offer
+in the repository, whether that be a wider variety of program versions or more different
+packages that people are interested in, and those that the community chooses to use.
+
+So while the overall amount of packages gives a general estimate of the interest in the
+distribution, this can show a more 'distributor'-aligned view on how many different aisles of
+the buffet people are eating from.
 
-So while the overall amount of downloads gives a general estimate of the
-interest in the distribution, this can show a more 'distributor'-aligned
-view on how many different aisles of the buffet people are eating from.
 """
 )
 return
@@ -122,13 +120,18 @@ def _():
 mo.md(
 r"""
 
-As we can see, the difference over time is massive. Especially early on,
-between 2019 and the start of 2021, the amount of different stuff
-downloaded grew rapidly, with the pace picking up again starting 2023.
+As we can see, the difference over time is massive. Especially early on, between 2019 and the
+start of 2021, the amount of different packages and package versions used grew rapidly, with the
+pace picking up once again starting 2023.
 
-There are a few outliers with a size of 0 kB, which we will remove from the
-data. There are also a few days where the modification date of the file
-does not correspond to the represented statistical date.
+There are a few outlier days with a size of 0 kB, which we will remove from the data. In all
+likelihood, those days were not reported correctly or there was some kind of issue on the
+backend so the stats for those days are lost.
 
+There are also a few days where the modification date of the file does not correspond to the
+represented statistical date but those are kept. This rather points to certain times when the
+files have been moved on the backend, or recreated externally but does not mean the data are
+bad.
+
 """
 )
@@ -159,14 +162,15 @@ def _():
 def _():
 mo.md(
 r"""
-## Download statistics
+## Package statistics
 
-Now that we have an idea of how the overall interest in the distribution
-has changed over time, let's look at the actual download statistics.
+Now that we have an idea of how the overall interest in the distribution has changed over time,
+let's look at the actual package statistics.
 
-The popcorn files contain two main pieces of information: the number of
-unique installs (i.e. unique machines downloading packages) and the number
-of downloads per package. We will look at both of these in turn.
+The popcorn files contain two main pieces of information: the number of installs per package
+(e.g. how many people have rsync installed) and the number of unique installs (i.e. unique
+machines providing statistics). We will look at both of these in turn.
+
 """
 )
 return
@@ -195,6 +199,18 @@ def _(df_pkg_lazy: pl.LazyFrame):
 return
 
 
+@app.cell(hide_code=True)
+def _():
+mo.md(
+r"""
+
+The amount of packages installed on all machines increases strongly over time.
+
+"""
+)
+return
+
+
 @app.cell
 def _(df_pkg_lazy: pl.LazyFrame):
 def _():
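
For reference, the outlier handling described in the second hunk (dropping days whose statistics file is 0 kB, while keeping files whose modification date does not match the statistical date) might look roughly like the sketch below. It is not taken from popcorn.py: the function name `daily_file_sizes`, the flat directory layout and the `*.txt` pattern are all assumptions made for illustration.

```python
from pathlib import Path


def daily_file_sizes(stats_dir: Path, pattern: str = "*.txt") -> dict[str, int]:
    """Map each daily statistics file name to its size in bytes.

    Hypothetical helper: the directory layout and file pattern are assumed,
    not read from the notebook.
    """
    sizes: dict[str, int] = {}
    for path in sorted(stats_dir.glob(pattern)):
        size = path.stat().st_size
        if size == 0:
            # 0 kB outlier day: the stats for that day are treated as lost.
            continue
        # The modification time is deliberately not checked here, so files
        # that were moved or recreated later are still kept.
        sizes[path.name] = size
    return sizes
```

Filtering on size alone matches the text's reasoning that a mismatched modification date points to backend housekeeping and not to bad data.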