[V2,1/2] scripts/cve: Avoid to do a complete clone of cve git repository

Message ID 20240903085745.2594893-1-michael@amarulasolutions.com
State New

Commit Message

Michael Trimarchi Sept. 3, 2024, 8:57 a.m. UTC
Just a simple clone and pull with --depth 1 should be enough to parse the
CVEs and generate the pkg-stats report.

Comparing a full clone with a depth-1 clone, the size on disk is 2.9GiB
vs. 2.2GiB. The download size does change: from 983.55MiB down to
270.78MiB. It's a net time win too: 2m17s vs. 1m7s (on a 100Mbps link).

Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
---
V1->V2:
    - Add statistics from Yann E. Morin
    - Use git pull --depth 1 to update the repo

---
 support/scripts/cve.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Thomas Petazzoni Sept. 3, 2024, 6:50 p.m. UTC | #1
On Tue,  3 Sep 2024 10:57:44 +0200
Michael Trimarchi <michael@amarulasolutions.com> wrote:

> Just a simple clone and pull with --depth 1 should be enough to parse the
> cve and generate the pkg-stats report.
> 
> From a full clone and a depth-1 clone, and the size delta is 2.9GiB vs. 2.2GiB.
> The download size does change: from 983.55MiB down to 270.78MiB.
> it's a net time win too: 2m17s vs 1min7s (on a 100Mbps link).
> 
> Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
> ---
> V1->V2:
>     - Add statistics from Yann E. Morin
>     - Use git pull --depth 1 for update the repo

Applied to next, thanks!

Thomas
Yann E. MORIN Sept. 3, 2024, 7:24 p.m. UTC | #2
Michael, All,

On 2024-09-03 20:50 +0200, Thomas Petazzoni via buildroot spake thusly:
> On Tue,  3 Sep 2024 10:57:44 +0200
> Michael Trimarchi <michael@amarulasolutions.com> wrote:
> 
> > Just a simple clone and pull with --depth 1 should be enough to parse the
> > cve and generate the pkg-stats report.
> > 
> > From a full clone and a depth-1 clone, and the size delta is 2.9GiB vs. 2.2GiB.
> > The download size does change: from 983.55MiB down to 270.78MiB.
> > it's a net time win too: 2m17s vs 1min7s (on a 100Mbps link).
> > 
> > Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
> > ---
> > V1->V2:
> >     - Add statistics from Yann E. Morin
> >     - Use git pull --depth 1 for update the repo
> 
> Applied to next, thanks!

I don't understand: Michael said in the first iteration that we should
drop the package:
    https://lore.kernel.org/buildroot/CAOf5uw=m4LOk97OT1dTP=2-uP6QZ1WQyHfKYSQmWCirDaxXvgQ@mail.gmail.com/

Only the first clone is slow, the following calls will just pull (mostly
nothing most of the time), so the optimisation is not really worth it.

Also, in the download backend for git, we stopped doing shallow clones
because they were causing issues (but that might not be applicable here).

Anyway, too late, that's been applied...

Regards,
Yann E. MORIN.
Michael Trimarchi Sept. 3, 2024, 7:34 p.m. UTC | #3
Hi Yann

On Tue, Sep 3, 2024 at 9:24 PM Yann E. MORIN <yann.morin.1998@free.fr> wrote:
>
> Michael, All,
>
> On 2024-09-03 20:50 +0200, Thomas Petazzoni via buildroot spake thusly:
> > On Tue,  3 Sep 2024 10:57:44 +0200
> > Michael Trimarchi <michael@amarulasolutions.com> wrote:
> >
> > > Just a simple clone and pull with --depth 1 should be enough to parse the
> > > cve and generate the pkg-stats report.
> > >
> > > From a full clone and a depth-1 clone, and the size delta is 2.9GiB vs. 2.2GiB.
> > > The download size does change: from 983.55MiB down to 270.78MiB.
> > > it's a net time win too: 2m17s vs 1min7s (on a 100Mbps link).
> > >
> > > Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
> > > ---
> > > V1->V2:
> > >     - Add statistics from Yann E. Morin
> > >     - Use git pull --depth 1 for update the repo
> >
> > Applied to next, thanks!
>
> I don't understand: Michael said in the first iteration that we should
> drop the package:
>     https://lore.kernel.org/buildroot/CAOf5uw=m4LOk97OT1dTP=2-uP6QZ1WQyHfKYSQmWCirDaxXvgQ@mail.gmail.com/
>
> Only the first clone is slow, the following calls will just pull (mostly
> nothing most of the time), so the optimisation is not really worth it.
>
> Also, in the download backend for git, we stopped doing shallow clone
> because they were causing issues (but might not be applicable here).
>

I have reposted it with the pull fixed. Yann, if you really don't like
it, it can be dropped or reverted.

Michael

> Anyway, too late, that's been applied...
>
> Regards,
> Yann E. MORIN.
Thomas Petazzoni Sept. 12, 2024, 10:44 a.m. UTC | #4
Hello Michael,

On Tue,  3 Sep 2024 10:57:44 +0200
Michael Trimarchi <michael@amarulasolutions.com> wrote:

> Just a simple clone and pull with --depth 1 should be enough to parse the
> cve and generate the pkg-stats report.
> 
> From a full clone and a depth-1 clone, and the size delta is 2.9GiB vs. 2.2GiB.
> The download size does change: from 983.55MiB down to 270.78MiB.
> it's a net time win too: 2m17s vs 1min7s (on a 100Mbps link).
> 
> Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
> ---
> V1->V2:
>     - Add statistics from Yann E. Morin
>     - Use git pull --depth 1 for update the repo

I am sorry, but I had to revert this commit... as it doesn't work:

Updating from https://github.com/fkie-cad/nvd-json-data-feeds/
Traceback (most recent call last):
  File "/home/buildroot/buildroot-stats/./support/scripts/pkg-stats", line 1346, in <module>
    __main__()
  File "/home/buildroot/buildroot-stats/./support/scripts/pkg-stats", line 1335, in __main__
    check_package_cves(args.nvd_path, packages)
  File "/home/buildroot/buildroot-stats/./support/scripts/pkg-stats", line 660, in check_package_cves
    for cve in cvecheck.CVE.read_nvd_dir(nvd_path):
  File "/home/buildroot/buildroot-stats/support/scripts/cve.py", line 105, in read_nvd_dir
    CVE.download_nvd(nvd_git_dir)
  File "/home/buildroot/buildroot-stats/support/scripts/cve.py", line 74, in download_nvd
    subprocess.check_call(
  File "/usr/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['git', 'pull', '--depth', '1']' returned non-zero exit status 128.

if I go to the machine in question:

buildroot@buildroot:~/nvd/git$ git pull --depth 1
remote: Enumerating objects: 164, done.
remote: Counting objects: 100% (140/140), done.
remote: Compressing objects: 100% (27/27), done.
remote: Total 63 (delta 54), reused 44 (delta 36), pack-reused 0 (from 0)
Unpacking objects: 100% (63/63), 18.15 KiB | 26.00 KiB/s, done.
From https://github.com/fkie-cad/nvd-json-data-feeds
 + 51b091a6a4...7b69235976 main       -> origin/main  (forced update)
hint: You have divergent branches and need to specify how to reconcile them.
hint: You can do so by running one of the following commands sometime before
hint: your next pull:
hint: 
hint:   git config pull.rebase false  # merge
hint:   git config pull.rebase true   # rebase
hint:   git config pull.ff only       # fast-forward only
hint: 
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.
fatal: Need to specify how to reconcile divergent branches.

I had already done a git reset --hard to see if it was just a
one-time issue, but nope. So there's something that doesn't work here.

In case that matters, this machine has:

$ git --version
git version 2.39.2

Best regards,

Thomas Petazzoni
Michael Trimarchi Sept. 12, 2024, 10:48 a.m. UTC | #5
Hi Thomas

On Thu, Sep 12, 2024 at 12:44 PM Thomas Petazzoni
<thomas.petazzoni@bootlin.com> wrote:
>
> Hello Michael,
>
> On Tue,  3 Sep 2024 10:57:44 +0200
> Michael Trimarchi <michael@amarulasolutions.com> wrote:
>
> > Just a simple clone and pull with --depth 1 should be enough to parse the
> > cve and generate the pkg-stats report.
> >
> > From a full clone and a depth-1 clone, and the size delta is 2.9GiB vs. 2.2GiB.
> > The download size does change: from 983.55MiB down to 270.78MiB.
> > it's a net time win too: 2m17s vs 1min7s (on a 100Mbps link).
> >
> > Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
> > ---
> > V1->V2:
> >     - Add statistics from Yann E. Morin
> >     - Use git pull --depth 1 for update the repo
>
> I am sorry, but I had to revert this commit... as it doesn't work:
>
> Updating from https://github.com/fkie-cad/nvd-json-data-feeds/
> Traceback (most recent call last):
>   File "/home/buildroot/buildroot-stats/./support/scripts/pkg-stats", line 1346, in <module>
>     __main__()
>   File "/home/buildroot/buildroot-stats/./support/scripts/pkg-stats", line 1335, in __main__
>     check_package_cves(args.nvd_path, packages)
>   File "/home/buildroot/buildroot-stats/./support/scripts/pkg-stats", line 660, in check_package_cves
>     for cve in cvecheck.CVE.read_nvd_dir(nvd_path):
>   File "/home/buildroot/buildroot-stats/support/scripts/cve.py", line 105, in read_nvd_dir
>     CVE.download_nvd(nvd_git_dir)
>   File "/home/buildroot/buildroot-stats/support/scripts/cve.py", line 74, in download_nvd
>     subprocess.check_call(
>   File "/usr/lib/python3.11/subprocess.py", line 413, in check_call
>     raise CalledProcessError(retcode, cmd)
> subprocess.CalledProcessError: Command '['git', 'pull', '--depth', '1']' returned non-zero exit status 128.
>
> if I go to the machine in question:
>
> buildroot@buildroot:~/nvd/git$ git pull --depth 1
> remote: Enumerating objects: 164, done.
> remote: Counting objects: 100% (140/140), done.
> remote: Compressing objects: 100% (27/27), done.
> remote: Total 63 (delta 54), reused 44 (delta 36), pack-reused 0 (from 0)
> Unpacking objects: 100% (63/63), 18.15 KiB | 26.00 KiB/s, done.
> From https://github.com/fkie-cad/nvd-json-data-feeds
>  + 51b091a6a4...7b69235976 main       -> origin/main  (forced update)
> hint: You have divergent branches and need to specify how to reconcile them.
> hint: You can do so by running one of the following commands sometime before
> hint: your next pull:
> hint:
> hint:   git config pull.rebase false  # merge
> hint:   git config pull.rebase true   # rebase
> hint:   git config pull.ff only       # fast-forward only
> hint:
> hint: You can replace "git config" with "git config --global" to set a default
> hint: preference for all repositories. You can also pass --rebase, --no-rebase,
> hint: or --ff-only on the command line to override the configured default per
> hint: invocation.
> fatal: Need to specify how to reconcile divergent branches.
>
> I had already done a git reset --hard, and see if it was just a
> one-time issue, but nope. So there's something that doesn't work here.
>
> In case that matters, this machine has:
>
> $ git --version
> git version 2.39.2
>

Yes, sorry, I always use the latest tools: git version 2.43.0 here.

Makes sense. I will create a better test setup on my side.


Michael

> Best regards,
>
> Thomas Petazzoni

Patch

diff --git a/support/scripts/cve.py b/support/scripts/cve.py
index e25825581e..dcb3a63925 100755
--- a/support/scripts/cve.py
+++ b/support/scripts/cve.py
@@ -72,7 +72,7 @@  class CVE:
         print(f"Updating from {NVD_BASE_URL}")
         if os.path.exists(nvd_git_dir):
             subprocess.check_call(
-                ["git", "pull"],
+                ["git", "pull", "--depth", "1"],
                 cwd=nvd_git_dir,
                 stdout=subprocess.DEVNULL,
                 stderr=subprocess.DEVNULL,
@@ -82,7 +82,7 @@  class CVE:
             # happily clones into an empty directory.
             os.makedirs(nvd_git_dir)
             subprocess.check_call(
-                ["git", "clone", NVD_BASE_URL, nvd_git_dir],
+                ["git", "clone", "--depth", "1", NVD_BASE_URL, nvd_git_dir],
                 stdout=subprocess.DEVNULL,
                 stderr=subprocess.DEVNULL,
             )
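
A possible alternative, not part of this patch and not tested against the
real NVD mirror: one plausible reading of the failure reported in comment #4
is that each depth-1 fetch brings in a commit with no history in common with
the local shallow commit, so "git pull" has nothing to fast-forward and
refuses to choose between merge and rebase. The sketch below avoids the merge
step entirely by fetching at depth 1 and then resetting onto the upstream
branch. The helper names nvd_update_commands() and download_nvd_shallow() are
hypothetical; only the git options shown in the patch, plus "git fetch" and
"git reset --hard @{upstream}", are used.

```python
import os
import subprocess


def nvd_update_commands(url, git_dir):
    """Return (command, cwd) pairs for a shallow clone or a shallow refresh.

    Hypothetical helper: it only builds the command lines, so the logic
    can be inspected without touching the network.
    """
    if os.path.exists(git_dir):
        return [
            # Fetch only the newest commit. Unlike "git pull", fetch never
            # merges, so a forced update of origin cannot make it fail with
            # "divergent branches".
            (["git", "fetch", "--depth", "1", "origin"], git_dir),
            # Move the local branch and work tree onto the fetched commit.
            # This discards local changes, which is assumed to be fine for
            # a directory that is only a download cache.
            (["git", "reset", "--hard", "@{upstream}"], git_dir),
        ]
    # First run: git creates the target directory itself.
    return [(["git", "clone", "--depth", "1", url, git_dir], None)]


def download_nvd_shallow(url, git_dir):
    """Run the commands quietly, in the same style as cve.py does."""
    for cmd, cwd in nvd_update_commands(url, git_dir):
        subprocess.check_call(
            cmd,
            cwd=cwd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
```

Whether this is worth it is a separate question: as noted in comment #2, only
the first clone is slow, and shallow clones have caused issues elsewhere.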