Recent comments posted to this site:

As of right now, the assistant has only a secondary benefit over plain git-annex, inasmuch as there's a lot more code activity.

As soon as the assistant supports Android, I will use it for syncing photos off my phone, and may start to use the assistant on my usual repos as a natural consequence.

Additionally, there's a subjective feeling of more bugs being reported. That may or may not be true, but as long as there's no Android port, I don't have an actual reason to "risk" it.

-- Richard

PS: This seems to be the first poll where you can place only one vote while it's the first where I really wanted to vote on two separate items.

Comment by https://www.google.com/accounts/o8/id?id=AItOawl9sYlePmv1xK-VvjBdN-5doOa_Xw-jH4U Thu Jan 26 19:33:52 2023

hi, check out http://puffingdev.com/live-gource-visually-display-your-svngit-activity-live/

You can use it to display LIVE updates of a git repo, nice to have on a screen in your company's common area for example.

Comment by https://www.google.com/accounts/o8/id?id=AItOawm9IQStaE1el95_9s77CgmJhxZwCwUeN9A Thu Jan 26 19:33:52 2023

What's the difference between source and unwanted?

  • source (not copies=1) will keep files that have fewer than 1 copy, meaning zero copies, meaning no files.
  • unwanted will exclude all files.

Both lead to the same result: all files are moved elsewhere. Right?

Comment by http://mildred.fr/ Thu Jan 26 19:33:52 2023

Is there going to be an update of git-annex in debian squeeze-backports to a version that supports repository version 3? Thx

Comment by https://www.google.com/accounts/o8/id?id=AItOawla7u6eLKNYZ09Z7xwBffqLaXquMQC07fU Thu Jan 26 19:33:52 2023

Does this work for special remotes? Or only remotes that are full repositories?

Comment by mark Thu Jan 26 19:33:52 2023

You can also use git log --stat -S'SHA256E-...'

Comment by Lukey Thu Jan 26 19:33:52 2023

Under systemd, git-annex assistant --autostart --foreground either ignores --foreground or quits immediately.

I managed to have the process stay alive with RemainAfterExit=on:

[Service]
User=%i
ExecStart=/usr/bin/git-annex assistant --autostart --foreground
ExecStop=/usr/bin/git-annex assistant --autostop
RemainAfterExit=on
Restart=on-failure
RestartSec=5

but the git-annex processes do not keep the --foreground option, which causes a lot of zombies over time (not totally clear why).

My current solution is to have a service for each annex repository and avoid --autostart, but this is annoying because it requires passing the path as %I and wrapping git-annex in a bash script to get the repo owner as the user.
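
A minimal sketch of what such a wrapper might look like, not the actual script in use. The function names are hypothetical, `stat -c` assumes GNU coreutils, and `runuser` assumes util-linux and a service that starts as root:

```shell
# Hypothetical wrapper sketch; names and layout are assumptions.

# Print the login name of the user who owns the given directory (GNU stat).
repo_owner() {
    stat -c '%U' -- "$1"
}

# Run the assistant in the foreground for one repository, as its owner.
# exec replaces the shell, so systemd tracks the git-annex process itself.
run_assistant() {
    owner=$(repo_owner "$1") || return 1
    cd "$1" || return 1
    exec runuser -u "$owner" -- git-annex assistant --foreground
}
```

A wrapper script would end with `run_assistant "$1"`, and a template unit would pass the repository path to it as %I.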

Comment by oberix Thu Jan 26 19:33:52 2023

Summary

Just to make it explicit: --known mode operates on the annex only. If trying to reinject a file that is stored in the regular git part of the repository, and therefore practically known, git-annex-reinject will consider it not known.

Context

I'm currently using git-annex reinject --known to tidy a pre-git-annex storage. It gets progressively near-emptied of big files, letting unknown files stand out in the deserted directory hierarchy.

Yet only actually annexed files will get removed.

In my case big files are pictures (NEF, JPG), and regular git files are xmp metadata files used by http://darktable.org/ to store processing parameters. So, all xmp files linger there, whether they were committed in git or not, needing separate handling.

How to detect if a file is known to the regular git repository (not the annex)

There must be a number of ways. I just hacked one:

HASH=$( git hash-object "$FILEPATH" )
# cat-file -e exits 0 only if the object exists in the repository
if git cat-file -e "$HASH" 2>/dev/null
then
        echo "Known $FILEPATH"
else
        echo "Unknown $FILEPATH"
fi

This can be wrapped into a helper function and used in a find | ... one-liner to remove any file already known to git.
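
One way this could look, as a sketch only: the function names `known_to_git` and `list_known` are made up for illustration, it must be run from inside the git repository, and it only lists matches rather than deleting anything. File names containing newlines are not handled.

```shell
# Succeed if the file's content already exists as an object in the
# current repository (run this from inside the repository).
known_to_git() {
    git cat-file -e "$(git hash-object -- "$1")" 2>/dev/null
}

# Dry run: print every regular file under the given directory whose
# content git already has. Nothing is removed.
list_known() {
    find "$1" -type f | while IFS= read -r f; do
        if known_to_git "$f"; then
            printf 'Known %s\n' "$f"
        fi
    done
}
```

Calling `list_known /path/to/old/storage` from the repository prints the candidates. Replacing the `printf` with `rm` would do the actual removal, with the caveats below in mind.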

Caveats

git cat-file will probably consider known any file actually stored within git objects, even if it is on a deleted branch or in some other situation where it is not reachable. As a result, removing files based on this test may well lose information, not immediately, but on some subsequent git gc.

Such a caveat is not surprising, as regular git content and annexed content have differing "scopes"/lifetimes.

Question

Joey, is there an alternative to git-annex-reinject --known that considers regular git content, too? Perhaps it's a pure git issue and therefore not part of git-annex's job?

A quick test of git-annex-import --clean-duplicates shows similar behavior.

Comment by https://launchpad.net/~stephane-gourichon-lpad Thu Jan 26 19:33:52 2023

Why is uninit so slow? Shouldn't it simply consist of moving the files out of the annex and then deleting the .git folder?

Comment by chocolate.camera Thu Jan 26 19:33:52 2023

When you run into weird "no space left on device" errors while running git annex repair, although there clearly is enough room on your disk, it probably means that your /tmp is too small (you can verify with watch -n0 df -h or just df -h). You can instruct git annex repair to use a different directory for intermediate storage via the TMPDIR environment variable:

mkdir /path/to/dir/with/enough/space
TMPDIR=/path/to/dir/with/enough/space git annex repair
# Now that new dir is used and the 'no space left on device' error should disappear

Might be worth adding a note for TMPDIR to the git annex repair manpage, @joey?
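
To compare candidate directories before choosing one, a small sketch like the following could help. The helper name `free_kib` is hypothetical, and it relies only on the POSIX `df -P` output format:

```shell
# Print the free space, in KiB, of the filesystem holding the given directory.
# df -P guarantees one line per filesystem; column 4 is "Available".
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

tmp_dir="${TMPDIR:-/tmp}"
echo "free in $tmp_dir: $(free_kib "$tmp_dir") KiB"
```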

Comment by nobodyinperson Thu Jan 26 19:33:52 2023