This updates the repo index/file view endpoints so annex files match the way
LFS files are rendered, making annexed files accessible via the web instead of
being black boxes only accessible by git clone.
This mostly just duplicates the existing LFS logic. It doesn't try to combine itself
with the existing logic, to make merging with upstream easier. If upstream ever
decides to accept, I would like to try to merge the redundant logic.
The one bit that doesn't directly copy LFS is my choice to hide annex-symlinks.
LFS files are always _pointer files_ and therefore always render with the "file"
icon and no special label, but annex files come in two flavours: symlinks or
pointer files. I've conflated both kinds to try to give a consistent experience.
The tests in here ensure the correct download link (/media, from the last PR)
renders in both the toolbar and, if a binary file (like most annexed files will be),
in the main pane, but it also adds quite a bit of code to make sure text files
that happen to be annexed are dug out and rendered inline like LFS files are.
This moves the `annexObjectPath()` helper out of the tests and into a
dedicated sub-package as `annex.ContentLocation()`, and expands it with
`.Pointer()` (which validates using `git annex examinekey`),
`.IsAnnexed()` and `.Content()` to make it a more useful module.
The tests retain their own wrapper version of `ContentLocation()`
because I tried to follow close to the API modules/lfs uses, which in
terms of abstract `git.Blob` and `git.TreeEntry` objects, not in terms
of `repoPath string`s which are more convenient for the tests.
This makes HTTP symmetric with SSH clone URLs.
This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.
Previously, to access "open access" data shared this way,
users would need to:
1. Create an account on gitea.example.com
2. Create ssh keys
3. Upload ssh keys (and make sure to find and upload the correct file)
4. `git clone git@gitea.example.com:user/dataset.git`
5. `cd dataset`
6. `git annex get`
This cuts that down to just the last three steps:
1. `git clone https://gitea.example.com/user/dataset.git`
2. `cd dataset`
3. `git annex get`
This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.
Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See https://github.com/neuropoly/gitea/issues/7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.
This is not a major loss since no one wants uploading to be anonymous anyway.
To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).
This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/
Fixes https://github.com/neuropoly/gitea/issues/3
Co-authored-by: Mathieu Guay-Paquet <mathieu.guaypaquet@polymtl.ca>
Fixes https://github.com/neuropoly/gitea/issues/11
Tests:
* `git annex init`
* `git annex copy --from origin`
* `git annex copy --to origin`
over:
* ssh
for:
* the owner
* a collaborator
* a read-only collaborator
* a stranger
in a
* public repo
* private repo
And then confirms:
* Deletion of the remote repo (to ensure lockdown isn't messing with us: https://git-annex.branchable.com/internals/lockdown/#comment-0cc5225dc5abe8eddeb843bfd2fdc382)
------
To support all this:
* Add util.FileCmp()
* Patch withKeyFile() so it can be nested in other copies of itself
-------
Many thanks to Mathieu for giving style tips and catching several bugs,
including a subtle one in util.filecmp() which neutered it.
Co-authored-by: Mathieu Guay-Paquet <mathieu.guay-paquet@polymtl.ca>
[git-annex](https://git-annex.branchable.com/) is a more complicated cousin to
git-lfs, storing large files in an optional-download side content. Unlike lfs,
it allows mixing and matching storage remotes, so the content remote(s) doesn't
need to be on the same server as the git remote, making it feasible to scatter
a collection across cloud storage, old harddrives, or anywhere else storage can
be scavenged. Since this can get complicated, fast, it has a content-tracking
database (`git annex whereis`) to help find everything later.
The use-case we imagine for including it in Gitea is just the simple case, where
we're primarily emulating git-lfs: each repo has its large content at the same URL.
Our motivation is so we can self-host https://www.datalad.org/ datasets, which
currently are only hostable by fragilely scrounging together cloud storage --
and having to manage all the credentials associated with all the pieces -- or at
https://openneuro.org which is fragile in its own ways.
Supporting git-annex also allows multiple Gitea instance to be annex remotes for
each other, mirroring the content or otherwise collaborating the split up the
hosting costs.
Enabling
--------
TODO
HTTP
----
TODO
Permission Checking
-------------------
This tweaks the API in routers/private/serv.go to expose the calling user's
computed permission, instead of just returning HTTP 403.
This doesn't fit in super well. It's the opposite from how the git-lfs support is
done, where there's a complete list of possible subcommands and their matching
permission levels, and then the API compares the requested with the actual level
and returns HTTP 403 if the check fails.
But it's necessary. The main git-annex verbs, 'git-annex-shell configlist' and
'git-annex-shell p2pstdio' are both either read-only or read-write operations,
depending on the state on disk on either end of the connection and what the user
asked it to ask for, with no way to know before git-annex examines the situation.
So tell the level via GIT_ANNEX_READONLY and trust it to handle itself.
In the older Gogs version, the permission was directly read in cmd/serv.go:
```
mode, err = db.UserAccessMode(user.ID, repo)
```
- 966e925cf3/internal/cmd/serv.go (L334)
but in Gitea permission enforcement has been centralized in the API layer.
(perhaps so the cmd layer can avoid making direct DB connections?)
Deletion
--------
git-annex has this "lockdown" feature where it tries
really quite very hard to prevent you deleting its
data, to the point that even an rm -rf won't do it:
each file in annex/objects/ is nested inside a
folder with read-only permissions.
The recommended workaround is to run chmod -R +w when
you're sure you actually want to delete a repo. See
https://git-annex.branchable.com/internals/lockdown
So we edit util.RemoveAll() to do just that, so now
it's `chmod -R +w && rm -rf` instead of just `rm -rf`.
Backport #28797 by @lunny
Fix#28694
Generally, `refname:short` should be equal to `refname:lstrip=2` except
`core.warnAmbiguousRefs is used to select the strict abbreviation mode.`
ref:
https://git-scm.com/docs/git-for-each-ref#Documentation/git-for-each-ref.txt-refname
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
(cherry picked from commit 4746291b0863eaf9c629c944bd48295ca2f69d2f)
Backport #26745Fixes#26548
This PR refactors the rendering of markup links. The old code uses
`strings.Replace` to change some urls while the new code uses more
context to decide which link should be generated.
The added tests should ensure the same output for the old and new
behaviour (besides the bug).
We may need to refactor the rendering a bit more to make it clear how
the different helper methods render the input string. There are lots of
options (resolve links / images / mentions / git hashes / emojis / ...)
but you don't really know what helper uses which options. For example,
we currently support images in the user description which should not be
allowed I think:
<details>
<summary>Profile</summary>
https://try.gitea.io/KN4CK3R

</details>
(cherry picked from commit 022552d5b6adc792d3cd16df7de6e52cb7b41a72)
Backport #28796 by @wxiaoguang
`resp != nil` doesn't mean the request really succeeded. Add a comment
for requestJSONResp to clarify the behavior.
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
(cherry picked from commit cbf366643bfbc89a1fbc8a149e31abf19c60d6a9)
Backport #28708 by wxiaoguang
Regression of #27723Fix#28705
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
(cherry picked from commit 7f833d8f71b71b0e983ed979c5ad8088aec7fb7d)
- The current architecture is inherently insecure, because you can
construct the 'secret' cookie value with values that are available in
the database. Thus provides zero protection when a database is
dumped/leaked.
- This patch implements a new architecture that's inspired from: [Paragonie Initiative](https://paragonie.com/blog/2015/04/secure-authentication-php-with-long-term-persistence#secure-remember-me-cookies).
- Integration testing is added to ensure the new mechanism works.
- Removes a setting, because it's not used anymore.
(cherry picked from commit eff097448b1ebd2a280fcdd55d10b1f6081e9ccd)
[GITEA] rework long-term authentication (squash) add migration
Reminder: the migration is run via integration tests as explained
in the commit "[DB] run all Forgejo migrations in integration tests"
(cherry picked from commit 4accf7443c1c59b4d2e7787d6a6c602d725da403)
(cherry picked from commit 99d06e344ebc3b50bafb2ac4473dd95f057d1ddc)
(cherry picked from commit d8bc98a8f021d381bf72790ad246f923ac983ad4)
(cherry picked from commit 6404845df9a63802fff4c5bd6cfe1e390076e7f0)
(cherry picked from commit 72bdd4f3b9f6509d1ff3f10ecb12c621a932ed30)
(cherry picked from commit 4b01bb0ce812b6c59414ff53fed728563d8bc9cc)
(cherry picked from commit c26ac318162b2cad6ff1ae54e2d8f47a4e4fe7c2)
(cherry picked from commit 8d2dab94a6)
Conflicts:
routers/web/auth/auth.go
https://codeberg.org/forgejo/forgejo/issues/2158
- https://github.com/NYTimes/gziphandler doesn't seems to be maintained
anymore and Forgejo already includes
https://github.com/klauspost/compress which provides a maintained and
faster gzip handler fork.
- Enables Jitter to prevent BREACH attacks, as this *seems* to be
possible in the context of Forgejo.
(cherry picked from commit cc2847241d82001babd8d40c87d03169f21c14cd)
(cherry picked from commit 99ba56a8761dd08e08d9499cab2ded1a6b7b970f)
Conflicts:
go.sum
https://codeberg.org/forgejo/forgejo/pulls/1581
(cherry picked from commit 711638193daa2311e2ead6249a47dcec47b4e335)
(cherry picked from commit 9c12a37fde6fa84414bf332ff4a066facdb92d38)
(cherry picked from commit 91191aaaedaf999209695e2c6ca4fb256b396686)
(cherry picked from commit 72be417f844713265a94ced6951f8f4b81d0ab1a)
(cherry picked from commit 98497c84da205ec59079e42274aa61199444f7cd)
(cherry picked from commit fba042adb5c1abcbd8eee6b5a4f735ccb2a5e394)
(cherry picked from commit dd2414f226)
Conflicts:
routers/web/web.go
https://codeberg.org/forgejo/forgejo/issues/2016
Backport #28587, the only conflict is the test file.
The CORS code has been unmaintained for long time, and the behavior is
not correct.
This PR tries to improve it. The key point is written as comment in
code. And add more tests.
Fix#28515Fix#27642Fix#17098
(cherry picked from commit 7a2786ca6cd84633784a2c9986da65a9c4d79c78)
do not reuse the payload of the event that triggered the creation of
the scheduled event. Create a new one instead that contains no other
information than the event name in the action field ("schedule").
(cherry picked from commit 0b40ca1ea5e6b704bcb6c0d370a21f633facc7d6)
handleSchedules() is called every time an event is received and will
check the content of the main branch to (re)create scheduled events.
There is no reason why intput.Event will be relevant when the schedule
workflow runs.
(cherry picked from commit 9a712bb276f2103cd7bccc4bb07b6cc669537e38)
Backport #28491 by @appleboy
- Modify the `Password` field in `CreateUserOption` struct to remove the
`Required` tag
- Update the `v1_json.tmpl` template to include the `email` field and
remove the `password` field
Signed-off-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
(cherry picked from commit 411310d698e86bd639b31f2f5a8b856365b4590f)
Backport #28454 (the only conflict is caused by some comments)
* Close#24483
* Close#28123
* Close#23682
* Close#23149
(cherry picked from commit a3f403f438e7f5b5dca3a5042fae8e68a896b1e7)
Conflicts:
modules/setting/ui.go
trivial context conflict
Backport #28487 by @earl-warren
- When a repository is orphaned and has objects stored in any of the
storages such as repository avatar or attachments the delete function
would error, because the storage module wasn't initalized.
- Add code to initialize the storage module.
Refs: https://codeberg.org/forgejo/forgejo/pulls/1954
Co-authored-by: Earl Warren <109468362+earl-warren@users.noreply.github.com>
Co-authored-by: Gusted <postmaster@gusted.xyz>
(cherry picked from commit 8ee1ed877b0207d9d8733ac32270325c54659909)
It will determine how anchors are created and will break existing
links otherwise.
Adapted from Revert "Make `user-content-* ` consistent with github (#26388)
It shows warnings although the setting is not set, this will surely be
fixed later but there is no sense in spaming the users right now. This
revert can be discarded when another fix lands in v1.21.
su -c "forgejo admin user generate-access-token -u root --raw --scopes 'all,sudo'" git
2023/12/12 15:54:45 .../setting/security.go:166:loadSecurityFrom() [W] Enabling Query API Auth tokens is not recommended. DISABLE_QUERY_AUTH_TOKEN will default to true in gitea 1.23 and will be removed in gitea 1.24.
This reverts commit 0e3a5abb69.
Conflicts:
routers/api/v1/api.go
Backport #28390 by @jackHay22
## Changes
- Add deprecation warning to `Token` and `AccessToken` authentication
methods in swagger.
- Add deprecation warning header to API response. Example:
```
HTTP/1.1 200 OK
...
Warning: token and access_token API authentication is deprecated
...
```
- Add setting `DISABLE_QUERY_AUTH_TOKEN` to reject query string auth
tokens entirely. Default is `false`
## Next steps
- `DISABLE_QUERY_AUTH_TOKEN` should be true in a subsequent release and
the methods should be removed in swagger
- `DISABLE_QUERY_AUTH_TOKEN` should be removed and the implementation of
the auth methods in question should be removed
## Open questions
- Should there be further changes to the swagger documentation?
Deprecation is not yet supported for security definitions (coming in
[OpenAPI Spec version
3.2.0](https://github.com/OAI/OpenAPI-Specification/issues/2506))
- Should the API router logger sanitize urls that use `token` or
`access_token`? (This is obviously an insufficient solution on its own)
Co-authored-by: Jack Hay <jack@allspice.io>
Co-authored-by: delvh <dev.lh@web.de>
(cherry picked from commit f144521aea0d7a08b9bd5f17e49bae4021bd7a45)
Backport #28348 by @AdamMajer
nogogit GetBranchNames() lists branches sorted in reverse commit date
order. On the other hand the gogit implementation doesn't apply any
ordering resulting in unpredictable behaviour. In my case, the unit
tests requiring particular order fail
repo_branch_test.go:24:
Error Trace:
./gitea/modules/git/repo_branch_test.go:24
Error: elements differ
extra elements in list A:
([]interface {}) (len=1) {
(string) (len=6) "master"
}
extra elements in list B:
([]interface {}) (len=1) {
(string) (len=7) "branch1"
}
listA:
([]string) (len=2) {
(string) (len=6) "master",
(string) (len=7) "branch2"
}
listB:
([]string) (len=2) {
(string) (len=7) "branch1",
(string) (len=7) "branch2"
}
Test: TestRepository_GetBranches
To fix this, we sort branches based on their commit date in gogit
implementation.
Fixes: #28318
Co-authored-by: Adam Majer <amajer@suse.de>
(cherry picked from commit 272ae03341561ad51228fc75bd12ca3180504100)
Backport #28373 by @capvor
In the documents, the `[attachment] MAX_SIZE` default value should be 4.
Reference the source code `modules/setting/attachment.go` line 29.
Co-authored-by: capvor <capvor@sina.com>
(cherry picked from commit 8f2805f7574da3382e7e2c5bc45641245e920cbc)
Backport #28356 by @darrinsmart
The summary string ends up in the database, and (at least) MySQL &
PostgreSQL require valid UTF8 strings.
Fixes#28178
Co-authored-by: darrinsmart <darrin@djs.to>
Co-authored-by: Darrin Smart <darrin@filmlight.ltd.uk>
(cherry picked from commit fef34790bb73b12b8b11daa45f17bd30fe30f4f0)
Backport #28306 by @KN4CK3R
Fixes#28280
Reads the `previous` info from the `git blame` output instead of
calculating it afterwards.
Co-authored-by: KN4CK3R <admin@oldschoolhack.me>
(cherry picked from commit e15fe853350020cc9fbaf9f9ae6b39dc88942f39)
Backport #28276
The git command may operate the git directory (add/remove) files in any
time.
So when the code iterates the directory, some files may disappear during
the "walk". All "IsNotExist" errors should be ignored.
(cherry picked from commit 4f5122a7fed227ddcc98b76be8dac3945582f91a)
Backport #28200
gitea doctor failed at checking and fixing 'delete-orphaned-repos',
because table name 'user' needs quoting to be correctly recognized by at
least PostgreSQL.
fixes#28199
(cherry picked from commit 7cae4dfc0048db02bef34ff1b8726e82b052fb85)
Backport #28184Fix#25473
Although there was `m.Post("/login/oauth/access_token", CorsHandler()...`,
it never really worked, because it still lacks the "OPTIONS" handler.
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
(cherry picked from commit 23838c2c2eaf596bffd5331406be99edc264883c)
Backport #28100 by @lng2020
https://github.com/go-gitea/gitea/pull/27946 forgets to change them in
code. Sorry about that.
Co-authored-by: Nanguan Lin <70063547+lng2020@users.noreply.github.com>
(cherry picked from commit 56bedf2bccc7b9a98b94d1d5016231e7b68cd75d)
Backport #28085 by @wxiaoguang
Fix#28083 and fix the tests
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
(cherry picked from commit f7567f798d0d9dd3379051121b8b89abf09f938f)
- Backport https://codeberg.org/forgejo/forgejo/pulls/1742
- While looking trough the logs for unrelated things I noticed errors
for directory size calculations in `pushUpdates` that were being caused
by a race condition in which git was making temporary file,
`filepath.WalkDir` noticed that but by the time the second lstat
came(`info.Info()`) it was already gone and it would error.
- Ignore temporary files created by Git.
- There are other cases but much much more rarer and not trivial to detect.
Examples:
...s/repository/push.go:96:pushUpdates() [E] Failed to update size for repository: updateSize: lstat [...]/objects/info/commit-graphs/tmp_graph_Wcy9kR: no such file or directory
...s/repository/push.go:96:pushUpdates() [E] Failed to update size for repository: updateSize: lstat [...]/packed-refs.lock: no such file or directory
(cherry picked from commit 16ce00772d4bfba929168533ad58c3a618f28353)
(cherry picked from commit 2aebef847ff998b8c2aa3aad12706698cef078c9)
- The current architecture is inherently insecure, because you can
construct the 'secret' cookie value with values that are available in
the database. Thus provides zero protection when a database is
dumped/leaked.
- This patch implements a new architecture that's inspired from: [Paragonie Initiative](https://paragonie.com/blog/2015/04/secure-authentication-php-with-long-term-persistence#secure-remember-me-cookies).
- Integration testing is added to ensure the new mechanism works.
- Removes a setting, because it's not used anymore.
(cherry picked from commit eff097448b1ebd2a280fcdd55d10b1f6081e9ccd)
[GITEA] rework long-term authentication (squash) add migration
Reminder: the migration is run via integration tests as explained
in the commit "[DB] run all Forgejo migrations in integration tests"
(cherry picked from commit 4accf7443c1c59b4d2e7787d6a6c602d725da403)
(cherry picked from commit 99d06e344ebc3b50bafb2ac4473dd95f057d1ddc)
(cherry picked from commit d8bc98a8f021d381bf72790ad246f923ac983ad4)
(cherry picked from commit 6404845df9a63802fff4c5bd6cfe1e390076e7f0)
(cherry picked from commit 72bdd4f3b9f6509d1ff3f10ecb12c621a932ed30)
(cherry picked from commit 4b01bb0ce812b6c59414ff53fed728563d8bc9cc)
(cherry picked from commit c26ac318162b2cad6ff1ae54e2d8f47a4e4fe7c2)