This updates the repo index/file view endpoints so annex files match the way
LFS files are rendered, making annexed files accessible via the web instead of
being black boxes only accessible by git clone.
This mostly just duplicates the existing LFS logic. It doesn't try to combine itself
with the existing logic, to make merging with upstream easier. If upstream ever
decides to accept, I would like to try to merge the redundant logic.
The one bit that doesn't directly copy LFS is my choice to hide annex-symlinks.
LFS files are always _pointer files_ and therefore always render with the "file"
icon and no special label, but annex files come in two flavours: symlinks or
pointer files. I've conflated both kinds to try to give a consistent experience.
The tests in here ensure the correct download link (/media, from the last PR)
renders in both the toolbar and, if a binary file (like most annexed files will be),
in the main pane, but it also adds quite a bit of code to make sure text files
that happen to be annexed are dug out and rendered inline like LFS files are.
This moves the `annexObjectPath()` helper out of the tests and into a
dedicated sub-package as `annex.ContentLocation()`, and expands it with
`.Pointer()` (which validates using `git annex examinekey`),
`.IsAnnexed()` and `.Content()` to make it a more useful module.
The tests retain their own wrapper version of `ContentLocation()`
because I tried to follow close to the API modules/lfs uses, which in
terms of abstract `git.Blob` and `git.TreeEntry` objects, not in terms
of `repoPath string`s which are more convenient for the tests.
This makes HTTP symmetric with SSH clone URLs.
This gives us the fancy feature of _anonymous_ downloads,
so people can access datasets without having to set up an
account or manage ssh keys.
Previously, to access "open access" data shared this way,
users would need to:
1. Create an account on gitea.example.com
2. Create ssh keys
3. Upload ssh keys (and make sure to find and upload the correct file)
4. `git clone git@gitea.example.com:user/dataset.git`
5. `cd dataset`
6. `git annex get`
This cuts that down to just the last three steps:
1. `git clone https://gitea.example.com/user/dataset.git`
2. `cd dataset`
3. `git annex get`
This is significantly simpler for downstream users, especially for those
unfamiliar with the command line.
Unfortunately there's no uploading. While git-annex supports uploading
over HTTP to S3 and some other special remotes, it seems to fail on a
_plain_ HTTP remote. See https://github.com/neuropoly/gitea/issues/7
and https://git-annex.branchable.com/forum/HTTP_uploads/#comment-ce28adc128fdefe4c4c49628174d9b92.
This is not a major loss since no one wants uploading to be anonymous anyway.
To support private repos, I had to hunt down and patch a secret extra security
corner that Gitea only applies to HTTP for some reason (services/auth/basic.go).
This was guided by https://git-annex.branchable.com/tips/setup_a_public_repository_on_a_web_site/
Fixes https://github.com/neuropoly/gitea/issues/3
Co-authored-by: Mathieu Guay-Paquet <mathieu.guaypaquet@polymtl.ca>
Fixes https://github.com/neuropoly/gitea/issues/11
Tests:
* `git annex init`
* `git annex copy --from origin`
* `git annex copy --to origin`
over:
* ssh
for:
* the owner
* a collaborator
* a read-only collaborator
* a stranger
in a
* public repo
* private repo
And then confirms:
* Deletion of the remote repo (to ensure lockdown isn't messing with us: https://git-annex.branchable.com/internals/lockdown/#comment-0cc5225dc5abe8eddeb843bfd2fdc382)
------
To support all this:
* Add util.FileCmp()
* Patch withKeyFile() so it can be nested in other copies of itself
-------
Many thanks to Mathieu for giving style tips and catching several bugs,
including a subtle one in util.filecmp() which neutered it.
Co-authored-by: Mathieu Guay-Paquet <mathieu.guay-paquet@polymtl.ca>
[git-annex](https://git-annex.branchable.com/) is a more complicated cousin to
git-lfs, storing large files in an optional-download side content. Unlike lfs,
it allows mixing and matching storage remotes, so the content remote(s) doesn't
need to be on the same server as the git remote, making it feasible to scatter
a collection across cloud storage, old harddrives, or anywhere else storage can
be scavenged. Since this can get complicated, fast, it has a content-tracking
database (`git annex whereis`) to help find everything later.
The use-case we imagine for including it in Gitea is just the simple case, where
we're primarily emulating git-lfs: each repo has its large content at the same URL.
Our motivation is so we can self-host https://www.datalad.org/ datasets, which
currently are only hostable by fragilely scrounging together cloud storage --
and having to manage all the credentials associated with all the pieces -- or at
https://openneuro.org which is fragile in its own ways.
Supporting git-annex also allows multiple Gitea instance to be annex remotes for
each other, mirroring the content or otherwise collaborating the split up the
hosting costs.
Enabling
--------
TODO
HTTP
----
TODO
Permission Checking
-------------------
This tweaks the API in routers/private/serv.go to expose the calling user's
computed permission, instead of just returning HTTP 403.
This doesn't fit in super well. It's the opposite from how the git-lfs support is
done, where there's a complete list of possible subcommands and their matching
permission levels, and then the API compares the requested with the actual level
and returns HTTP 403 if the check fails.
But it's necessary. The main git-annex verbs, 'git-annex-shell configlist' and
'git-annex-shell p2pstdio' are both either read-only or read-write operations,
depending on the state on disk on either end of the connection and what the user
asked it to ask for, with no way to know before git-annex examines the situation.
So tell the level via GIT_ANNEX_READONLY and trust it to handle itself.
In the older Gogs version, the permission was directly read in cmd/serv.go:
```
mode, err = db.UserAccessMode(user.ID, repo)
```
- 966e925cf3/internal/cmd/serv.go (L334)
but in Gitea permission enforcement has been centralized in the API layer.
(perhaps so the cmd layer can avoid making direct DB connections?)
Deletion
--------
git-annex has this "lockdown" feature where it tries
really quite very hard to prevent you deleting its
data, to the point that even an rm -rf won't do it:
each file in annex/objects/ is nested inside a
folder with read-only permissions.
The recommended workaround is to run chmod -R +w when
you're sure you actually want to delete a repo. See
https://git-annex.branchable.com/internals/lockdown
So we edit util.RemoveAll() to do just that, so now
it's `chmod -R +w && rm -rf` instead of just `rm -rf`.
Backport #29295 by @lunny
Fix#28843
This PR will bypass the pushUpdateTag to database failure when
syncAllTags. An error log will be recorded.
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
(cherry picked from commit b78f5fc60f510a58d58535af77c5b424a8b5a660)
- Backport of #2403
- In markdown, links are proccessed to be made absolute against the
relevant base in that context. Such that `./src` will be transformed
into `http://example.com/owner/repo/src/branch/main/src`.
- Don't try to make the link absolute if the link has a schema that's
defined in `[markdown].CUSTOM_URL_SCHEMES`, because they can't be made
absolute and doing so could lead to problems (see test case, double
slash was transformed to single slash).
- Adds unit test.
- Resolves https://codeberg.org/Codeberg/Community/issues/1489
(cherry picked from commit 65b9a959b8)
- Backport of #2385
- For regular non-image nonvideo links, they should be made relative,
this was done against `r.Ctx.Links.Base`, but since 637451a45e, that
should instead be done by `SrcLink()` if there's branch information set
in the context, because branch and treepath information are no longer
set in `r.Ctx.Links.Base`.
- This is consistent with how #2166 _fixed_ relative links.
- Media is not affected, `TestRender_Media` test doesn't fail.
- Adds unit tests.
- Ref https://codeberg.org/Codeberg/Community/issues/1485
(cherry picked from commit a2442793d2)
- Backport #2335
- In Git version v2.43.1, the behavior of `GIT_FLUSH` was accidentially
flipped. This causes Forgejo to hang on the `check-attr` command,
because no output was being flushed.
- Workaround this by detecting if Git v2.43.1 is used and set
`GIT_FLUSH=0` thus getting the correct behavior.
- Ref: https://lore.kernel.org/git/CABn0oJvg3M_kBW-u=j3QhKnO=6QOzk-YFTgonYw_UvFS1NTX4g@mail.gmail.com/
- Resolves#2333.
(cherry picked from commit f68f880974)
Backport #29062 by @inferno-umar
Fix for gitea putting everything into one request without batching and
sending it to Elasticsearch for indexing as issued in #28117
This issue occured in large repositories while Gitea tries to
index the code using ElasticSearch.
Co-authored-by: dark-angel <70754989+inferno-umar@users.noreply.github.com>
(cherry picked from commit f0d34cd3b97dd2c9f29fc401ec58ea0661b7ca7d)
- Backport of #2276
- It's possible that the description of an `Regularlink` is `Text` and not
another `Regularlink`. Therefor if it's `Text`, convert it to an
`Regularlink` trough the 'old' behavior (pass it trough `org.String` and
trim `file:` prefix).
- Adds unit tests.
- Resolves https://codeberg.org/Codeberg/Community/issues/1430
(cherry picked from commit 385fc6ee6be25859066a716aa15be09991e2d33c)
Backport #28893
(cherry picked from commit c33886b7100f1b92c763435e59af262879817f76)
Conflicts:
go.sum
trivial conflict because of 120294c44e * [GITEA] Use maintained gziphandler
Backport #28817 by @lunny
Fix#22066
# Purpose
This PR fix the releases will be deleted when mirror repository sync the
tags.
# The problem
In the previous implementation of #19125. All releases record in
databases of one mirror repository will be deleted before sync.
Ref:
https://github.com/go-gitea/gitea/pull/19125/files#diff-2aa04998a791c30e5a02b49a97c07fcd93d50e8b31640ce2ddb1afeebf605d02R481
# The Pros
This PR introduced a new method which will load all releases from
databases and all tags on git data into memory. And detect which tags
needs to be inserted, which tags need to be updated or deleted. Only
tags releases(IsTag=true) which are not included in git data will be
deleted, only tags which sha1 changed will be updated. So it will not
delete any real releases include drafts.
# The Cons
The drawback is the memory usage will be higher than before if there are
many tags on this repository. This PR defined a special release struct
to reduce columns loaded from database to memory.
---------
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
(cherry picked from commit 2048363f9ed6de485a81afa980ed90bf916bb3b8)
Backport #28935 by @silverwind
The `ToUTF8*` functions were stripping BOM, while BOM is actually valid
in UTF8, so the stripping must be optional depending on use case. This
does:
- Add a options struct to all `ToUTF8*` functions, that by default will
strip BOM to preserve existing behaviour
- Remove `ToUTF8` function, it was dead code
- Rename `ToUTF8WithErr` to `ToUTF8`
- Preserve BOM in Monaco Editor
- Remove a unnecessary newline in the textarea value. Browsers did
ignore it, it seems but it's better not to rely on this behaviour.
Fixes: https://github.com/go-gitea/gitea/issues/28743
Related: https://github.com/go-gitea/gitea/issues/6716 which seems to
have once introduced a mechanism that strips and re-adds the BOM, but
from what I can tell, this mechanism was removed at some point after
that PR.
Co-authored-by: silverwind <me@silverwind.io>
(cherry picked from commit b8e6cffd317401d980600e339eb21b15b9bc64c1)
Backport #28877 by @KN4CK3R
Fixes#28875
If `RequireSignInView` is enabled, the ghost user has no access rights.
Co-authored-by: KN4CK3R <admin@oldschoolhack.me>
(cherry picked from commit b7c944b9e4e9f847719fbce421b2f4fee7281187)
Backport #28848 by @brechtvl
When LFS hooks are present in gitea-repositories, operations like git
push for creating a pull request fail. These repositories are not meant
to include LFS files or git push them, that is handled separately. And
so they should not have LFS hooks.
Installing git-lfs on some systems (like Debian Linux) will
automatically set up /etc/gitconfig to create LFS hooks in repositories.
For most git commands in Gitea this is not a problem, either because
they run on a temporary clone or the git command does not create LFS
hooks.
But one case where this happens is git archive for creating repository
archives. To fix that, add a GIT_CONFIG_NOSYSTEM=1 to disable using the
system configuration for that command.
According to a comment, GIT_CONFIG_NOSYSTEM is not used for all git
commands because the system configuration can be intentionally set up
for Gitea to use.
Resolves#19810, #21148
Co-authored-by: Brecht Van Lommel <brecht@blender.org>
(cherry picked from commit 0d50f274698e7508ad15f61b1eca41db677b762e)
Backport #28824 by @lunny
`checkInit` has been invoked in `InitSimple`. So it's unnecessary to
invoke it twice in `InitFull`.
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
(cherry picked from commit 8b5c9186a54b2e177fc58295ecd83bfc4700cbfd)
- Backport of #2166
- Relative links were not properly being rendered, because the links
were being made absolute against the repository URL instead of
repository URL + /src/branch, which leads to incorrect links.
- Restore the 'old' behaviour. When there's branch information, that
should be used as base for links.
- Adjusts the test cases.
- Regression of 637451a45e
- Resolves https://codeberg.org/Codeberg/Community/issues/1411
(cherry picked from commit 0e9d52e2918004ac183910c712e9fe486e139e05)
Backport #28797 by @lunny
Fix#28694
Generally, `refname:short` should be equal to `refname:lstrip=2` except
`core.warnAmbiguousRefs is used to select the strict abbreviation mode.`
ref:
https://git-scm.com/docs/git-for-each-ref#Documentation/git-for-each-ref.txt-refname
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
(cherry picked from commit 4746291b0863eaf9c629c944bd48295ca2f69d2f)
Backport #26745Fixes#26548
This PR refactors the rendering of markup links. The old code uses
`strings.Replace` to change some urls while the new code uses more
context to decide which link should be generated.
The added tests should ensure the same output for the old and new
behaviour (besides the bug).
We may need to refactor the rendering a bit more to make it clear how
the different helper methods render the input string. There are lots of
options (resolve links / images / mentions / git hashes / emojis / ...)
but you don't really know what helper uses which options. For example,
we currently support images in the user description which should not be
allowed I think:
<details>
<summary>Profile</summary>
https://try.gitea.io/KN4CK3R

</details>
(cherry picked from commit 022552d5b6adc792d3cd16df7de6e52cb7b41a72)
Backport #28796 by @wxiaoguang
`resp != nil` doesn't mean the request really succeeded. Add a comment
for requestJSONResp to clarify the behavior.
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
(cherry picked from commit cbf366643bfbc89a1fbc8a149e31abf19c60d6a9)
Backport #28708 by wxiaoguang
Regression of #27723Fix#28705
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
(cherry picked from commit 7f833d8f71b71b0e983ed979c5ad8088aec7fb7d)
- The current architecture is inherently insecure, because you can
construct the 'secret' cookie value with values that are available in
the database. Thus provides zero protection when a database is
dumped/leaked.
- This patch implements a new architecture that's inspired from: [Paragonie Initiative](https://paragonie.com/blog/2015/04/secure-authentication-php-with-long-term-persistence#secure-remember-me-cookies).
- Integration testing is added to ensure the new mechanism works.
- Removes a setting, because it's not used anymore.
(cherry picked from commit eff097448b1ebd2a280fcdd55d10b1f6081e9ccd)
[GITEA] rework long-term authentication (squash) add migration
Reminder: the migration is run via integration tests as explained
in the commit "[DB] run all Forgejo migrations in integration tests"
(cherry picked from commit 4accf7443c1c59b4d2e7787d6a6c602d725da403)
(cherry picked from commit 99d06e344ebc3b50bafb2ac4473dd95f057d1ddc)
(cherry picked from commit d8bc98a8f021d381bf72790ad246f923ac983ad4)
(cherry picked from commit 6404845df9a63802fff4c5bd6cfe1e390076e7f0)
(cherry picked from commit 72bdd4f3b9f6509d1ff3f10ecb12c621a932ed30)
(cherry picked from commit 4b01bb0ce812b6c59414ff53fed728563d8bc9cc)
(cherry picked from commit c26ac318162b2cad6ff1ae54e2d8f47a4e4fe7c2)
(cherry picked from commit 8d2dab94a6)
Conflicts:
routers/web/auth/auth.go
https://codeberg.org/forgejo/forgejo/issues/2158
- https://github.com/NYTimes/gziphandler doesn't seems to be maintained
anymore and Forgejo already includes
https://github.com/klauspost/compress which provides a maintained and
faster gzip handler fork.
- Enables Jitter to prevent BREACH attacks, as this *seems* to be
possible in the context of Forgejo.
(cherry picked from commit cc2847241d82001babd8d40c87d03169f21c14cd)
(cherry picked from commit 99ba56a8761dd08e08d9499cab2ded1a6b7b970f)
Conflicts:
go.sum
https://codeberg.org/forgejo/forgejo/pulls/1581
(cherry picked from commit 711638193daa2311e2ead6249a47dcec47b4e335)
(cherry picked from commit 9c12a37fde6fa84414bf332ff4a066facdb92d38)
(cherry picked from commit 91191aaaedaf999209695e2c6ca4fb256b396686)
(cherry picked from commit 72be417f844713265a94ced6951f8f4b81d0ab1a)
(cherry picked from commit 98497c84da205ec59079e42274aa61199444f7cd)
(cherry picked from commit fba042adb5c1abcbd8eee6b5a4f735ccb2a5e394)
(cherry picked from commit dd2414f226)
Conflicts:
routers/web/web.go
https://codeberg.org/forgejo/forgejo/issues/2016
Backport #28587, the only conflict is the test file.
The CORS code has been unmaintained for long time, and the behavior is
not correct.
This PR tries to improve it. The key point is written as comment in
code. And add more tests.
Fix#28515Fix#27642Fix#17098
(cherry picked from commit 7a2786ca6cd84633784a2c9986da65a9c4d79c78)
do not reuse the payload of the event that triggered the creation of
the scheduled event. Create a new one instead that contains no other
information than the event name in the action field ("schedule").
(cherry picked from commit 0b40ca1ea5e6b704bcb6c0d370a21f633facc7d6)
handleSchedules() is called every time an event is received and will
check the content of the main branch to (re)create scheduled events.
There is no reason why intput.Event will be relevant when the schedule
workflow runs.
(cherry picked from commit 9a712bb276f2103cd7bccc4bb07b6cc669537e38)
Backport #28491 by @appleboy
- Modify the `Password` field in `CreateUserOption` struct to remove the
`Required` tag
- Update the `v1_json.tmpl` template to include the `email` field and
remove the `password` field
Signed-off-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Co-authored-by: Bo-Yi Wu <appleboy.tw@gmail.com>
(cherry picked from commit 411310d698e86bd639b31f2f5a8b856365b4590f)
Backport #28454 (the only conflict is caused by some comments)
* Close#24483
* Close#28123
* Close#23682
* Close#23149
(cherry picked from commit a3f403f438e7f5b5dca3a5042fae8e68a896b1e7)
Conflicts:
modules/setting/ui.go
trivial context conflict
Backport #28487 by @earl-warren
- When a repository is orphaned and has objects stored in any of the
storages such as repository avatar or attachments the delete function
would error, because the storage module wasn't initalized.
- Add code to initialize the storage module.
Refs: https://codeberg.org/forgejo/forgejo/pulls/1954
Co-authored-by: Earl Warren <109468362+earl-warren@users.noreply.github.com>
Co-authored-by: Gusted <postmaster@gusted.xyz>
(cherry picked from commit 8ee1ed877b0207d9d8733ac32270325c54659909)
It will determine how anchors are created and will break existing
links otherwise.
Adapted from Revert "Make `user-content-* ` consistent with github (#26388)
It shows warnings although the setting is not set, this will surely be
fixed later but there is no sense in spaming the users right now. This
revert can be discarded when another fix lands in v1.21.
su -c "forgejo admin user generate-access-token -u root --raw --scopes 'all,sudo'" git
2023/12/12 15:54:45 .../setting/security.go:166:loadSecurityFrom() [W] Enabling Query API Auth tokens is not recommended. DISABLE_QUERY_AUTH_TOKEN will default to true in gitea 1.23 and will be removed in gitea 1.24.
This reverts commit 0e3a5abb69.
Conflicts:
routers/api/v1/api.go
Backport #28390 by @jackHay22
## Changes
- Add deprecation warning to `Token` and `AccessToken` authentication
methods in swagger.
- Add deprecation warning header to API response. Example:
```
HTTP/1.1 200 OK
...
Warning: token and access_token API authentication is deprecated
...
```
- Add setting `DISABLE_QUERY_AUTH_TOKEN` to reject query string auth
tokens entirely. Default is `false`
## Next steps
- `DISABLE_QUERY_AUTH_TOKEN` should be true in a subsequent release and
the methods should be removed in swagger
- `DISABLE_QUERY_AUTH_TOKEN` should be removed and the implementation of
the auth methods in question should be removed
## Open questions
- Should there be further changes to the swagger documentation?
Deprecation is not yet supported for security definitions (coming in
[OpenAPI Spec version
3.2.0](https://github.com/OAI/OpenAPI-Specification/issues/2506))
- Should the API router logger sanitize urls that use `token` or
`access_token`? (This is obviously an insufficient solution on its own)
Co-authored-by: Jack Hay <jack@allspice.io>
Co-authored-by: delvh <dev.lh@web.de>
(cherry picked from commit f144521aea0d7a08b9bd5f17e49bae4021bd7a45)