127 commits

Author SHA1 Message Date
Eric Mc Sween
aa367bcd1d Merge pull request #24897 from overleaf/em-chunks-concurrency
Concurrency handling for history chunks with Mongo backend

GitOrigin-RevId: 30abe11237c80e7803c8934a20a57a7223afa85a
2025-04-17 08:05:36 +00:00
Brian Gough
cedc96bdd7 Merge pull request #24906 from overleaf/bg-history-redis-read-cache
implement read cache for history-v1 chunks

GitOrigin-RevId: 128de7e9380fd489f68d5045d3333a27018845c2
2025-04-16 08:06:18 +00:00
Brian Gough
62c8af2a93 Merge pull request #24856 from overleaf/bg-history-redis-buffer-tweaks
fix error logging for chunk cache mismatches

GitOrigin-RevId: 85344c4025fdaa6ee916c5438ff38c7c49f4bce3
2025-04-15 08:06:27 +00:00
Brian Gough
835e14b8b2 Merge pull request #24768 from overleaf/bg-history-redis-buffer
test redis caching when loading latest chunk in history-v1

GitOrigin-RevId: f0ee09e5e9e1d7605e228913cb8539be4134e1f7
2025-04-15 08:05:03 +00:00
Brian Gough
d85dbe429d Merge pull request #24745 from overleaf/bg-history-use-consistent-import-for-chunk-store
use consistent import for chunk_store

GitOrigin-RevId: 427b148c53c9d0913b2cdfdc634273a1d8ece060
2025-04-10 08:05:31 +00:00
Brian Gough
f08532dfb0 Merge pull request #24637 from overleaf/bg-history-backup-uninitialised-projects
backup uninitialised projects

GitOrigin-RevId: 9310ef9f803decffbd674024a1ffd33d1960a2c4
2025-04-08 08:05:54 +00:00
Andrew Rumble
dabf610764 Extract getEndDateForRPO method to utils
This will allow sharing with other functionality.

GitOrigin-RevId: a6e11447180511cc3741ca03f4996ef7ceb45ea5
2025-03-27 14:16:28 +00:00
Eric Mc Sween
0e9c310d1d Merge pull request #24390 from overleaf/em-enforce-content-hash-validation
Enforce content hash validation in history

GitOrigin-RevId: 90de21ea86ddc6548001059c41139a2af5b27060
2025-03-24 10:50:01 +00:00
Brian Gough
3f7c88108c Merge pull request #24275 from overleaf/bg-fix-pending-change-timestamp
fix pending change timestamp

GitOrigin-RevId: 9297a4b57ea718e6a2e1ca62388919c62911af6c
2025-03-18 09:05:08 +00:00
Andrew Rumble
9d72eeeeac Add new strategy to verify_sampled_projects
GitOrigin-RevId: d967da41250bb5945d5b8668b212d4a61b4f9d69
2025-03-18 09:04:50 +00:00
Andrew Rumble
273ae4aecd Split healthCheck out into separate module
GitOrigin-RevId: 847d00b696fe6d82f4bd5fea8f9130437c68e7b2
2025-03-13 09:04:47 +00:00
Andrew Rumble
b5f8bfa28e Switch health check to use projects instead of blobs
Co-authored-by: Jakob Ackermann <jakob.ackermann@overleaf.com>
GitOrigin-RevId: db1a1c8ce5968e558b0754e5e0da50af89fd80db
2025-03-13 09:04:43 +00:00
Andrew Rumble
19eefebe95 Revert "Switch health check to use projects instead of blobs"
This reverts commit c318b70397ed5e2fcbb07fa019412b56844260ef.

GitOrigin-RevId: 087ae9d21be83bf3dae47c4e6d27eb4e74f387df
2025-03-12 09:06:34 +00:00
Andrew Rumble
087a9daf34 Revert "Split healthCheck out into separate module"
This reverts commit 96061812977d5c854e494cd44163b16a96722b17.

GitOrigin-RevId: f30a185b65a4f1346ed13fa0c6e9ea0852d44335
2025-03-12 09:06:30 +00:00
Andrew Rumble
a7be1f3430 Split healthCheck out into separate module
GitOrigin-RevId: 96061812977d5c854e494cd44163b16a96722b17
2025-03-12 09:06:22 +00:00
Andrew Rumble
c373db4f86 Switch health check to use projects instead of blobs
Co-authored-by: Jakob Ackermann <jakob.ackermann@overleaf.com>
GitOrigin-RevId: c318b70397ed5e2fcbb07fa019412b56844260ef
2025-03-12 09:06:18 +00:00
Andrew Rumble
061d67ee4b Emit more specific errors from backupVerifier
GitOrigin-RevId: 99475608f096be3e35fbaaf1825b99d145ea86f3
2025-03-12 09:05:31 +00:00
Andrew Rumble
36056e75d7 Improve chunk loading in backupVerifier
Brings the process closer to history_store.

We can't use the backup history_store because the keys are generated
differently for chunks than the standard history_store way of doing it.

GitOrigin-RevId: 07adfc0531f6ec0f38bb70ea0fe8ae0d41f508cc
2025-03-12 09:05:26 +00:00
Brian Gough
c233243948 Merge pull request #24200 from overleaf/bg-backup-queue-pending-jobs
fix backup worker and backup scheduler to handle pending projects

GitOrigin-RevId: a97e011615666b3ae2b8fafe26a96d41b3609edd
2025-03-11 09:06:05 +00:00
Andrew Rumble
441c7a89a7 Merge pull request #24204 from overleaf/ar-jpa-add-chunk-verification
[history-v1] add chunk verification

GitOrigin-RevId: 7208ad20872386813bb1c6946283afddb5e8b1cf
2025-03-11 09:05:57 +00:00
Brian Gough
893294e6b8 Merge pull request #24069 from overleaf/bg-backup-errors
more tweaks for backup errors

GitOrigin-RevId: 0f7c7bb5004923c3c22c6e3471bb7152cc3e05e2
2025-03-05 09:05:50 +00:00
Brian Gough
981e91f012 Merge pull request #21763 from overleaf/bg-backup-script
initial script for running backups

GitOrigin-RevId: d22c373de30738d8080d40dce10790f0bdcb9f51
2025-02-24 09:04:32 +00:00
Brian Gough
d2738fda73 Merge pull request #23565 from overleaf/bg-fix-history-metadata-in-projects-collection
fix history metadata in projects collection

GitOrigin-RevId: 18c821ef5966a8470b24dfa60313b09bdda9707d
2025-02-14 09:03:33 +00:00
Jakob Ackermann
64e8d2b8b3 [history-v1] fix backup deletion for postgres projects (#23466)
* [history-v1] fix backup deletion for postgres projects

* [history-v1] convert historyId to string on assignment

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

---------

Co-authored-by: Brian Gough <brian.gough@overleaf.com>
GitOrigin-RevId: 5e1033972745a9b72606638f56ebf2147406cc39
2025-02-10 09:05:06 +00:00
Eric Mc Sween
9893f18ca5 Merge pull request #23447 from overleaf/em-reduce-validation-error-logs
Only log validation errors once per flush

GitOrigin-RevId: ee3f656c4c7c09fd7f3ff2462278c9aef81b9bb5
2025-02-07 09:06:10 +00:00
Eric Mc Sween
3c1f20a6d1 Merge pull request #23433 from overleaf/em-do-not-store-content-hashes
Do not store content hashes in chunks

GitOrigin-RevId: 65a255b92f9c4e216ad5a1fb5fb3fa5a6b9158c4
2025-02-07 09:06:03 +00:00
Eric Mc Sween
ce4c8a4e47 Merge pull request #23398 from overleaf/em-log-doc-hash-mismatches
Validate content hashes in history (log only)

GitOrigin-RevId: ed772fc4e4d0aa9e980f9693a759647bd937e13a
2025-02-07 09:05:59 +00:00
Eric Mc Sween
e145667a81 Merge pull request #23282 from overleaf/em-async-await-persist-changes
Convert the history changes import code to async/await

GitOrigin-RevId: 6421fcaaf3bac69a3404754f935b4902979b4689
2025-02-07 09:05:44 +00:00
Jakob Ackermann
26321c5dba [history-v1] block deletion of bucket prefix that is not a valid project prefix (#22889)
Co-authored-by: Brian Gough <brian.gough@overleaf.com>
GitOrigin-RevId: a3aff76a2e299b2e2ed030e34da0631fce51ad4b
2025-02-07 09:05:01 +00:00
Brian Gough
b2ff14f669 Merge pull request #23375 from overleaf/bg-add-id-checks-for-chunks
guard against non-postgres projectIds

GitOrigin-RevId: eab2024e4e893591f4b1c6a507b26d935273ae5f
2025-02-05 09:06:54 +00:00
Brian Gough
098d91f0bb Merge pull request #23345 from overleaf/bg-write-latest-history-version-to-project
update project entry with history metadata on chunk creation

GitOrigin-RevId: dd19898f3d16e2e3360ff1bcccbf79f7dd27addb
2025-02-05 09:05:28 +00:00
Jakob Ackermann
3a4c5a0d0f [history-v1] add readOnly lookup for raw chunks (#23318)
* [history-v1] add readOnly lookup for raw chunks

Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>

* [history-v1] reduce min poolsize for readOnly pool to 0

Co-authored-by: Brian Gough <brian.gough@overleaf.com>

---------

Co-authored-by: Eric Mc Sween <eric.mcsween@overleaf.com>
Co-authored-by: Brian Gough <brian.gough@overleaf.com>
GitOrigin-RevId: a711c4ee4f3ea3775bd090e620d1ef52689fa1f4
2025-02-04 09:04:52 +00:00
Jakob Ackermann
c6c623da78 [project-history] script for fixing-up files/metadata with bulk resync (#23184)
* [history-v1] add cheap endpoint for checking time of last history write

The /raw endpoint skips the GCS lookup for the chunk.

* [project-history] script for fixing-up files/metadata with bulk resync

* [project-history] upgrade structure only resync when full sync is needed

* [project-history] start resync and process resync updates under lock

* [project-history] stop retrying during graceful shutdown

GitOrigin-RevId: 73184d5786e1d40f5b7e21f387fc37cf43f0ac2d
2025-02-03 09:05:43 +00:00
Jakob Ackermann
70fd6cacbc Merge pull request #22711 from overleaf/jpa-gzip
[history-v1] compress blobs before sending them to AWS

GitOrigin-RevId: 1ca1dda6f36738fbabbf00fdab62b86230b9e4f9
2025-01-08 09:04:57 +00:00
Andrew Rumble
2262d03a21 Merge pull request #22648 from overleaf/ar-store-backup-blobs-by-history-id
Use historyId when constructing path for backup

GitOrigin-RevId: 954576b509d5e78511b5008fb7d74e0bc5fa45fd
2024-12-24 09:04:52 +00:00
Andrew Rumble
31e8a908ee Merge pull request #22334 from overleaf/ar-guard-against-integer-like-strings-when-working-with-postgres
[history-v1] Guard against non-postgres projectIds

GitOrigin-RevId: 5bf75c67424297f52f2abd9d0f0f14a0f79f8921
2024-12-13 09:04:59 +00:00
Andrew Rumble
a92a37bc3c Merge pull request #22466 from overleaf/ar-backup-files-when-inserting
[history-v1] backup files when inserting

GitOrigin-RevId: e636bce178604978c6d41c083bf671795d20b5a1
2024-12-13 09:04:54 +00:00
Andrew Rumble
52254b5695 Merge pull request #22459 from overleaf/revert-22392-ar-backup-files-when-inserting
Revert "[history-v1] backup files when inserting"

GitOrigin-RevId: f21d49dbc8909ab93bdde78c321672124bb13697
2024-12-12 09:05:27 +00:00
Andrew Rumble
6404e3047d Merge pull request #22392 from overleaf/ar-backup-files-when-inserting
[history-v1] backup files when inserting

GitOrigin-RevId: 1649b2828899d67ee37c0ac331917c6d5424c803
2024-12-12 09:05:11 +00:00
Brian Gough
104ae341b1 Merge pull request #22327 from overleaf/bg-fix-copy-blob
fix bug that prevents copying blobs between different backends in history-v1

GitOrigin-RevId: 41140ad42d0d7c1beda83e588649127c22603dec
2024-12-05 09:05:22 +00:00
Jakob Ackermann
d19c5e236f Merge pull request #22208 from overleaf/jpa-clsi-hash
[misc] clsi: read files from history-v1 with fallback to filestore

GitOrigin-RevId: c54bb128780198c14e7a63818f39fad62ce65d4e
2024-11-29 09:05:39 +00:00
Jakob Ackermann
ce0d5fd383 Merge pull request #22177 from overleaf/jpa-file-view-hash-1
[web] migrate file-view to download from history-v1 (via web) 1/2

GitOrigin-RevId: b787e90c57af5e2704b06ba63502aa6fc09ea1df
2024-11-28 09:06:33 +00:00
Brian Gough
be90a3b2bb Merge pull request #22170 from overleaf/bg-history-v1-copy-blob
add copyBlob support to history-v1

GitOrigin-RevId: 797ea66c37ca938fc906c4dff7bb1c8bf14c031e
2024-11-28 09:05:30 +00:00
Jakob Ackermann
3d7254b419 Merge pull request #22153 from overleaf/jpa-backup-verifier-minimal
[history-v1] backup-verifier-app: initial revision

GitOrigin-RevId: 922c9f94cb7ca7c129e38fd6961d42bdff819cd8
2024-11-27 09:04:55 +00:00
Jakob Ackermann
73aea01f37 Merge pull request #21996 from overleaf/jpa-stream-pg-result
[history-v1] postgres: getProjectBlobsBatch: stream records

GitOrigin-RevId: 94ed6dfc4a263fd9369cd380e6cc25c7bbf6decc
2024-11-21 09:04:38 +00:00
Jakob Ackermann
0253130c36 Merge pull request #21972 from overleaf/jpa-get-project-blobs-batch
[history-v1] implement getProjectBlobsBatch

GitOrigin-RevId: f03dcc690ef63f72400ccf001c6e497bd4fbe790
2024-11-20 09:05:34 +00:00
Jakob Ackermann
24f2388aa2 Merge pull request #21948 from overleaf/bg-jpa-back-fill-project-blobs
[history-v1] back_fill_file_hash: process blobs

GitOrigin-RevId: e54d0f8ab537ce43a12f9c972ba2ee82836073c8
2024-11-20 09:05:04 +00:00
Jakob Ackermann
fb36fff63d Merge pull request #21931 from overleaf/bg-get-all-blobs-for-project
add getProjectBlobs method to retrieve metadata for all blobs in a project

GitOrigin-RevId: 38f504a4fb56cd8ef8beaff1d8917ead26e85f5a
2024-11-20 09:04:56 +00:00
Jakob Ackermann
27076c50cc Merge pull request #21670 from overleaf/jpa-mongo-backend-types
[history-v1] add types to mongo BlobStore backend

GitOrigin-RevId: 7d91074eaa781904f7f3b56390aacee1800a7f67
2024-11-19 09:05:23 +00:00
Jakob Ackermann
ca0a46b5bb Merge pull request #21928 from overleaf/jpa-handle-already-hard-deleted
[history-v1] backup-deletion-app: use deletedProjectOverleafHistoryId

GitOrigin-RevId: 169ba0fba71c42b0415e5fa40424547b054dd5b0
2024-11-18 09:06:13 +00:00