Database Schema
This section provides an overview of the Mega project database schema.
1.Basic Design of Mega Monorepo
The purpose of this document is to refactor the current storage design of Mega, enabling it to effectively manage project monorepo while remaining compatible with the Git protocol.
Mega's storage structure is mainly divided into the following parts:
Mega Directory:
Similar to the 'tree' in Git, Mega maintains relationships between files and file names. In the database, Mega independently manages directory information for the current version.
Import directory
- The primary purpose of importing directories is to synchronize the original Git repository into the Mega directory. Projects within the import directory are maintained in a read-only state, preserving the original commit information.
- Projects pushed to the import directory can have multiple commits.
- Projects in the import directory can be quickly transformed into the Mega directory.
- Import directories can be configured in the configuration file.
- Once a directory is initialized as an import directory, it cannot be changed back to a regular directory.
2. Database Design
Table Overall
Table Name | Description | Mega Push | Mega Pull | Git Push | Git Repo |
---|---|---|---|---|---|
mega_commit | Store all commit objects related with mega directory, have mr status | ✓ | |||
mega_tree | Store all tree objects related with mega directory, together with mega_commit to find history directory | ✓ | |||
mega_blob | Store all blob objects under mega directory. | ✓ | ✓ | ||
mega_tag | Store all annotated tag with mega directory. | ✓ | ✓ | ||
mega_mr | Merge request related to mega commits. | ✓ | |||
mega_mr_conv | MR conversation list | ✓ | |||
mega_mr_comment | MR Comment | ✓ | |||
mega_issue | Manage mega's issue. | ||||
mega_refs | This table maintains refs information corresponding to each directory of mega | ✓ | |||
git_repo | Maintain Relations between import_repo and repo_path. | ✓ | ✓ | ||
import_refs | Obtains the latest commit_id through repo_id and ref_name, also storing the repo lightweight tags. | ✓ | ✓ | ||
git_commit | Store all parsed commit objects related with repo. | ✓ | ✓ | ||
git_tree | Store all parsed tree objects related with repo. | ✓ | ✓ | ||
git_blob | Store all parsed blob objects related with repo. | ✓ | ✓ | ||
git_tag | Store all annotated tag related with repo. | ✓ | ✓ | ||
raw_blob | Store all raw objects with both git repo and mega directory. | ✓ | ✓ | ✓ | ✓ |
git_pr | Pull request sync from third parties like GitHub. | ||||
git_issue | Issues sync from third parties like GitHub. | ||||
lfs_objects | Store objects related to LFS protocol. | ||||
lfs_locks | Store locks for lfs files. |
3. Table Schema
mega_commit
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the commit |
commit_id | VARCHAR(40) | NOT NULL, UNIQUE | Git commit ID |
tree | VARCHAR(40) | NOT NULL | Tree ID associated with the commit |
parents_id | JSON | NOT NULL | Parent commit IDs (JSON format) |
author | TEXT | Commit author | |
committer | TEXT | Commit committer | |
content | TEXT | Commit message or content | |
created_at | TIMESTAMP | NOT NULL | Timestamp when the commit was created |
Indexes
idx_mc_git_id
: Index oncommit_id
.
mega_tree
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the tree |
tree_id | VARCHAR(40) | NOT NULL | Git tree ID |
sub_trees | BYTEA | NOT NULL | Serialized data of subtrees, contanins name, sha1, mode, repo_id |
size | INT | NOT NULL | Size of the tree |
commit_id | VARCHAR(40) | NOT NULL | Commit ID associated with the tree |
created_at | TIMESTAMP | NOT NULL | Timestamp when the tree was created |
Indexes
idx_mt_git_id
: Index ontree_id
.
mega_blob
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the blob |
blob_id | VARCHAR(40) | NOT NULL | Git blob ID |
commit_id | VARCHAR(40) | NOT NULL | Commit ID associated with the blob |
name | TEXT | NOT NULL | Name of the file or blob |
size | INT | NOT NULL | Size of the blob |
created_at | TIMESTAMP | NOT NULL | Timestamp when the blob was created |
Indexes
idx_mb_git_id
: Index onblob_id
.
mega_tag
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the tag |
tag_id | VARCHAR(40) | NOT NULL, UNIQUE | Git tag ID |
object_id | VARCHAR(40) | NOT NULL | ID of the tagged object |
object_type | VARCHAR(20) | NOT NULL | Type of the tagged object (e.g., commit, tree, blob) |
tag_name | TEXT | NOT NULL | Name of the tag |
tagger | TEXT | NOT NULL | Person who created the tag |
message | TEXT | NOT NULL | Message associated with the tag |
created_at | TIMESTAMP | NOT NULL | Timestamp when the tag was created |
Indexes
uniq_mtag_tag_id
: Unique constraint ontag_id
.
mega_mr
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the merge request (MR) |
link | VARCHAR(40) | NOT NULL | A MR identifier with a length of 6-8 characters. |
title | TEXT | NOT NULL | Title of the MR |
merge_date | TIMESTAMP | Date when the MR was merged | |
status | VARCHAR(20) | NOT NULL | Status of the MR (e.g., open, closed) |
path | TEXT | NOT NULL | Path associated with the MR |
from_hash | VARCHAR(40) | NOT NULL | Source commit hash for the MR |
to_hash | VARCHAR(40) | NOT NULL | Target commit hash for the MR |
created_at | TIMESTAMP | NOT NULL | Timestamp when the MR was created |
updated_at | TIMESTAMP | NOT NULL | Timestamp when the MR was last updated |
Indexes
idx_mr_path
: Index onpath
.
mega_conversation
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the conversation |
link | VARCHAR(20) | NOT NULL | Link identifier associated with the MR |
user_id | BIGINT | NOT NULL | User ID of the commenter |
conv_type | VARCHAR(20) | NOT NULL | conversation type, can be comment, commit, force push, edit etc |
comment | TEXT | Comment content | |
created_at | TIMESTAMP | NOT NULL | Timestamp when the conversation was created |
updated_at | TIMESTAMP | NOT NULL | Timestamp when the conversation was last updated |
Indexes
idx_conversation
: Index onlink
.
mega_issue
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the issue |
link | VARCHAR(20) | NOT NULL | Unique link identifier for the issue |
title | VARCHAR(255) | NOT NULL | Title of the issue |
owner | BIGINT | NOT NULL | User ID of the issue owner |
status | VARCHAR(20) | NOT NULL | Status of the issue (e.g., open, closed) |
created_at | TIMESTAMP | NOT NULL | Timestamp when the issue was created |
updated_at | TIMESTAMP | NOT NULL | Timestamp when the issue was last updated |
closed_at | TIMESTAMP | DEFAULT NULL | Timestamp when the issue was closed |
Indexes
idx_issue
: Index onlink
.
mega_refs
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the reference |
path | TEXT | NOT NULL | Monorepo path refs |
ref_name | TEXT | NOT NULL | Name of the reference (e.g., branch, tag) |
ref_commit_hash | VARCHAR(40) | NOT NULL | Commit hash associated with the reference |
ref_tree_hash | VARCHAR(40) | NOT NULL | Tree hash associated with the reference |
created_at | TIMESTAMP | NOT NULL | Timestamp when the reference was created |
updated_at | TIMESTAMP | NOT NULL | Timestamp when the reference was last updated |
Constraints
uniq_mref_path
: Unique constraint on(path, ref_name)
.
import_refs
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the imported reference |
repo_id | BIGINT | NOT NULL | Repository ID associated with the reference |
ref_name | TEXT | NOT NULL | Name of the imported reference |
ref_git_id | VARCHAR(40) | NOT NULL | Git ID of the imported reference |
ref_type | VARCHAR(20) | NOT NULL | Type of the reference (e.g., branch, tag) |
default_branch | BOOLEAN | NOT NULL | Whether this is the default branch |
created_at | TIMESTAMP | NOT NULL | Timestamp when the reference was created |
updated_at | TIMESTAMP | NOT NULL | Timestamp when the reference was last updated |
Constraints
uniq_ref_path_name
: Unique constraint on(repo_id, ref_name)
.
Indexes
idx_refs_repo_id
: Index onrepo_id
.
git_repo
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the repository |
repo_path | TEXT | NOT NULL | Git repo's absolute path under mega directory |
repo_name | TEXT | NOT NULL | Name of the repository |
created_at | TIMESTAMP | NOT NULL | Timestamp when the repository was created |
updated_at | TIMESTAMP | NOT NULL | Timestamp when the repository was last updated |
Constraints
uniq_ir_path
: Unique constraint onrepo_path
.
Indexes
idx_ir_repo_path
: Index onrepo_path
.
git_commit
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the commit |
repo_id | BIGINT | NOT NULL | Repository ID associated with the commit |
commit_id | VARCHAR(40) | NOT NULL | SHA-1 hash of the commit |
tree | VARCHAR(40) | NOT NULL | SHA-1 hash of the tree object |
parents_id | JSON | NOT NULL | Parent commits in JSON format |
author | TEXT | Author details | |
committer | TEXT | Committer details | |
content | TEXT | Commit message | |
created_at | TIMESTAMP | NOT NULL | Timestamp when the commit was created |
Constraints
uniq_c_git_repo_id
: Unique constraint on(repo_id, commit_id)
.
Indexes
idx_ic_git_id
: Index oncommit_id
.idx_ic_repo_id
: Index onrepo_id
.
git_tree
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the tree object |
repo_id | BIGINT | NOT NULL | Repository ID associated with the tree |
tree_id | VARCHAR(40) | NOT NULL | SHA-1 hash of the tree |
sub_trees | BYTEA | NOT NULL | Serialized data of subtrees |
size | INT | NOT NULL | Size of the tree object |
commit_id | VARCHAR(40) | NOT NULL | Commit ID associated with the tree |
created_at | TIMESTAMP | NOT NULL | Timestamp when the tree was created |
Constraints
uniq_t_git_repo
: Unique constraint on(repo_id, tree_id)
.
Indexes
idx_t_git_id
: Index ontree_id
.idx_t_repo_id
: Index onrepo_id
.
git_blob
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the blob |
repo_id | BIGINT | NOT NULL | Repository ID associated with the blob |
blob_id | VARCHAR(40) | NOT NULL | SHA-1 hash of the blob |
name | VARCHAR(128) | Name of the blob | |
size | INT | NOT NULL | Size of the blob in bytes |
commit_id | VARCHAR(40) | NOT NULL | Commit ID associated with the blob |
created_at | TIMESTAMP | NOT NULL | Timestamp when the blob was created |
Constraints
uniq_b_git_repo
: Unique constraint on(repo_id, blob_id)
.
Indexes
idx_b_git_id
: Index onblob_id
.
git_tag
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the tag |
repo_id | BIGINT | NOT NULL | Repository ID associated with the tag |
tag_id | VARCHAR(40) | NOT NULL | SHA-1 hash of the tag |
object_id | VARCHAR(40) | NOT NULL | Object ID that the tag points to |
object_type | VARCHAR(20) | NOT NULL | Type of the object (e.g., commit, tree, blob) |
tag_name | TEXT | NOT NULL | Name of the tag |
tagger | TEXT | Tagger details | |
message | TEXT | Message associated with the tag | |
created_at | TIMESTAMP | NOT NULL | Timestamp when the tag was created |
Constraints
uniq_gtag_tag_id
: Unique constraint ontag_id
.
raw_blob
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the raw blob |
sha1 | VARCHAR(40) | NOT NULL | SHA-1 hash of the raw blob |
content | TEXT | Raw content | |
file_type | VARCHAR(20) | File type (e.g., text, binary) | |
storage_type | VARCHAR(20) | NOT NULL | Storage type (e.g., database, local-fs, remote_url ) |
data | BYTEA | Binary data | |
local_path | TEXT | Local storage path | |
remote_url | TEXT | URL for remote storage | |
created_at | TIMESTAMP | NOT NULL | Timestamp when the raw blob was created |
Constraints
uniq_rb_sha1
: Unique constraint onsha1
.
Indexes
idx_rb_sha1
: Index onsha1
.
git_pr
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the pull request |
number | BIGINT | NOT NULL | PR number |
title | VARCHAR(255) | NOT NULL | Title of the PR |
state | VARCHAR(255) | NOT NULL | State of the PR (e.g., open, closed) |
created_at | TIMESTAMP | NOT NULL | Timestamp when the PR was created |
updated_at | TIMESTAMP | NOT NULL | Timestamp when the PR was last updated |
closed_at | TIMESTAMP | DEFAULT NULL | Timestamp when the PR was closed |
merged_at | TIMESTAMP | DEFAULT NULL | Timestamp when the PR was merged |
merge_commit_sha | VARCHAR(200) | DEFAULT NULL | SHA of the merge commit |
repo_id | BIGINT | NOT NULL | Repository ID associated with the PR |
sender_name | VARCHAR(255) | NOT NULL | Name of the PR sender |
sender_id | BIGINT | NOT NULL | ID of the PR sender |
user_name | VARCHAR(255) | NOT NULL | User name of the PR creator |
user_id | BIGINT | NOT NULL | User ID of the PR creator |
commits_url | VARCHAR(255) | NOT NULL | URL for commits associated with the PR |
patch_url | VARCHAR(255) | NOT NULL | URL for the patch |
head_label | VARCHAR(255) | NOT NULL | Label for the source branch |
head_ref | VARCHAR(255) | NOT NULL | Reference to the source branch |
base_label | VARCHAR(255) | NOT NULL | Label for the target branch |
base_ref | VARCHAR(255) | NOT NULL | Reference to the target branch |
git_issue
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID for the issue |
number | BIGINT | NOT NULL | Issue number |
title | VARCHAR(255) | NOT NULL | Title of the issue |
sender_name | VARCHAR(255) | NOT NULL | Name of the issue sender |
sender_id | BIGINT | NOT NULL | ID of the issue sender |
state | VARCHAR(255) | NOT NULL | State of the issue (e.g., open, closed) |
created_at | TIMESTAMP | NOT NULL | Timestamp when the issue was created |
updated_at | TIMESTAMP | NOT NULL | Timestamp when the issue was last updated |
closed_at | TIMESTAMP | DEFAULT NULL | Timestamp when the issue was closed |
repo_id | BIGINT | NOT NULL | Repository ID associated with the issue |
lfs_locks
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | VARCHAR(40) | PRIMARY KEY | Unique ID for the lock |
data | TEXT | NOT NULL | Metadata or lock data |
lfs_objects
Table: Column | Type | Constraints | Description |
---|---|---|---|
oid | VARCHAR(64) | PRIMARY KEY | Unique object ID (hash) |
size | BIGINT | NOT NULL | Size of the object in bytes |
exist | BOOLEAN | NOT NULL | Indicates if the object exists |
splited | BOOLEAN | NOT NULL | Indicates if the object is split |
lfs_split_relations
Table: Column | Type | Constraints | Description |
---|---|---|---|
ori_oid | VARCHAR(64) | NOT NULL | Original object ID |
sub_oid | VARCHAR(64) | NOT NULL | Sub-object ID |
offset | BIGINT | NOT NULL | Offset within the original object |
size | BIGINT | NOT NULL | Size of the sub-object |
Constraints
- Primary Key: Combination of
ori_oid
,sub_oid
, andoffset
.
lfs_split_relations
Table: Column | Type | Constraints | Description |
---|---|---|---|
ori_oid | VARCHAR(64) | NOT NULL, PRIMARY KEY | Original Object ID |
sub_oid | VARCHAR(64) | NOT NULL, PRIMARY KEY | Sub Object ID |
offset | BIGINT | NOT NULL, PRIMARY KEY | Offset of the sub-object |
size | BIGINT | NOT NULL | Size of the sub-object |
ztm_node
Table: Column | Type | Constraints | Description |
---|---|---|---|
peer_id | VARCHAR(64) | PRIMARY KEY | Unique Peer ID |
hub | VARCHAR(64) | Hub information | |
agent_name | VARCHAR(64) | Name of the agent | |
service_name | VARCHAR(64) | Name of the service | |
type | VARCHAR(64) | Node type | |
online | BOOLEAN | NOT NULL | Node online status |
last_online_time | BIGINT | NOT NULL | Last online timestamp |
service_port | INT | NOT NULL | Service port number |
ztm_repo_info
Table: Column | Type | Constraints | Description |
---|---|---|---|
identifier | VARCHAR(128) | PRIMARY KEY | Unique repository identifier |
name | VARCHAR(64) | Repository name | |
origin | VARCHAR(64) | Origin information | |
update_time | BIGINT | NOT NULL | Last update timestamp |
commit | VARCHAR(64) | Last commit hash |
ztm_lfs_info
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique identifier |
file_hash | VARCHAR(256) | File hash | |
hash_type | VARCHAR(64) | Type of hash (e.g., SHA256) | |
file_size | BIGINT | NOT NULL | Size of the file |
creation_time | BIGINT | NOT NULL | Creation timestamp |
peer_id | VARCHAR(64) | Associated peer ID | |
origin | VARCHAR(256) | File origin information |
ztm_nostr_event
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | VARCHAR(128) | PRIMARY KEY | Unique event ID |
pubkey | VARCHAR(128) | Public key of the event | |
created_at | BIGINT | NOT NULL | Event creation timestamp |
kind | INT | Event type | |
tags | TEXT | Associated tags | |
content | TEXT | Event content | |
sig | VARCHAR(256) | Signature of the event |
ztm_nostr_req
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | VARCHAR(128) | PRIMARY KEY | Unique request ID |
subscription_id | VARCHAR(128) | Subscription identifier | |
filters | TEXT | Request filters |
mq_storage
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID |
category | VARCHAR(64) | Category of the message | |
create_time | TIMESTAMP | NOT NULL | Creation timestamp |
content | TEXT | Message content |
ztm_path_mapping
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID |
alias | TEXT | NOT NULL, UNIQUE | Alias for the repository path |
repo_path | TEXT | NOT NULL | Repository path |
created_at | TIMESTAMP | NOT NULL | Creation timestamp |
updated_at | TIMESTAMP | NOT NULL | Last update timestamp |
user
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID |
name | TEXT | NOT NULL | Name of the user |
TEXT | NOT NULL, UNIQUE | Email of the user | |
avatar_url | TEXT | NOT NULL | Avatar URL |
is_github | BOOLEAN | NOT NULL | Indicates if GitHub user |
created_at | TIMESTAMP | NOT NULL | Creation timestamp |
updated_at | TIMESTAMP | Last update timestamp |
ssh_keys
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID |
user_id | BIGINT | NOT NULL, INDEX | Associated user ID |
title | TEXT | NOT NULL | Title of the SSH key |
ssh_key | TEXT | NOT NULL | Public SSH key |
finger | TEXT | NOT NULL, INDEX (prefix: 8) | SSH key fingerprint |
created_at | TIMESTAMP | NOT NULL | Creation timestamp |
access_token
Table: Column | Type | Constraints | Description |
---|---|---|---|
id | BIGINT | PRIMARY KEY | Unique ID |
user_id | BIGINT | NOT NULL, INDEX | Associated user ID |
token | TEXT | NOT NULL, INDEX (prefix: 8) | Access token |
created_at | TIMESTAMP | NOT NULL | Creation timestamp |
4. Prerequisites
- Generating entities: Entities can be generated from the database table structure with the following command
sea-orm-cli generate entity -u "postgres://postgres:$postgres@localhost/mega" -o jupiter/entity/src