Database Schema

This section provides an overview of the Mega project database schema.

1.Basic Design of Mega Monorepo

The purpose of this document is to refactor the current storage design of Mega, enabling it to effectively manage project monorepo while remaining compatible with the Git protocol.

Mega's storage structure is mainly divided into the following parts:

Mega Directory:

Similar to the 'tree' in Git, Mega maintains relationships between files and file names. In the database, Mega independently manages directory information for the current version.

Import directory

  • The primary purpose of importing directories is to synchronize the original Git repository into the Mega directory. Projects within the import directory are maintained in a read-only state, preserving the original commit information.
  • Projects pushed to the import directory can have multiple commits.
  • Projects in the import directory can be quickly transformed into the Mega directory.
  • Import directories can be configured in the configuration file.
  • Once a directory is initialized as an import directory, it cannot be changed back to a regular directory.

2. Database Design

Table Overall

Table NameDescriptionMega PushMega PullGit PushGit Repo
mega_commitStore all commit objects related with mega directory, have mr status
mega_treeStore all tree objects related with mega directory, together with mega_commit to find history directory
mega_blobStore all blob objects under mega directory.
mega_tagStore all annotated tag with mega directory.
mega_mrMerge request related to mega commits.
mega_mr_convMR conversation list
mega_mr_commentMR Comment
mega_issueManage mega's issue.
mega_refsThis table maintains refs information corresponding to each directory of mega
git_repoMaintain Relations between import_repo and repo_path.
import_refsObtains the latest commit_id through repo_id and ref_name, also storing the repo lightweight tags.
git_commitStore all parsed commit objects related with repo.
git_treeStore all parsed tree objects related with repo.
git_blobStore all parsed blob objects related with repo.
git_tagStore all annotated tag related with repo.
raw_blobStore all raw objects with both git repo and mega directory.
git_prPull request sync from third parties like GitHub.
git_issueIssues sync from third parties like GitHub.
lfs_objectsStore objects related to LFS protocol.
lfs_locksStore locks for lfs files.

3. Table Schema

Table: mega_commit

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the commit
commit_idVARCHAR(40)NOT NULL, UNIQUEGit commit ID
treeVARCHAR(40)NOT NULLTree ID associated with the commit
parents_idJSONNOT NULLParent commit IDs (JSON format)
authorTEXTCommit author
committerTEXTCommit committer
contentTEXTCommit message or content
created_atTIMESTAMPNOT NULLTimestamp when the commit was created

Indexes

  • idx_mc_git_id: Index on commit_id.

Table: mega_tree

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the tree
tree_idVARCHAR(40)NOT NULLGit tree ID
sub_treesBYTEANOT NULLSerialized data of subtrees, contanins name, sha1, mode, repo_id
sizeINTNOT NULLSize of the tree
commit_idVARCHAR(40)NOT NULLCommit ID associated with the tree
created_atTIMESTAMPNOT NULLTimestamp when the tree was created

Indexes

  • idx_mt_git_id: Index on tree_id.

Table: mega_blob

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the blob
blob_idVARCHAR(40)NOT NULLGit blob ID
commit_idVARCHAR(40)NOT NULLCommit ID associated with the blob
nameTEXTNOT NULLName of the file or blob
sizeINTNOT NULLSize of the blob
created_atTIMESTAMPNOT NULLTimestamp when the blob was created

Indexes

  • idx_mb_git_id: Index on blob_id.

Table: mega_tag

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the tag
tag_idVARCHAR(40)NOT NULL, UNIQUEGit tag ID
object_idVARCHAR(40)NOT NULLID of the tagged object
object_typeVARCHAR(20)NOT NULLType of the tagged object (e.g., commit, tree, blob)
tag_nameTEXTNOT NULLName of the tag
taggerTEXTNOT NULLPerson who created the tag
messageTEXTNOT NULLMessage associated with the tag
created_atTIMESTAMPNOT NULLTimestamp when the tag was created

Indexes

  • uniq_mtag_tag_id: Unique constraint on tag_id.

Table: mega_mr

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the merge request (MR)
linkVARCHAR(40)NOT NULLA MR identifier with a length of 6-8 characters.
titleTEXTNOT NULLTitle of the MR
merge_dateTIMESTAMPDate when the MR was merged
statusVARCHAR(20)NOT NULLStatus of the MR (e.g., open, closed)
pathTEXTNOT NULLPath associated with the MR
from_hashVARCHAR(40)NOT NULLSource commit hash for the MR
to_hashVARCHAR(40)NOT NULLTarget commit hash for the MR
created_atTIMESTAMPNOT NULLTimestamp when the MR was created
updated_atTIMESTAMPNOT NULLTimestamp when the MR was last updated

Indexes

  • idx_mr_path: Index on path.

Table: mega_conversation

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the conversation
linkVARCHAR(20)NOT NULLLink identifier associated with the MR
user_idBIGINTNOT NULLUser ID of the commenter
conv_typeVARCHAR(20)NOT NULLconversation type, can be comment, commit, force push, edit etc
commentTEXTComment content
created_atTIMESTAMPNOT NULLTimestamp when the conversation was created
updated_atTIMESTAMPNOT NULLTimestamp when the conversation was last updated

Indexes

  • idx_conversation: Index on link.

Table: mega_issue

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the issue
linkVARCHAR(20)NOT NULLUnique link identifier for the issue
titleVARCHAR(255)NOT NULLTitle of the issue
ownerBIGINTNOT NULLUser ID of the issue owner
statusVARCHAR(20)NOT NULLStatus of the issue (e.g., open, closed)
created_atTIMESTAMPNOT NULLTimestamp when the issue was created
updated_atTIMESTAMPNOT NULLTimestamp when the issue was last updated
closed_atTIMESTAMPDEFAULT NULLTimestamp when the issue was closed

Indexes

  • idx_issue: Index on link.

Table: mega_refs

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the reference
pathTEXTNOT NULLMonorepo path refs
ref_nameTEXTNOT NULLName of the reference (e.g., branch, tag)
ref_commit_hashVARCHAR(40)NOT NULLCommit hash associated with the reference
ref_tree_hashVARCHAR(40)NOT NULLTree hash associated with the reference
created_atTIMESTAMPNOT NULLTimestamp when the reference was created
updated_atTIMESTAMPNOT NULLTimestamp when the reference was last updated

Constraints

  • uniq_mref_path: Unique constraint on (path, ref_name).

Table: import_refs

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the imported reference
repo_idBIGINTNOT NULLRepository ID associated with the reference
ref_nameTEXTNOT NULLName of the imported reference
ref_git_idVARCHAR(40)NOT NULLGit ID of the imported reference
ref_typeVARCHAR(20)NOT NULLType of the reference (e.g., branch, tag)
default_branchBOOLEANNOT NULLWhether this is the default branch
created_atTIMESTAMPNOT NULLTimestamp when the reference was created
updated_atTIMESTAMPNOT NULLTimestamp when the reference was last updated

Constraints

  • uniq_ref_path_name: Unique constraint on (repo_id, ref_name).

Indexes

  • idx_refs_repo_id: Index on repo_id.

Table: git_repo

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the repository
repo_pathTEXTNOT NULLGit repo's absolute path under mega directory
repo_nameTEXTNOT NULLName of the repository
created_atTIMESTAMPNOT NULLTimestamp when the repository was created
updated_atTIMESTAMPNOT NULLTimestamp when the repository was last updated

Constraints

  • uniq_ir_path: Unique constraint on repo_path.

Indexes

  • idx_ir_repo_path: Index on repo_path.

Table: git_commit

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the commit
repo_idBIGINTNOT NULLRepository ID associated with the commit
commit_idVARCHAR(40)NOT NULLSHA-1 hash of the commit
treeVARCHAR(40)NOT NULLSHA-1 hash of the tree object
parents_idJSONNOT NULLParent commits in JSON format
authorTEXTAuthor details
committerTEXTCommitter details
contentTEXTCommit message
created_atTIMESTAMPNOT NULLTimestamp when the commit was created

Constraints

  • uniq_c_git_repo_id: Unique constraint on (repo_id, commit_id).

Indexes

  • idx_ic_git_id: Index on commit_id.
  • idx_ic_repo_id: Index on repo_id.

Table: git_tree

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the tree object
repo_idBIGINTNOT NULLRepository ID associated with the tree
tree_idVARCHAR(40)NOT NULLSHA-1 hash of the tree
sub_treesBYTEANOT NULLSerialized data of subtrees
sizeINTNOT NULLSize of the tree object
commit_idVARCHAR(40)NOT NULLCommit ID associated with the tree
created_atTIMESTAMPNOT NULLTimestamp when the tree was created

Constraints

  • uniq_t_git_repo: Unique constraint on (repo_id, tree_id).

Indexes

  • idx_t_git_id: Index on tree_id.
  • idx_t_repo_id: Index on repo_id.

Table: git_blob

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the blob
repo_idBIGINTNOT NULLRepository ID associated with the blob
blob_idVARCHAR(40)NOT NULLSHA-1 hash of the blob
nameVARCHAR(128)Name of the blob
sizeINTNOT NULLSize of the blob in bytes
commit_idVARCHAR(40)NOT NULLCommit ID associated with the blob
created_atTIMESTAMPNOT NULLTimestamp when the blob was created

Constraints

  • uniq_b_git_repo: Unique constraint on (repo_id, blob_id).

Indexes

  • idx_b_git_id: Index on blob_id.

Table: git_tag

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the tag
repo_idBIGINTNOT NULLRepository ID associated with the tag
tag_idVARCHAR(40)NOT NULLSHA-1 hash of the tag
object_idVARCHAR(40)NOT NULLObject ID that the tag points to
object_typeVARCHAR(20)NOT NULLType of the object (e.g., commit, tree, blob)
tag_nameTEXTNOT NULLName of the tag
taggerTEXTTagger details
messageTEXTMessage associated with the tag
created_atTIMESTAMPNOT NULLTimestamp when the tag was created

Constraints

  • uniq_gtag_tag_id: Unique constraint on tag_id.

Table: raw_blob

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the raw blob
sha1VARCHAR(40)NOT NULLSHA-1 hash of the raw blob
contentTEXTRaw content
file_typeVARCHAR(20)File type (e.g., text, binary)
storage_typeVARCHAR(20)NOT NULLStorage type (e.g., database, local-fs, remote_url )
dataBYTEABinary data
local_pathTEXTLocal storage path
remote_urlTEXTURL for remote storage
created_atTIMESTAMPNOT NULLTimestamp when the raw blob was created

Constraints

  • uniq_rb_sha1: Unique constraint on sha1.

Indexes

  • idx_rb_sha1: Index on sha1.

Table: git_pr

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the pull request
numberBIGINTNOT NULLPR number
titleVARCHAR(255)NOT NULLTitle of the PR
stateVARCHAR(255)NOT NULLState of the PR (e.g., open, closed)
created_atTIMESTAMPNOT NULLTimestamp when the PR was created
updated_atTIMESTAMPNOT NULLTimestamp when the PR was last updated
closed_atTIMESTAMPDEFAULT NULLTimestamp when the PR was closed
merged_atTIMESTAMPDEFAULT NULLTimestamp when the PR was merged
merge_commit_shaVARCHAR(200)DEFAULT NULLSHA of the merge commit
repo_idBIGINTNOT NULLRepository ID associated with the PR
sender_nameVARCHAR(255)NOT NULLName of the PR sender
sender_idBIGINTNOT NULLID of the PR sender
user_nameVARCHAR(255)NOT NULLUser name of the PR creator
user_idBIGINTNOT NULLUser ID of the PR creator
commits_urlVARCHAR(255)NOT NULLURL for commits associated with the PR
patch_urlVARCHAR(255)NOT NULLURL for the patch
head_labelVARCHAR(255)NOT NULLLabel for the source branch
head_refVARCHAR(255)NOT NULLReference to the source branch
base_labelVARCHAR(255)NOT NULLLabel for the target branch
base_refVARCHAR(255)NOT NULLReference to the target branch

Table: git_issue

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID for the issue
numberBIGINTNOT NULLIssue number
titleVARCHAR(255)NOT NULLTitle of the issue
sender_nameVARCHAR(255)NOT NULLName of the issue sender
sender_idBIGINTNOT NULLID of the issue sender
stateVARCHAR(255)NOT NULLState of the issue (e.g., open, closed)
created_atTIMESTAMPNOT NULLTimestamp when the issue was created
updated_atTIMESTAMPNOT NULLTimestamp when the issue was last updated
closed_atTIMESTAMPDEFAULT NULLTimestamp when the issue was closed
repo_idBIGINTNOT NULLRepository ID associated with the issue

Table: lfs_locks

ColumnTypeConstraintsDescription
idVARCHAR(40)PRIMARY KEYUnique ID for the lock
dataTEXTNOT NULLMetadata or lock data

Table: lfs_objects

ColumnTypeConstraintsDescription
oidVARCHAR(64)PRIMARY KEYUnique object ID (hash)
sizeBIGINTNOT NULLSize of the object in bytes
existBOOLEANNOT NULLIndicates if the object exists
splitedBOOLEANNOT NULLIndicates if the object is split

Table: lfs_split_relations

ColumnTypeConstraintsDescription
ori_oidVARCHAR(64)NOT NULLOriginal object ID
sub_oidVARCHAR(64)NOT NULLSub-object ID
offsetBIGINTNOT NULLOffset within the original object
sizeBIGINTNOT NULLSize of the sub-object

Constraints

  • Primary Key: Combination of ori_oid, sub_oid, and offset.

Table: lfs_split_relations

ColumnTypeConstraintsDescription
ori_oidVARCHAR(64)NOT NULL, PRIMARY KEYOriginal Object ID
sub_oidVARCHAR(64)NOT NULL, PRIMARY KEYSub Object ID
offsetBIGINTNOT NULL, PRIMARY KEYOffset of the sub-object
sizeBIGINTNOT NULLSize of the sub-object

Table: ztm_node

ColumnTypeConstraintsDescription
peer_idVARCHAR(64)PRIMARY KEYUnique Peer ID
hubVARCHAR(64)Hub information
agent_nameVARCHAR(64)Name of the agent
service_nameVARCHAR(64)Name of the service
typeVARCHAR(64)Node type
onlineBOOLEANNOT NULLNode online status
last_online_timeBIGINTNOT NULLLast online timestamp
service_portINTNOT NULLService port number

Table: ztm_repo_info

ColumnTypeConstraintsDescription
identifierVARCHAR(128)PRIMARY KEYUnique repository identifier
nameVARCHAR(64)Repository name
originVARCHAR(64)Origin information
update_timeBIGINTNOT NULLLast update timestamp
commitVARCHAR(64)Last commit hash

Table: ztm_lfs_info

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique identifier
file_hashVARCHAR(256)File hash
hash_typeVARCHAR(64)Type of hash (e.g., SHA256)
file_sizeBIGINTNOT NULLSize of the file
creation_timeBIGINTNOT NULLCreation timestamp
peer_idVARCHAR(64)Associated peer ID
originVARCHAR(256)File origin information

Table: ztm_nostr_event

ColumnTypeConstraintsDescription
idVARCHAR(128)PRIMARY KEYUnique event ID
pubkeyVARCHAR(128)Public key of the event
created_atBIGINTNOT NULLEvent creation timestamp
kindINTEvent type
tagsTEXTAssociated tags
contentTEXTEvent content
sigVARCHAR(256)Signature of the event

Table: ztm_nostr_req

ColumnTypeConstraintsDescription
idVARCHAR(128)PRIMARY KEYUnique request ID
subscription_idVARCHAR(128)Subscription identifier
filtersTEXTRequest filters

Table: mq_storage

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID
categoryVARCHAR(64)Category of the message
create_timeTIMESTAMPNOT NULLCreation timestamp
contentTEXTMessage content

Table: ztm_path_mapping

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID
aliasTEXTNOT NULL, UNIQUEAlias for the repository path
repo_pathTEXTNOT NULLRepository path
created_atTIMESTAMPNOT NULLCreation timestamp
updated_atTIMESTAMPNOT NULLLast update timestamp

Table: user

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID
nameTEXTNOT NULLName of the user
emailTEXTNOT NULL, UNIQUEEmail of the user
avatar_urlTEXTNOT NULLAvatar URL
is_githubBOOLEANNOT NULLIndicates if GitHub user
created_atTIMESTAMPNOT NULLCreation timestamp
updated_atTIMESTAMPLast update timestamp

Table: ssh_keys

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID
user_idBIGINTNOT NULL, INDEXAssociated user ID
titleTEXTNOT NULLTitle of the SSH key
ssh_keyTEXTNOT NULLPublic SSH key
fingerTEXTNOT NULL, INDEX (prefix: 8)SSH key fingerprint
created_atTIMESTAMPNOT NULLCreation timestamp

Table: access_token

ColumnTypeConstraintsDescription
idBIGINTPRIMARY KEYUnique ID
user_idBIGINTNOT NULL, INDEXAssociated user ID
tokenTEXTNOT NULL, INDEX (prefix: 8)Access token
created_atTIMESTAMPNOT NULLCreation timestamp

4. Prerequisites

  • Generating entities: Entities can be generated from the database table structure with the following command
sea-orm-cli generate entity -u "postgres://postgres:$postgres@localhost/mega"  -o jupiter/entity/src