Hacker News

cryptonector, last Wednesday at 6:42 PM (3 replies)

> "Why would you do it any other way" would be more interesting:

That's the interesting question. Normally a bug tracker would basically be a SQL application. When you move it into a Git repo you lose that, and now you have to think about how to represent all that relational data in your repository. It gets annoying. This is why for Fossil it's such a trivial thing to do: Fossil repositories _are_ relational and hosted on an RDBMS (SQLite3 or PG). If you don't have SQL, referential integrity is easy to break (e.g., issues that refer to other issues that don't know they're being referred to), and querying your issue database becomes a problem as it gets huge, because Git doesn't really have an appropriate index for this.

What one might do to alleviate the relational issues is to just not try to maintain referential integrity but instead suck up the issues from Git into a local SQLite3 DB. Then as long as there are no non-fast-forward pushes to the issues DB it's always easy to catch up and have a functional relational database.
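A minimal sketch of that idea, under stated assumptions: the record layout (`id`, `title`, `status`, `refs`) and the table names are hypothetical, and the records are hard-coded here where a real tool would parse them out of the Git tree. The `ref` table deliberately has no foreign key, so broken references get loaded and can then be found with an ordinary query.

```python
import sqlite3

# Hypothetical issue records, as they might be parsed from files in a Git tree.
ISSUES = [
    {"id": "a1", "title": "Crash on start", "status": "open", "refs": ["b2"]},
    {"id": "b2", "title": "Null config", "status": "closed", "refs": []},
    {"id": "c3", "title": "Docs typo", "status": "open", "refs": ["zz"]},  # "zz" does not exist
]

def build_index(issues):
    """Load issue records into a throwaway in-memory SQLite database."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE issue (id TEXT PRIMARY KEY, title TEXT, status TEXT);
        -- No FOREIGN KEY on purpose: dangling references are allowed in.
        CREATE TABLE ref (src TEXT, dst TEXT);
        CREATE INDEX ref_dst ON ref (dst);
    """)
    for it in issues:
        db.execute("INSERT INTO issue VALUES (?, ?, ?)",
                   (it["id"], it["title"], it["status"]))
        db.executemany("INSERT INTO ref VALUES (?, ?)",
                       [(it["id"], dst) for dst in it["refs"]])
    db.commit()
    return db

def dangling_refs(db):
    """References whose target issue is not in the index."""
    return db.execute("""
        SELECT ref.src, ref.dst
        FROM ref LEFT JOIN issue ON issue.id = ref.dst
        WHERE issue.id IS NULL
        ORDER BY ref.src, ref.dst
    """).fetchall()

def referenced_by(db, issue_id):
    """Back-references: which issues point at issue_id."""
    rows = db.execute("SELECT src FROM ref WHERE dst = ? ORDER BY src", (issue_id,))
    return [r[0] for r in rows]
```

With this in place, "which issues refer to b2" and "which references are broken" are both single indexed queries, and the database can be thrown away and rebuilt from the repository at any time.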


Replies

Git-Master, last Wednesday at 9:15 PM

Two corrections:

1. Fossil repositories are explicitly not relational; they are, however, stored in SQLite databases. Everything SCM-relevant (including content such as tickets, wiki, and forum) is stored as artifacts in a blob table (+ delta). Artifacts reference other artifacts by hash value, and that, together with the code that handles it, provides the referential integrity. There are relations (via auxiliary tables) to speed up queries, but these tables are transient, get updated when new artifacts are inserted, and can be regenerated from the artifacts.

(Users and their metadata, and configuration, are not part of this scheme, so these tables might be viewed as relational tables. They are local-only and not synced.)

See https://fossil-scm.org/home/doc/trunk/www/fossil-is-not-rela... and https://fossil-scm.org/home/doc/tip/www/theory1.wiki for more details.

2. There are no other databases like PostgreSQL to choose from.
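To illustrate the pattern described in point 1, here is a toy model. It is not Fossil's actual schema or artifact format: the JSON artifacts, the `ticket-change` type, and the table and function names are all invented for this sketch. Immutable artifacts are keyed by their hash and may reference each other by hash, while the `ticket` table is a derived cache that can be wiped and regenerated from the artifacts alone.

```python
import hashlib
import json
import sqlite3

def open_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE blob (hash TEXT PRIMARY KEY, content BLOB);
        -- Derived data only: safe to delete and rebuild.
        CREATE TABLE ticket (ticket_id TEXT PRIMARY KEY, status TEXT);
    """)
    return db

def put_artifact(db, obj):
    """Store an immutable artifact keyed by the SHA-256 of its canonical JSON."""
    data = json.dumps(obj, sort_keys=True).encode()
    digest = hashlib.sha256(data).hexdigest()
    # Same content gives the same hash, so re-inserting is a no-op.
    db.execute("INSERT OR IGNORE INTO blob (hash, content) VALUES (?, ?)",
               (digest, data))
    return digest

def rebuild_ticket_table(db):
    """Regenerate the derived table from artifacts, replayed in insertion order.

    Insertion order (rowid) is a simplification made for this sketch.
    """
    db.execute("DELETE FROM ticket")
    rows = db.execute("SELECT content FROM blob ORDER BY rowid").fetchall()
    for (content,) in rows:
        art = json.loads(content)
        if art.get("type") == "ticket-change":
            db.execute("INSERT OR REPLACE INTO ticket VALUES (?, ?)",
                       (art["ticket"], art["status"]))
```

The point of the toy is only the division of labour: the artifacts are the source of truth, and anything table-shaped on top of them is disposable.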

sshine, yesterday at 10:25 AM

> Normally a bug tracker would basically be a SQL application. When you move it into a Git repo you lose that and now you have to think about how to represent all that relational data in your repository. It gets annoying.

> just not try to maintain referential integrity but instead suck up the issues from Git into a local SQLite3 DB

Applications with SQL backends tackle the problem “how can we express our CRUD data structures so they fit the relational model?”

When you replace SQL with a file-based serialised format, you mainly lose ACID. This is arguably easier than SQL.

You are a single user locally: as long as your IDE plugin doesn’t fight with your terminal client, your need for ACID is low. You handle conflicts like git conflicts, but you may apply domain-specific conflict resolution: while conflicts in source code have no general resolution strategy, merging issue comments can be easy, status changes that people agree on can be resolved automatically, and handling the remaining conflicts manually can be tool-assisted.
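As a rough sketch of what such domain-specific resolution could look like, assuming a hypothetical issue shape (a `status` string plus append-only `comments`, each with an `id` and a timestamp `ts`): comments merge as a union, status merges three-way, and only a genuine disagreement is reported for manual handling.

```python
def merge_issue(base, ours, theirs):
    """Three-way merge of two edits to one issue. Returns (merged, conflicts)."""
    conflicts = []

    # Comments are append-only in this sketch, so merging is a union by id.
    by_id = {c["id"]: c for c in ours["comments"] + theirs["comments"]}
    comments = sorted(by_id.values(), key=lambda c: (c["ts"], c["id"]))

    # Status: accept the side that changed it; flag only a real disagreement.
    if ours["status"] == theirs["status"]:
        status = ours["status"]
    elif ours["status"] == base["status"]:
        status = theirs["status"]
    elif theirs["status"] == base["status"]:
        status = ours["status"]
    else:
        status = base["status"]  # keep the old value until someone decides
        conflicts.append(("status", ours["status"], theirs["status"]))

    return {"status": status, "comments": comments}, conflicts
```

A tool could call this from a custom merge step and fall back to manual resolution only when `conflicts` is non-empty.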

So losing SQL is not that big of a deal, assuming you’re in a highly decentralised, highly async environment.

The answer to “Why though?” should be forge interop, keeping knowledge in-repo, and because it may fit the organisation (or lack thereof): just as you can commit code changes offline, you’re not prevented from updating issues when you’re offline.

jFriedensreich, last Wednesday at 9:26 PM

I don't think that is the main issue. It's not THAT hard to build a secondary SQLite index for data stored in git and keep it in sync, especially because the data is already versioned and can be incrementally updated.
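A sketch of the incremental part, with the assumptions spelled out: `history` is a hypothetical oldest-first list of `(commit_id, changes)` pairs that a real tool would obtain by walking Git (for example with `git log --reverse <cursor>..HEAD`), and the `issue` and `meta` tables are made up for illustration. The index remembers the last commit it applied and only replays what came after it.

```python
import sqlite3

def open_index():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE issue (id TEXT PRIMARY KEY, status TEXT);
        CREATE TABLE meta (key TEXT PRIMARY KEY, value TEXT);
    """)
    return db

def catch_up(db, history):
    """Apply commits newer than the stored cursor; return how many were applied.

    history: oldest-first list of (commit_id, {issue_id: status or None}),
    where None means the issue file was deleted in that commit.
    """
    row = db.execute("SELECT value FROM meta WHERE key = 'cursor'").fetchone()
    ids = [cid for cid, _ in history]
    if row is None:
        start = 0  # empty index: replay everything
    elif row[0] in ids:
        start = ids.index(row[0]) + 1
    else:
        # The indexed commit is gone, e.g. history was rewritten.
        raise ValueError("cursor commit missing from history; rebuild the index")

    applied = 0
    with db:  # one transaction, so index rows and cursor move together
        for cid, changes in history[start:]:
            for issue_id, status in changes.items():
                if status is None:
                    db.execute("DELETE FROM issue WHERE id = ?", (issue_id,))
                else:
                    db.execute("INSERT OR REPLACE INTO issue VALUES (?, ?)",
                               (issue_id, status))
            db.execute("INSERT OR REPLACE INTO meta VALUES ('cursor', ?)", (cid,))
            applied += 1
    return applied
```

Calling `catch_up` twice with the same history does nothing the second time, and a missing cursor is treated as a signal to rebuild from scratch, which is cheap because the index is derived data.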