Consolidate IPFS Repositories · Issue #8543 · ipfs/kubo · GitHub
Consolidate IPFS Repositories #8543
Closed
@guseggert

Description

Problem

The go-ipfs dependency closure includes 47 modules under github.com/ipfs. Here are their interdependencies (this does not include libp2p nor other PL orgs):

[Image: deps, the dependency graph of the github.com/ipfs modules]

Pain

  • Changes must be propagated across many repos, in the right order
  • Repos are best-effort to maintain and keep up-to-date, leading to complex dependency graphs due to different versions floating around
  • It's difficult to get feedback about whether a change is safe for consumers of the code, due to being in different repos w/ different CI
  • In some cases, this discourages experimentation since it can be hard to bubble changes up to end-user applications like go-ipfs
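To illustrate the first point, the "right order" for propagating a change is a topological order of the module dependency graph. The following is a minimal sketch of computing one with Kahn's algorithm. The function name `releaseOrder` is an assumption made for this example, and the edges in `main` are illustrative only; they are not the actual go-ipfs dependency graph.

```go
package main

import (
	"fmt"
	"sort"
)

// releaseOrder returns an order in which modules can be updated and released
// so that every module comes after all of its dependencies.
// deps maps a module to the modules it depends on.
// (Hypothetical helper for illustration; not part of any IPFS tooling.)
func releaseOrder(deps map[string][]string) ([]string, error) {
	pending := map[string]int{}         // unreleased dependencies per module
	dependents := map[string][]string{} // reverse edges: dependency -> consumers
	for mod, ds := range deps {
		if _, ok := pending[mod]; !ok {
			pending[mod] = 0
		}
		for _, d := range ds {
			if _, ok := pending[d]; !ok {
				pending[d] = 0
			}
			pending[mod]++
			dependents[d] = append(dependents[d], mod)
		}
	}

	var ready []string
	for mod, n := range pending {
		if n == 0 {
			ready = append(ready, mod)
		}
	}

	var order []string
	for len(ready) > 0 {
		sort.Strings(ready) // keep the output deterministic
		mod := ready[0]
		ready = ready[1:]
		order = append(order, mod)
		for _, consumer := range dependents[mod] {
			pending[consumer]--
			if pending[consumer] == 0 {
				ready = append(ready, consumer)
			}
		}
	}

	// Modules left over can never become ready: they sit on a cycle.
	if len(order) != len(pending) {
		return nil, fmt.Errorf("dependency cycle among %d modules", len(pending)-len(order))
	}
	return order, nil
}

func main() {
	// Illustrative edges only, not the real graph.
	deps := map[string][]string{
		"go-ipfs":         {"go-bitswap", "go-blockservice"},
		"go-blockservice": {"go-bitswap", "go-block-format"},
		"go-bitswap":      {"go-block-format"},
	}
	order, err := releaseOrder(deps)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(order)
}
```

With 47 modules, each step in such an order is a separate PR, CI run, and release, which is the cost the list above describes.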

Current Desirable Properties

  • Experimental code can easily mix-and-match functionality from go-ipfs
  • The dependency graph of the consumer does not include every transitive dependency of go-ipfs

Why are repos structured this way?

The intention of the current layout was to encourage flexibility, extensibility, and experimentation. Functionality of IPFS could be reused in other projects without depending on IPFS as a whole.

Also these repos predate most Go tooling.

How much does a repo cost?

Repo maintenance costs include:

  • Keeping dependencies up-to-date
    • This is non-trivial, as it often requires chasing down other dependencies in the dependency graph. Mostly we don't do this until we have to
  • Releasing new versions as necessary
  • Making sure CI is still working
  • Migrating from Travis/CircleCI to Actions (still in progress)
  • Rolling out unified CI
  • Backporting changes across major versions as necessary
  • Manually testing impact of new code changes on downstream consumers
  • Monitoring issue trackers, PRs, etc.
  • Updating submodules
    • Commonly used for testing, example code, etc.
    • Often these contain circular module dependencies which complicate propagating breaking changes

Why now? What's changed?

We have an increasing amount of:

  • Repos
    • See maintenance costs above
    • Some are in various states of deprecation, which adds to the maintenance costs and the cost of implementing new features
    • Some don't build due to flaky tests, with not enough incentive to fix them until it becomes a blocker
  • Projects
    • Often these result in backwards-incompatible changes, sometimes even new major versions, which then need to be propagated around to all the downstream repos
      • Finding those repos can be difficult (e.g. backporting across versions, in-flight work, etc.)
    • The growing number of in-flight projects means we're more likely to have repos in transient broken states that block or slow the progress of other projects (this happens often)

Also, Go modules now exist, along with module graph pruning. The latter is key to preventing consumers from picking up an explosion of transitive dependencies when they just want to reuse some small piece of code.
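As a rough illustration of why pruning matters, the toy model below compares the unpruned module graph (everything reachable) with a pruned one. It assumes every module involved declares go 1.17 or higher, in which case only the immediate requirements of each dependency are kept. The module names and the helpers `fullGraph` and `prunedGraph` are made up for this sketch; the real go command also tracks versions and package imports, which this model ignores.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the members of a set in sorted order.
func sortedKeys(set map[string]bool) []string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// fullGraph returns every module reachable from root: the unpruned graph.
func fullGraph(requires map[string][]string, root string) []string {
	seen := map[string]bool{}
	var visit func(m string)
	visit = func(m string) {
		for _, dep := range requires[m] {
			if !seen[dep] {
				seen[dep] = true
				visit(dep)
			}
		}
	}
	visit(root)
	delete(seen, root)
	return sortedKeys(seen)
}

// prunedGraph keeps root's requirements plus the immediate requirements of
// each of them. Anything deeper is pruned.
// Assumption: all modules declare go 1.17 or higher.
func prunedGraph(requires map[string][]string, root string) []string {
	seen := map[string]bool{}
	for _, dep := range requires[root] {
		seen[dep] = true
		for _, indirect := range requires[dep] {
			seen[indirect] = true
		}
	}
	delete(seen, root)
	return sortedKeys(seen)
}

func main() {
	// Hypothetical modules: "consumer" needs one library, "lib".
	requires := map[string][]string{
		"consumer": {"lib"},
		"lib":      {"a", "b", "c"},
		"a":        {"x", "y"},
		"b":        {"z"},
	}
	fmt.Println("unpruned:", fullGraph(requires, "consumer"))
	fmt.Println("pruned:  ", prunedGraph(requires, "consumer"))
}
```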

How can we consolidate repos? What's the ideal end state?

We want our repo layout to facilitate day-to-day development, while also letting us reuse components and functionality. Code that is commonly changed and built together should be in the same repo (as much as possible), so that it can be tested and released together.

We can leverage some of the new tooling around Go modules to retain the flexibility of separate repos, without having to pay the significant cost.

The ideal repo layout:

  • go-libipfs
    • Roll up most repos that start with github.com/ipfs/go-*
    • Build produces no binaries
    • Contains no Go submodules
    • Includes all supported "official" interfaces and implementations
      • Unsupported and experimental code can live elsewhere; once it "graduates", it is moved into the go-libipfs repo for long-term maintenance
    • High code quality bar
      • Careful consideration of cross-package dependencies
    • Consumes other libs like IPLD, multiformats modules, libp2p, etc.
  • go-libdatastore
    • Datastore interfaces and supported implementations
    • This is its own repo to avoid circular dependencies with libp2p
    • TODO: can libp2p be refactored to remove the circular dependency? Also, Go tolerates circular module dependencies, so why specifically are they bad?
      • (list of reasons added by mvdan)
        • Impossible to require one module without the other, in either direction.
        • Updating both modules becomes a trickier dance: modify A, modify B, update A's dependency on B, update B's dependency on A
        • The module dependency graph becomes a "downward spiral" bouncing between A and B, meaning your dependency graph will grow over time
  • go-ipfs
    • Thin layer that consumes go-libipfs and produces the ipfs binary
    • Could be some other name for the Go IPFS implementation
  • go-ipfs-gateway
    • Experimental gateway implementation that also consumes go-libipfs
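On the circular-dependency question under go-libdatastore: one way to spot such cycles is to scan the output of `go mod graph`, which prints one `module@version requirement@version` edge per line. The sketch below reports pairs of modules that require each other at any version. The helper names and the sample input are hypothetical; the module paths and versions in `main` are invented for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// modulePath strips the "@version" suffix that `go mod graph` appends.
func modulePath(s string) string {
	if i := strings.LastIndex(s, "@"); i >= 0 {
		return s[:i]
	}
	return s
}

// mutualRequirements reads `go mod graph` style output (one "from to" edge
// per line) and returns the pairs of modules that require each other at
// some version. Only direct two-module cycles are reported.
func mutualRequirements(graph string) [][2]string {
	edges := map[[2]string]bool{}
	sc := bufio.NewScanner(strings.NewReader(graph))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue // skip blank or malformed lines
		}
		from, to := modulePath(fields[0]), modulePath(fields[1])
		if from != to {
			edges[[2]string{from, to}] = true
		}
	}

	var pairs [][2]string
	for e := range edges {
		// Report each pair once, with the smaller path first.
		if e[0] < e[1] && edges[[2]string{e[1], e[0]}] {
			pairs = append(pairs, e)
		}
	}
	sort.Slice(pairs, func(i, j int) bool {
		if pairs[i][0] != pairs[j][0] {
			return pairs[i][0] < pairs[j][0]
		}
		return pairs[i][1] < pairs[j][1]
	})
	return pairs
}

func main() {
	// Hypothetical input; module paths and versions are made up.
	const graph = `example.com/app example.com/libp2p@v0.2.0
example.com/libp2p@v0.2.0 example.com/datastore@v0.1.0
example.com/datastore@v0.1.0 example.com/libp2p@v0.1.0
example.com/libp2p@v0.1.0 example.com/datastore@v0.0.9`
	for _, p := range mutualRequirements(graph) {
		fmt.Printf("%s <-> %s\n", p[0], p[1])
	}
}
```

The sample input also hints at the "downward spiral" mentioned above: each version of one module pulls in an older version of the other, so the graph keeps growing as both are released.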

Other consumers of go-libipfs include libp2p (datastore), Filecoin, IPFS Cluster, ipfs-lite, and the IPFS examples.

What about consumers of repos we want to remove/archive? How do we roll this out?

go-libp2p did something similar a couple of years ago, largely avoiding breakage for consumers by turning the existing repos into shims that point to the consolidated one. Example: https://github.com/libp2p/go-libp2p-protocol/blob/master/protocol.go

We can use this same trick to incrementally consolidate without breaking consumers.
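A minimal sketch of the shim idea, using Go type aliases, follows. In practice the shim and the consolidated code would be separate packages in separate repos, and the shim would keep the original identifier names. Here both halves live in one file so the example runs on its own, which is why the legacy names carry a `Legacy` prefix. `Resolver` and its methods are hypothetical and do not correspond to any real IPFS API.

```go
package main

import "fmt"

// ---- Stand-in for the consolidated package (hypothetical) ----

// Resolver is a type that now lives in the consolidated repo.
type Resolver struct{ prefix string }

// NewResolver constructs a Resolver.
func NewResolver(prefix string) *Resolver { return &Resolver{prefix: prefix} }

// Resolve returns the full path for a name.
func (r *Resolver) Resolve(name string) string { return r.prefix + name }

// ---- Stand-in for the old repo, reduced to a shim ----

// LegacyResolver is an alias, not a new type: values of LegacyResolver and
// Resolver are interchangeable, so existing consumers keep compiling.
//
// Deprecated: use Resolver from the consolidated package.
type LegacyResolver = Resolver

// NewLegacyResolver forwards to the consolidated constructor.
//
// Deprecated: use NewResolver from the consolidated package.
var NewLegacyResolver = NewResolver

// describe accepts the consolidated type only.
func describe(r *Resolver) string { return r.Resolve("example") }

func main() {
	// Old call sites keep using the legacy names...
	var old *LegacyResolver = NewLegacyResolver("/ipns/")
	// ...and their values work with code written against the new package,
	// with no conversion needed.
	fmt.Println(describe(old))
}
```

Because an alias introduces no new type, consumers can migrate one import at a time while old and new call sites coexist.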

See e.g. this PoC of moving go-namesys into go-ipfs while preserving backwards compatibility (in reality we'd move it to go-libipfs):

There may be some cases where this isn't possible without breaking changes.
