k0.hswaw.net: pass metallb through Calico

Previously, we had the following setup:

                          .-----------.
                          | .....     |
                        .-----------.-|
                        | dcr01s24  | |
                      .-----------.-| |
                      | dcr01s22  | | |
                  .---|-----------| |-'
    .--------.    |   |---------. | |
    | dcsw01 | <----- | metallb | |-'
    '--------'        |---------' |
                      '-----------'

Ie., each metallb on each node directly talked to dcsw01 over BGP to
announce ExternalIPs to our L3 fabric.

Now, we rejigger the configuration to instead have Calico's BIRD
instances talk BGP to dcsw01, and have metallb talk locally to Calico.

                      .-------------------------.
                      | dcr01s24                |
                      |-------------------------|
    .--------.        |---------.   .---------. |
    | dcsw01 | <----- | Calico  |<--| metallb | |
    '--------'        |---------'   '---------' |
                      '-------------------------'

This makes Calico announce our pod/service networks into our L3 fabric!

Calico and metallb talk to eachother over 127.0.0.1 (they both run with
Host Networking), but that requires one side to flip to pasive mode. We
chose to do that with Calico, by overriding its BIRD config and
special-casing any 127.0.0.1 peer to enable passive mode.

We also override Calico's Other Bird Template (bird_ipam.cfg) to fiddle
with the kernel programming filter (ie. to-kernel-routing-table filter),
where we disable programming unreachable routes. This is because routes
coming from metallb have their next-hop set to 127.0.0.1, which makes
bird mark them as unreachable. Unreachable routes in the kernel will
break local access to ExternalIPs, eg. register access from containerd.

All routes pass through without route reflectors and a full mesh as we
use eBGP over private ASNs in our fabric.

We also have to make Calico aware of metallb pools - otherwise, routes
announced by metallb end up being filtered by Calico.

This is all mildly hacky. Here's hoping that Calico will be able to some
day gain metallb-like functionality, ie. IPAM for
externalIPs/LoadBalancers/...

There seems to be however one problem with this change (but I'm not
fixing it yet as it's not critical): metallb would previously only
announce IPs from nodes that were serving that service. Now, however,
the Calico internal mesh makes those appear from every node. This can
probably be fixed by disabling local meshing, enabling route reflection
on dcsw01 (to recreate the mesh routing through dcsw01). Or, maybe by
some more hacking of the Calico BIRD config :/.

Change-Id: I3df1f6ae7fa1911dd53956ced3b073581ef0e836
5 files changed
tree: 5295d92b4c50abf4d929b87fd92c53fe258d1dcc
  1. app/
  2. bgpwtf/
  3. bzl/
  4. cluster/
  5. dc/
  6. devtools/
  7. doc/
  8. games/
  9. gcp/
  10. go/
  11. hswaw/
  12. kube/
  13. ops/
  14. personal/
  15. third_party/
  16. tools/
  17. .bazelrc
  18. .gitignore
  19. BUILD
  20. COPYING
  21. env.fish
  22. env.sh
  23. hackdoc.toml
  24. OWNERS
  25. README.md
  26. WORKSPACE
README.md

hscloud is the main monorepo of the Warsaw Hackerspace infrastructure code.

Any time you see a //path/like/this, it refers to the root of hscloud, ie. the path path/like/this in this repository. Perforce and/or Bazel users should feel right at home.

Viewing this documentation

For a pleaseant web viewing experience, see this documentation in hackdoc. This will allow you to read this markdown file (and others) in a pretty, linkable view.

Getting started

See //doc/codelabs for tutorials on how to use hscloud.

If you want to browse the source of hscloud in a web browser, use cs.hackerspace.pl.

If you want some other help, talk to q3k, informatic or your therapist.

Directory Structure

Directories you should care about:

  • app: external services that we host that are somewhat universal: matrix, covid-formity, etc.
  • bgpwtf: code related to our little ISP
  • cluster: code related to our Kubernetes cluster (k0.hswaw.net)
  • dc: code related to datacenter automation
  • devtools: code related to developer tooling, like gerrit or hackdoc
  • doc: high-level documentation that doesn't fit anywhere else, ie. codelabs
  • hswaw: Warsaw Hackerspace specific/internal services. The line between this and app is unfortunately blurry.
  • personal: user's personal (experimental) directories
  • kube, go: code specific to languages but general to the whole of hscloud

Licensing

Unless noted otherwise, code in hscloud is licensed under the BSD 0-clause license - see COPYING.