cluster/kube: split up cluster.jsonnet
It was getting large and unwieldy (to the point where kubecfg was slow).
In this change, we:
- move the Cluster function to cluster.libsonnet
- move the Cluster instantiation into k0.libsonnet
- shuffle some fields around to make sure things are well split between
k0-specific and general cluster configs.
- add 'view' files that build on 'k0.libsonnet' to allow rendering
  either the entire k0 state or some subsets (for speed)
- update the documentation, plus some drive-by fixes and reindentation
Change-Id: I4b8d920b600df79100295267efe21b8c82699d5b
diff --git a/cluster/doc/admin.md b/cluster/doc/admin.md
index 27b30ca..1dfb50a 100644
--- a/cluster/doc/admin.md
+++ b/cluster/doc/admin.md
@@ -8,7 +8,9 @@
Persistent Storage (waw2)
-------------------------
-HDDs on bc01n0{1-3}. 3TB total capacity. Don't use this as this pool should go away soon (the disks are slow, the network is slow and the RAID controllers lie). Use ceph-waw3 instead.
+HDDs on bc01n0{1-3}. 3TB total capacity. Don't use this as this pool should go
+away soon (the disks are slow, the network is slow and the RAID controllers
+lie). Use ceph-waw3 instead.
The following storage classes use this cluster:
@@ -17,9 +19,12 @@
- `waw-hdd-yolo-1` - unreplicated (you _will_ lose your data)
- `waw-hdd-redundant-1-object` - erasure coded 2.1 object store
-Rados Gateway (S3) is available at https://object.ceph-waw2.hswaw.net/. To create a user, ask an admin.
+Rados Gateway (S3) is available at https://object.ceph-waw2.hswaw.net/. To
+create a user, ask an admin.
-PersistentVolumes currently bound to PersistentVolumeClaims get automatically backed up (hourly for the next 48 hours, then once every 4 weeks, then once every month for a year).
+PersistentVolumes currently bound to PersistentVolumeClaims get automatically
+backed up (hourly for the next 48 hours, then once every 4 weeks, then once
+every month for a year).
Persistent Storage (waw3)
-------------------------
@@ -32,9 +37,12 @@
- `waw-hdd-redundant-3` - 2 replicas
- `waw-hdd-redundant-3-object` - 2 replicas, object store
-Rados Gateway (S3) is available at https://object.ceph-waw3.hswaw.net/. To create a user, ask an admin.
+Rados Gateway (S3) is available at https://object.ceph-waw3.hswaw.net/. To
+create a user, ask an admin.
-PersistentVolumes currently bound to PVCs get automatically backed up (hourly for the next 48 hours, then once every 4 weeks, then once every month for a year).
+PersistentVolumes currently bound to PVCs get automatically backed up (hourly
+for the next 48 hours, then once every 4 weeks, then once every month for a
+year).
Administration
==============
@@ -42,25 +50,55 @@
Provisioning nodes
------------------
- - bring up a new node with nixos, the configuration doesn't matter and will be nuked anyway
+ - bring up a new node with NixOS; the configuration doesn't matter and will be
+   nuked anyway
- edit cluster/nix/defs-machines.nix
- `bazel run //cluster/clustercfg nodestrap bc01nXX.hswaw.net`
+Applying kubecfg state
+----------------------
+
+First, decrypt/sync all secrets:
+
+ secretstore sync cluster/secrets/
+
+Then, run kubecfg. There are multiple top-level 'view' files that you can run,
+all located in `//cluster/kube`. All of them use `k0.libsonnet` as the master
+state of the Kubernetes configuration, but expose only subsets of it to work
+around the fact that kubecfg gets slow when given a lot of resources.
+
+ - `k0.jsonnet`: everything that is defined for k0 in `//cluster/kube/...`.
+ - `k0-core.jsonnet`: definitions that are in common across all clusters
+   (networking, DNS, etc.), without Rook.
+ - `k0-registry.jsonnet`: just the docker registry on k0 (useful when changing
+ ACLs).
+ - `k0-ceph.jsonnet`: everything ceph/rook related on k0.
+
+When in doubt, run `k0.jsonnet`. There's no harm in doing so; it might just be
+slow. Running an individual view file without realizing that your change also
+affects something rendered by another file can lead to inconsistencies in
+production.
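+
+A typical review-and-apply run against the full k0 state might look like this
+(a sketch only: these are upstream kubecfg subcommands, and the exact flags or
+paths used in this repository may differ):
+
+    kubecfg diff cluster/kube/k0.jsonnet
+    kubecfg update cluster/kube/k0.jsonnet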
+
+Feel free to add more view files for typical administrative tasks.
+
Ceph - Debugging
-----------------
-We run Ceph via Rook. The Rook operator is running in the `ceph-rook-system` namespace. To debug Ceph issues, start by looking at its logs.
+We run Ceph via Rook. The Rook operator is running in the `ceph-rook-system`
+namespace. To debug Ceph issues, start by looking at its logs.
-A dashboard is available at https://ceph-waw2.hswaw.net/ and https://ceph-waw3.hswaw.net, to get the admin password run:
+A dashboard is available at https://ceph-waw2.hswaw.net/ and
+https://ceph-waw3.hswaw.net/. To get the admin password, run:
kubectl -n ceph-waw2 get secret rook-ceph-dashboard-password -o yaml | grep "password:" | awk '{print $2}' | base64 --decode ; echo
- kubectl -n ceph-waw2 get secret rook-ceph-dashboard-password -o yaml | grep "password:" | awk '{print $2}' | base64 --decode ; echo
+ kubectl -n ceph-waw3 get secret rook-ceph-dashboard-password -o yaml | grep "password:" | awk '{print $2}' | base64 --decode ; echo
Ceph - Backups
--------------
-Kubernetes PVs backed in Ceph RBDs get backed up using Benji. An hourly cronjob runs in every Ceph cluster. You can also manually trigger a run by doing:
+Kubernetes PVs backed by Ceph RBDs get backed up using Benji. An hourly cronjob
+runs in every Ceph cluster. You can also manually trigger a run by doing:
kubectl -n ceph-waw2 create job --from=cronjob/ceph-waw2-benji ceph-waw2-benji-manual-$(date +%s)
kubectl -n ceph-waw3 create job --from=cronjob/ceph-waw3-benji ceph-waw3-benji-manual-$(date +%s)
@@ -70,10 +108,12 @@
Ceph - Object Storage
---------------------
-To create an object store user consult rook.io manual (https://rook.io/docs/rook/v0.9/ceph-object-store-user-crd.html)
-User authentication secret is generated in ceph cluster namespace (`ceph-waw2`),
-thus may need to be manually copied into application namespace. (see `app/registry/prod.jsonnet` comment)
+To create an object store user, consult the rook.io manual
+(https://rook.io/docs/rook/v0.9/ceph-object-store-user-crd.html).
+The user authentication secret is generated in the Ceph cluster namespace
+(`ceph-waw{2,3}`) and thus may need to be manually copied into the application
+namespace (see the comment in `app/registry/prod.jsonnet`).
-`tools/rook-s3cmd-config` can be used to generate test configuration file for s3cmd.
-Remember to append `:default-placement` to your region name (ie. `waw-hdd-redundant-1-object:default-placement`)
-
+`tools/rook-s3cmd-config` can be used to generate a test configuration file for
+s3cmd. Remember to append `:default-placement` to your region name (e.g.
+`waw-hdd-redundant-3-object:default-placement`).
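+
+For instance, assuming the generated configuration was saved to `waw3.s3cfg`
+(a file name chosen here purely for illustration), buckets can then be listed
+with:
+
+    s3cmd -c waw3.s3cfg ls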
diff --git a/cluster/kube/cluster.jsonnet b/cluster/kube/cluster.jsonnet
deleted file mode 100644
index 02099cd..0000000
--- a/cluster/kube/cluster.jsonnet
+++ /dev/null
@@ -1,519 +0,0 @@
-# Top level cluster configuration.
-
-local kube = import "../../kube/kube.libsonnet";
-local policies = import "../../kube/policies.libsonnet";
-
-local calico = import "lib/calico.libsonnet";
-local certmanager = import "lib/cert-manager.libsonnet";
-local cockroachdb = import "lib/cockroachdb.libsonnet";
-local coredns = import "lib/coredns.libsonnet";
-local metallb = import "lib/metallb.libsonnet";
-local metrics = import "lib/metrics.libsonnet";
-local nginx = import "lib/nginx.libsonnet";
-local prodvider = import "lib/prodvider.libsonnet";
-local registry = import "lib/registry.libsonnet";
-local rook = import "lib/rook.libsonnet";
-local pki = import "lib/pki.libsonnet";
-
-local Cluster(short, realm) = {
- local cluster = self,
- local cfg = cluster.cfg,
-
- short:: short,
- realm:: realm,
- fqdn:: "%s.%s" % [cluster.short, cluster.realm],
-
- cfg:: {
- // Storage class used for internal services (like registry). This must
- // be set to a valid storage class. This can either be a cloud provider class
- // (when running on GKE &co) or a storage class created using rook.
- storageClassNameRedundant: error "storageClassNameRedundant must be set",
- },
-
- // These are required to let the API Server contact kubelets.
- crAPIServerToKubelet: kube.ClusterRole("system:kube-apiserver-to-kubelet") {
- metadata+: {
- annotations+: {
- "rbac.authorization.kubernetes.io/autoupdate": "true",
- },
- labels+: {
- "kubernetes.io/bootstrapping": "rbac-defaults",
- },
- },
- rules: [
- {
- apiGroups: [""],
- resources: ["nodes/%s" % r for r in [ "proxy", "stats", "log", "spec", "metrics" ]],
- verbs: ["*"],
- },
- ],
- },
- crbAPIServer: kube.ClusterRoleBinding("system:kube-apiserver") {
- roleRef: {
- apiGroup: "rbac.authorization.k8s.io",
- kind: "ClusterRole",
- name: cluster.crAPIServerToKubelet.metadata.name,
- },
- subjects: [
- {
- apiGroup: "rbac.authorization.k8s.io",
- kind: "User",
- # A cluster API Server authenticates with a certificate whose CN is == to the FQDN of the cluster.
- name: cluster.fqdn,
- },
- ],
- },
-
- // This ClusteRole is bound to all humans that log in via prodaccess/prodvider/SSO.
- // It should allow viewing of non-sensitive data for debugability and openness.
- crViewer: kube.ClusterRole("system:viewer") {
- rules: [
- {
- apiGroups: [""],
- resources: [
- "nodes",
- "namespaces",
- "pods",
- "configmaps",
- "services",
- ],
- verbs: ["list"],
- },
- {
- apiGroups: ["metrics.k8s.io"],
- resources: [
- "nodes",
- "pods",
- ],
- verbs: ["list"],
- },
- {
- apiGroups: ["apps"],
- resources: [
- "statefulsets",
- ],
- verbs: ["list"],
- },
- {
- apiGroups: ["extensions"],
- resources: [
- "deployments",
- "ingresses",
- ],
- verbs: ["list"],
- }
- ],
- },
- // This ClusterRole is applied (scoped to personal namespace) to all humans.
- crFullInNamespace: kube.ClusterRole("system:admin-namespace") {
- rules: [
- {
- apiGroups: ["", "extensions", "apps"],
- resources: ["*"],
- verbs: ["*"],
- },
- {
- apiGroups: ["batch"],
- resources: ["jobs", "cronjobs"],
- verbs: ["*"],
- },
- ],
- },
- // This ClusterRoleBindings allows root access to cluster admins.
- crbAdmins: kube.ClusterRoleBinding("system:admins") {
- roleRef: {
- apiGroup: "rbac.authorization.k8s.io",
- kind: "ClusterRole",
- name: "cluster-admin",
- },
- subjects: [
- {
- apiGroup: "rbac.authorization.k8s.io",
- kind: "User",
- name: user + "@hackerspace.pl",
- } for user in [
- "q3k",
- "implr",
- "informatic",
- ]
- ],
- },
-
- podSecurityPolicies: policies.Cluster {},
-
- allowInsecureNamespaces: [
- policies.AllowNamespaceInsecure("kube-system"),
- policies.AllowNamespaceInsecure("metallb-system"),
- # TODO(q3k): fix this?
- policies.AllowNamespaceInsecure("ceph-waw2"),
- policies.AllowNamespaceInsecure("ceph-waw3"),
- policies.AllowNamespaceInsecure("matrix"),
- policies.AllowNamespaceInsecure("registry"),
- policies.AllowNamespaceInsecure("internet"),
- # TODO(implr): restricted policy with CAP_NET_ADMIN and tuntap, but no full root
- policies.AllowNamespaceInsecure("implr-vpn"),
- ],
-
- // Allow all service accounts (thus all controllers) to create secure pods.
- crbAllowServiceAccountsSecure: kube.ClusterRoleBinding("policy:allow-all-secure") {
- roleRef_: cluster.podSecurityPolicies.secureRole,
- subjects: [
- {
- kind: "Group",
- apiGroup: "rbac.authorization.k8s.io",
- name: "system:serviceaccounts",
- }
- ],
- },
-
- // Calico network fabric
- calico: calico.Environment {},
- // CoreDNS for this cluster.
- dns: coredns.Environment {
- cfg+: {
- cluster_domains: [
- "cluster.local",
- cluster.fqdn,
- ],
- },
- },
- // Metrics Server
- metrics: metrics.Environment {},
- // Metal Load Balancer
- metallb: metallb.Environment {
- cfg+: {
- peers: [
- {
- "peer-address": "185.236.240.33",
- "peer-asn": 65001,
- "my-asn": 65002,
- },
- ],
- addressPools: [
- {
- name: "public-v4-1",
- protocol: "bgp",
- addresses: [
- "185.236.240.48/28",
- ],
- },
- {
- name: "public-v4-2",
- protocol: "bgp",
- addresses: [
- "185.236.240.112/28"
- ],
- },
- ],
- },
- },
- // Main nginx Ingress Controller
- nginx: nginx.Environment {},
- certmanager: certmanager.Environment {},
- issuer: kube.ClusterIssuer("letsencrypt-prod") {
- spec: {
- acme: {
- server: "https://acme-v02.api.letsencrypt.org/directory",
- email: "bofh@hackerspace.pl",
- privateKeySecretRef: {
- name: "letsencrypt-prod"
- },
- http01: {},
- },
- },
- },
-
- // Rook Ceph storage
- rook: rook.Operator {
- operator+: {
- spec+: {
- // TODO(q3k): Bring up the operator again when stability gets fixed
- // See: https://github.com/rook/rook/issues/3059#issuecomment-492378873
- replicas: 1,
- },
- },
- },
-
- // Docker registry
- registry: registry.Environment {
- cfg+: {
- domain: "registry.%s" % [cluster.fqdn],
- storageClassName: cfg.storageClassNameParanoid,
- objectStorageName: "waw-hdd-redundant-2-object",
- },
- },
-
- // TLS PKI machinery
- pki: pki.Environment(cluster.short, cluster.realm),
-
- // Prodvider
- prodvider: prodvider.Environment {
- cfg+: {
- apiEndpoint: "kubernetes.default.svc.%s" % [cluster.fqdn],
- },
- },
-};
-
-
-{
- k0: {
- local k0 = self,
- cluster: Cluster("k0", "hswaw.net") {
- cfg+: {
- storageClassNameParanoid: k0.ceph.waw2Pools.blockParanoid.name,
- },
- },
- cockroach: {
- waw2: cockroachdb.Cluster("crdb-waw1") {
- cfg+: {
- topology: [
- { name: "bc01n01", node: "bc01n01.hswaw.net" },
- { name: "bc01n02", node: "bc01n02.hswaw.net" },
- { name: "bc01n03", node: "bc01n03.hswaw.net" },
- ],
- hostPath: "/var/db/crdb-waw1",
- },
- },
- clients: {
- cccampix: k0.cockroach.waw2.Client("cccampix"),
- cccampixDev: k0.cockroach.waw2.Client("cccampix-dev"),
- buglessDev: k0.cockroach.waw2.Client("bugless-dev"),
- sso: k0.cockroach.waw2.Client("sso"),
- },
- },
- ceph: {
- // waw1 cluster - dead as of 2019/08/06, data corruption
- // waw2 cluster
- waw2: rook.Cluster(k0.cluster.rook, "ceph-waw2") {
- spec: {
- mon: {
- count: 3,
- allowMultiplePerNode: false,
- },
- storage: {
- useAllNodes: false,
- useAllDevices: false,
- config: {
- databaseSizeMB: "1024",
- journalSizeMB: "1024",
- },
- nodes: [
- {
- name: "bc01n01.hswaw.net",
- location: "rack=dcr01 chassis=bc01 host=bc01n01",
- devices: [ { name: "sda" } ],
- },
- {
- name: "bc01n02.hswaw.net",
- location: "rack=dcr01 chassis=bc01 host=bc01n02",
- devices: [ { name: "sda" } ],
- },
- {
- name: "bc01n03.hswaw.net",
- location: "rack=dcr01 chassis=bc01 host=bc01n03",
- devices: [ { name: "sda" } ],
- },
- ],
- },
- benji:: {
- metadataStorageClass: "waw-hdd-paranoid-2",
- encryptionPassword: std.split((importstr "../secrets/plain/k0-benji-encryption-password"), '\n')[0],
- pools: [
- "waw-hdd-redundant-2",
- "waw-hdd-redundant-2-metadata",
- "waw-hdd-paranoid-2",
- "waw-hdd-yolo-2",
- ],
- s3Configuration: {
- awsAccessKeyId: "RPYZIROFXNLQVU2WJ4R3",
- awsSecretAccessKey: std.split((importstr "../secrets/plain/k0-benji-secret-access-key"), '\n')[0],
- bucketName: "benji-k0-backups",
- endpointUrl: "https://s3.eu-central-1.wasabisys.com/",
- },
- }
- },
- },
- waw2Pools: {
- // redundant block storage
- blockRedundant: rook.ECBlockPool(k0.ceph.waw2, "waw-hdd-redundant-2") {
- spec: {
- failureDomain: "host",
- erasureCoded: {
- dataChunks: 2,
- codingChunks: 1,
- },
- },
- },
- // paranoid block storage (3 replicas)
- blockParanoid: rook.ReplicatedBlockPool(k0.ceph.waw2, "waw-hdd-paranoid-2") {
- spec: {
- failureDomain: "host",
- replicated: {
- size: 3,
- },
- },
- },
- // yolo block storage (no replicas!)
- blockYolo: rook.ReplicatedBlockPool(k0.ceph.waw2, "waw-hdd-yolo-2") {
- spec: {
- failureDomain: "host",
- replicated: {
- size: 1,
- },
- },
- },
- objectRedundant: rook.S3ObjectStore(k0.ceph.waw2, "waw-hdd-redundant-2-object") {
- spec: {
- metadataPool: {
- failureDomain: "host",
- replicated: { size: 3 },
- },
- dataPool: {
- failureDomain: "host",
- erasureCoded: {
- dataChunks: 2,
- codingChunks: 1,
- },
- },
- },
- },
- },
- waw3: rook.Cluster(k0.cluster.rook, "ceph-waw3") {
- spec: {
- mon: {
- count: 3,
- allowMultiplePerNode: false,
- },
- storage: {
- useAllNodes: false,
- useAllDevices: false,
- config: {
- databaseSizeMB: "1024",
- journalSizeMB: "1024",
- },
- nodes: [
- {
- name: "dcr01s22.hswaw.net",
- location: "rack=dcr01 host=dcr01s22",
- devices: [
- // https://github.com/rook/rook/issues/1228
- //{ name: "disk/by-id/wwan-0x" + wwan }
- //for wwan in [
- // "5000c5008508c433",
- // "5000c500850989cf",
- // "5000c5008508f843",
- // "5000c5008508baf7",
- //]
- { name: "sdn" },
- { name: "sda" },
- { name: "sdb" },
- { name: "sdc" },
- ],
- },
- {
- name: "dcr01s24.hswaw.net",
- location: "rack=dcr01 host=dcr01s22",
- devices: [
- // https://github.com/rook/rook/issues/1228
- //{ name: "disk/by-id/wwan-0x" + wwan }
- //for wwan in [
- // "5000c5008508ee03",
- // "5000c5008508c9ef",
- // "5000c5008508df33",
- // "5000c5008508dd3b",
- //]
- { name: "sdm" },
- { name: "sda" },
- { name: "sdb" },
- { name: "sdc" },
- ],
- },
- ],
- },
- benji:: {
- metadataStorageClass: "waw-hdd-redundant-3",
- encryptionPassword: std.split((importstr "../secrets/plain/k0-benji-encryption-password"), '\n')[0],
- pools: [
- "waw-hdd-redundant-3",
- "waw-hdd-redundant-3-metadata",
- "waw-hdd-yolo-3",
- ],
- s3Configuration: {
- awsAccessKeyId: "RPYZIROFXNLQVU2WJ4R3",
- awsSecretAccessKey: std.split((importstr "../secrets/plain/k0-benji-secret-access-key"), '\n')[0],
- bucketName: "benji-k0-backups-waw3",
- endpointUrl: "https://s3.eu-central-1.wasabisys.com/",
- },
- }
- },
- },
- waw3Pools: {
- // redundant block storage
- blockRedundant: rook.ECBlockPool(k0.ceph.waw3, "waw-hdd-redundant-3") {
- metadataReplicas: 2,
- spec: {
- failureDomain: "host",
- replicated: {
- size: 2,
- },
- },
- },
- // yolo block storage (low usage, no host redundancy)
- blockYolo: rook.ReplicatedBlockPool(k0.ceph.waw3, "waw-hdd-yolo-3") {
- spec: {
- failureDomain: "osd",
- erasureCoded: {
- dataChunks: 12,
- codingChunks: 4,
- },
- },
- },
- objectRedundant: rook.S3ObjectStore(k0.ceph.waw3, "waw-hdd-redundant-3-object") {
- spec: {
- metadataPool: {
- failureDomain: "host",
- replicated: { size: 2 },
- },
- dataPool: {
- failureDomain: "host",
- replicated: { size: 2 },
- },
- },
- },
- },
- },
-
- # Used for owncloud.hackerspace.pl, which for now lices on boston-packets.hackerspace.pl.
- nextcloudWaw3: kube.CephObjectStoreUser("nextcloud") {
- metadata+: {
- namespace: "ceph-waw3",
- },
- spec: {
- store: "waw-hdd-redundant-3-object",
- displayName: "nextcloud",
- },
- },
-
- # nuke@hackerspace.pl's personal storage.
- nukePersonalWaw3: kube.CephObjectStoreUser("nuke-personal") {
- metadata+: {
- namespace: "ceph-waw3",
- },
- spec: {
- store: "waw-hdd-redundant-3-object",
- displayName: "nuke-personal",
- },
- },
-
- # patryk@hackerspace.pl's ArmA3 mod bucket.
- cz2ArmaModsWaw3: kube.CephObjectStoreUser("cz2-arma3mods") {
- metadata+: {
- namespace: "ceph-waw3",
- },
- spec: {
- store: "waw-hdd-redundant-3-object",
- displayName: "cz2-arma3mods",
- },
- },
- },
-}
diff --git a/cluster/kube/cluster.libsonnet b/cluster/kube/cluster.libsonnet
new file mode 100644
index 0000000..c42ee8a
--- /dev/null
+++ b/cluster/kube/cluster.libsonnet
@@ -0,0 +1,221 @@
+# Common cluster configuration.
+# This defines what Kubernetes resources are required to turn a bare k8s
+# deployment into a fully working cluster.
+# These assume that you're running on bare metal, and using the corresponding
+# NixOS deployment that we do.
+
+local kube = import "../../kube/kube.libsonnet";
+local policies = import "../../kube/policies.libsonnet";
+
+local calico = import "lib/calico.libsonnet";
+local certmanager = import "lib/cert-manager.libsonnet";
+local coredns = import "lib/coredns.libsonnet";
+local metallb = import "lib/metallb.libsonnet";
+local metrics = import "lib/metrics.libsonnet";
+local nginx = import "lib/nginx.libsonnet";
+local prodvider = import "lib/prodvider.libsonnet";
+local rook = import "lib/rook.libsonnet";
+local pki = import "lib/pki.libsonnet";
+
+{
+ Cluster(short, realm):: {
+ local cluster = self,
+ local cfg = cluster.cfg,
+
+ short:: short,
+ realm:: realm,
+ fqdn:: "%s.%s" % [cluster.short, cluster.realm],
+
+ cfg:: {
+ // Storage class used for internal services (like registry). This must
+ // be set to a valid storage class. This can either be a cloud provider class
+ // (when running on GKE &co) or a storage class created using rook.
+ storageClassNameRedundant: error "storageClassNameRedundant must be set",
+ },
+
+ // These are required to let the API Server contact kubelets.
+ crAPIServerToKubelet: kube.ClusterRole("system:kube-apiserver-to-kubelet") {
+ metadata+: {
+ annotations+: {
+ "rbac.authorization.kubernetes.io/autoupdate": "true",
+ },
+ labels+: {
+ "kubernetes.io/bootstrapping": "rbac-defaults",
+ },
+ },
+ rules: [
+ {
+ apiGroups: [""],
+ resources: ["nodes/%s" % r for r in [ "proxy", "stats", "log", "spec", "metrics" ]],
+ verbs: ["*"],
+ },
+ ],
+ },
+ crbAPIServer: kube.ClusterRoleBinding("system:kube-apiserver") {
+ roleRef: {
+ apiGroup: "rbac.authorization.k8s.io",
+ kind: "ClusterRole",
+ name: cluster.crAPIServerToKubelet.metadata.name,
+ },
+ subjects: [
+ {
+ apiGroup: "rbac.authorization.k8s.io",
+ kind: "User",
+ # A cluster API Server authenticates with a certificate whose CN is == to the FQDN of the cluster.
+ name: cluster.fqdn,
+ },
+ ],
+ },
+
+ // This ClusterRole is bound to all humans that log in via prodaccess/prodvider/SSO.
+ // It should allow viewing of non-sensitive data for debuggability and openness.
+ crViewer: kube.ClusterRole("system:viewer") {
+ rules: [
+ {
+ apiGroups: [""],
+ resources: [
+ "nodes",
+ "namespaces",
+ "pods",
+ "configmaps",
+ "services",
+ ],
+ verbs: ["list"],
+ },
+ {
+ apiGroups: ["metrics.k8s.io"],
+ resources: [
+ "nodes",
+ "pods",
+ ],
+ verbs: ["list"],
+ },
+ {
+ apiGroups: ["apps"],
+ resources: [
+ "statefulsets",
+ ],
+ verbs: ["list"],
+ },
+ {
+ apiGroups: ["extensions"],
+ resources: [
+ "deployments",
+ "ingresses",
+ ],
+ verbs: ["list"],
+ }
+ ],
+ },
+ // This ClusterRole is applied (scoped to personal namespace) to all humans.
+ crFullInNamespace: kube.ClusterRole("system:admin-namespace") {
+ rules: [
+ {
+ apiGroups: ["", "extensions", "apps"],
+ resources: ["*"],
+ verbs: ["*"],
+ },
+ {
+ apiGroups: ["batch"],
+ resources: ["jobs", "cronjobs"],
+ verbs: ["*"],
+ },
+ ],
+ },
+ // This ClusterRoleBinding allows root access to cluster admins.
+ crbAdmins: kube.ClusterRoleBinding("system:admins") {
+ roleRef: {
+ apiGroup: "rbac.authorization.k8s.io",
+ kind: "ClusterRole",
+ name: "cluster-admin",
+ },
+ subjects: [
+ {
+ apiGroup: "rbac.authorization.k8s.io",
+ kind: "User",
+ name: user + "@hackerspace.pl",
+ } for user in [
+ "q3k",
+ "implr",
+ "informatic",
+ ]
+ ],
+ },
+
+ podSecurityPolicies: policies.Cluster {},
+
+ allowInsecureNamespaces: [
+ policies.AllowNamespaceInsecure("kube-system"),
+ policies.AllowNamespaceInsecure("metallb-system"),
+ ],
+
+ // Allow all service accounts (thus all controllers) to create secure pods.
+ crbAllowServiceAccountsSecure: kube.ClusterRoleBinding("policy:allow-all-secure") {
+ roleRef_: cluster.podSecurityPolicies.secureRole,
+ subjects: [
+ {
+ kind: "Group",
+ apiGroup: "rbac.authorization.k8s.io",
+ name: "system:serviceaccounts",
+ }
+ ],
+ },
+
+ // Calico network fabric
+ calico: calico.Environment {},
+
+ // CoreDNS for this cluster.
+ dns: coredns.Environment {
+ cfg+: {
+ cluster_domains: [
+ "cluster.local",
+ cluster.fqdn,
+ ],
+ },
+ },
+
+ // Metrics Server
+ metrics: metrics.Environment {},
+
+ // Metal Load Balancer
+ metallb: metallb.Environment {},
+
+ // Main nginx Ingress Controller
+ nginx: nginx.Environment {},
+
+ // Cert-manager (Let's Encrypt, CA, ...)
+ certmanager: certmanager.Environment {},
+
+ issuer: kube.ClusterIssuer("letsencrypt-prod") {
+ spec: {
+ acme: {
+ server: "https://acme-v02.api.letsencrypt.org/directory",
+ email: "bofh@hackerspace.pl",
+ privateKeySecretRef: {
+ name: "letsencrypt-prod"
+ },
+ http01: {},
+ },
+ },
+ },
+
+ // Rook Ceph storage operator.
+ rook: rook.Operator {
+ operator+: {
+ spec+: {
+ replicas: 1,
+ },
+ },
+ },
+
+ // TLS PKI machinery (compatibility with mirko)
+ pki: pki.Environment(cluster.short, cluster.realm),
+
+ // Prodvider
+ prodvider: prodvider.Environment {
+ cfg+: {
+ apiEndpoint: "kubernetes.default.svc.%s" % [cluster.fqdn],
+ },
+ },
+ },
+}
diff --git a/cluster/kube/k0-ceph.jsonnet b/cluster/kube/k0-ceph.jsonnet
new file mode 100644
index 0000000..bc025d4
--- /dev/null
+++ b/cluster/kube/k0-ceph.jsonnet
@@ -0,0 +1,8 @@
+// Ceph operator (rook), pools, users.
+
+local k0 = (import "k0.libsonnet").k0;
+
+{
+ rook: k0.cluster.rook,
+ ceph: k0.ceph,
+}
diff --git a/cluster/kube/k0-core.jsonnet b/cluster/kube/k0-core.jsonnet
new file mode 100644
index 0000000..06c282e
--- /dev/null
+++ b/cluster/kube/k0-core.jsonnet
@@ -0,0 +1,6 @@
+// Only the 'core' cluster resources - i.e., resources not specific to k0 in particular.
+// Without Rook, to speed things up.
+
+(import "k0.libsonnet").k0.cluster {
+ rook+:: {},
+}
diff --git a/cluster/kube/k0-registry.jsonnet b/cluster/kube/k0-registry.jsonnet
new file mode 100644
index 0000000..a2a6061
--- /dev/null
+++ b/cluster/kube/k0-registry.jsonnet
@@ -0,0 +1,3 @@
+// Only the registry running in k0.
+
+(import "k0.libsonnet").k0.registry
diff --git a/cluster/kube/k0.jsonnet b/cluster/kube/k0.jsonnet
new file mode 100644
index 0000000..9658830
--- /dev/null
+++ b/cluster/kube/k0.jsonnet
@@ -0,0 +1,3 @@
+// Everything in the k0 cluster definition.
+
+(import "k0.libsonnet").k0
diff --git a/cluster/kube/k0.libsonnet b/cluster/kube/k0.libsonnet
new file mode 100644
index 0000000..d4c7256
--- /dev/null
+++ b/cluster/kube/k0.libsonnet
@@ -0,0 +1,338 @@
+// k0.hswaw.net kubernetes cluster
+// This defines the cluster as a single object.
+// Use the sibling k0*.jsonnet 'view' files to actually apply the configuration.
+
+local kube = import "../../kube/kube.libsonnet";
+local policies = import "../../kube/policies.libsonnet";
+
+local cluster = import "cluster.libsonnet";
+
+local cockroachdb = import "lib/cockroachdb.libsonnet";
+local registry = import "lib/registry.libsonnet";
+local rook = import "lib/rook.libsonnet";
+
+{
+ k0: {
+ local k0 = self,
+ cluster: cluster.Cluster("k0", "hswaw.net") {
+ cfg+: {
+ storageClassNameParanoid: k0.ceph.waw2Pools.blockParanoid.name,
+ },
+ metallb+: {
+ cfg+: {
+ peers: [
+ {
+ "peer-address": "185.236.240.33",
+ "peer-asn": 65001,
+ "my-asn": 65002,
+ },
+ ],
+ addressPools: [
+ {
+ name: "public-v4-1",
+ protocol: "bgp",
+ addresses: [
+ "185.236.240.48/28",
+ ],
+ },
+ {
+ name: "public-v4-2",
+ protocol: "bgp",
+ addresses: [
+ "185.236.240.112/28"
+ ],
+ },
+ ],
+ },
+ },
+ },
+
+ // Docker registry
+ registry: registry.Environment {
+ cfg+: {
+ domain: "registry.%s" % [k0.cluster.fqdn],
+ storageClassName: k0.cluster.cfg.storageClassNameParanoid,
+ objectStorageName: "waw-hdd-redundant-2-object",
+ },
+ },
+
+ // CockroachDB, running on bc01n{01,02,03}.
+ cockroach: {
+ waw2: cockroachdb.Cluster("crdb-waw1") {
+ cfg+: {
+ topology: [
+ { name: "bc01n01", node: "bc01n01.hswaw.net" },
+ { name: "bc01n02", node: "bc01n02.hswaw.net" },
+ { name: "bc01n03", node: "bc01n03.hswaw.net" },
+ ],
+ // Host path on SSD.
+ hostPath: "/var/db/crdb-waw1",
+ },
+ },
+ clients: {
+ cccampix: k0.cockroach.waw2.Client("cccampix"),
+ cccampixDev: k0.cockroach.waw2.Client("cccampix-dev"),
+ buglessDev: k0.cockroach.waw2.Client("bugless-dev"),
+ sso: k0.cockroach.waw2.Client("sso"),
+ },
+ },
+
+ ceph: {
+ // waw1 cluster - dead as of 2019/08/06, data corruption
+ // waw2 cluster: shitty 7200RPM 2.5" HDDs
+ waw2: rook.Cluster(k0.cluster.rook, "ceph-waw2") {
+ spec: {
+ mon: {
+ count: 3,
+ allowMultiplePerNode: false,
+ },
+ storage: {
+ useAllNodes: false,
+ useAllDevices: false,
+ config: {
+ databaseSizeMB: "1024",
+ journalSizeMB: "1024",
+ },
+ nodes: [
+ {
+ name: "bc01n01.hswaw.net",
+ location: "rack=dcr01 chassis=bc01 host=bc01n01",
+ devices: [ { name: "sda" } ],
+ },
+ {
+ name: "bc01n02.hswaw.net",
+ location: "rack=dcr01 chassis=bc01 host=bc01n02",
+ devices: [ { name: "sda" } ],
+ },
+ {
+ name: "bc01n03.hswaw.net",
+ location: "rack=dcr01 chassis=bc01 host=bc01n03",
+ devices: [ { name: "sda" } ],
+ },
+ ],
+ },
+ benji:: {
+ metadataStorageClass: "waw-hdd-paranoid-2",
+ encryptionPassword: std.split((importstr "../secrets/plain/k0-benji-encryption-password"), '\n')[0],
+ pools: [
+ "waw-hdd-redundant-2",
+ "waw-hdd-redundant-2-metadata",
+ "waw-hdd-paranoid-2",
+ "waw-hdd-yolo-2",
+ ],
+ s3Configuration: {
+ awsAccessKeyId: "RPYZIROFXNLQVU2WJ4R3",
+ awsSecretAccessKey: std.split((importstr "../secrets/plain/k0-benji-secret-access-key"), '\n')[0],
+ bucketName: "benji-k0-backups",
+ endpointUrl: "https://s3.eu-central-1.wasabisys.com/",
+ },
+ }
+ },
+ },
+ waw2Pools: {
+ // redundant block storage
+ blockRedundant: rook.ECBlockPool(k0.ceph.waw2, "waw-hdd-redundant-2") {
+ spec: {
+ failureDomain: "host",
+ erasureCoded: {
+ dataChunks: 2,
+ codingChunks: 1,
+ },
+ },
+ },
+ // paranoid block storage (3 replicas)
+ blockParanoid: rook.ReplicatedBlockPool(k0.ceph.waw2, "waw-hdd-paranoid-2") {
+ spec: {
+ failureDomain: "host",
+ replicated: {
+ size: 3,
+ },
+ },
+ },
+ // yolo block storage (no replicas!)
+ blockYolo: rook.ReplicatedBlockPool(k0.ceph.waw2, "waw-hdd-yolo-2") {
+ spec: {
+ failureDomain: "host",
+ replicated: {
+ size: 1,
+ },
+ },
+ },
+ objectRedundant: rook.S3ObjectStore(k0.ceph.waw2, "waw-hdd-redundant-2-object") {
+ spec: {
+ metadataPool: {
+ failureDomain: "host",
+ replicated: { size: 3 },
+ },
+ dataPool: {
+ failureDomain: "host",
+ erasureCoded: {
+ dataChunks: 2,
+ codingChunks: 1,
+ },
+ },
+ },
+ },
+ },
+
+ // waw3: 6TB SAS 3.5" HDDs
+ waw3: rook.Cluster(k0.cluster.rook, "ceph-waw3") {
+ spec: {
+ mon: {
+ count: 3,
+ allowMultiplePerNode: false,
+ },
+ storage: {
+ useAllNodes: false,
+ useAllDevices: false,
+ config: {
+ databaseSizeMB: "1024",
+ journalSizeMB: "1024",
+ },
+ nodes: [
+ {
+ name: "dcr01s22.hswaw.net",
+ location: "rack=dcr01 host=dcr01s22",
+ devices: [
+ // https://github.com/rook/rook/issues/1228
+ //{ name: "disk/by-id/wwan-0x" + wwan }
+ //for wwan in [
+ // "5000c5008508c433",
+ // "5000c500850989cf",
+ // "5000c5008508f843",
+ // "5000c5008508baf7",
+ //]
+ { name: "sdn" },
+ { name: "sda" },
+ { name: "sdb" },
+ { name: "sdc" },
+ ],
+ },
+ {
+ name: "dcr01s24.hswaw.net",
+ location: "rack=dcr01 host=dcr01s22",
+ devices: [
+ // https://github.com/rook/rook/issues/1228
+ //{ name: "disk/by-id/wwan-0x" + wwan }
+ //for wwan in [
+ // "5000c5008508ee03",
+ // "5000c5008508c9ef",
+ // "5000c5008508df33",
+ // "5000c5008508dd3b",
+ //]
+ { name: "sdm" },
+ { name: "sda" },
+ { name: "sdb" },
+ { name: "sdc" },
+ ],
+ },
+ ],
+ },
+ benji:: {
+ metadataStorageClass: "waw-hdd-redundant-3",
+ encryptionPassword: std.split((importstr "../secrets/plain/k0-benji-encryption-password"), '\n')[0],
+ pools: [
+ "waw-hdd-redundant-3",
+ "waw-hdd-redundant-3-metadata",
+ "waw-hdd-yolo-3",
+ ],
+ s3Configuration: {
+ awsAccessKeyId: "RPYZIROFXNLQVU2WJ4R3",
+ awsSecretAccessKey: std.split((importstr "../secrets/plain/k0-benji-secret-access-key"), '\n')[0],
+ bucketName: "benji-k0-backups-waw3",
+ endpointUrl: "https://s3.eu-central-1.wasabisys.com/",
+ },
+ }
+ },
+ },
+ waw3Pools: {
+ // redundant block storage
+ blockRedundant: rook.ECBlockPool(k0.ceph.waw3, "waw-hdd-redundant-3") {
+ metadataReplicas: 2,
+ spec: {
+ failureDomain: "host",
+ replicated: {
+ size: 2,
+ },
+ },
+ },
+ // yolo block storage (low usage, no host redundancy)
+ blockYolo: rook.ReplicatedBlockPool(k0.ceph.waw3, "waw-hdd-yolo-3") {
+ spec: {
+ failureDomain: "osd",
+ erasureCoded: {
+ dataChunks: 12,
+ codingChunks: 4,
+ },
+ },
+ },
+ objectRedundant: rook.S3ObjectStore(k0.ceph.waw3, "waw-hdd-redundant-3-object") {
+ spec: {
+ metadataPool: {
+ failureDomain: "host",
+ replicated: { size: 2 },
+ },
+ dataPool: {
+ failureDomain: "host",
+ replicated: { size: 2 },
+ },
+ },
+ },
+ },
+
+ // Clients for S3/radosgw storage.
+ clients: {
+ # Used for owncloud.hackerspace.pl, which for now lives on boston-packets.hackerspace.pl.
+ nextcloudWaw3: kube.CephObjectStoreUser("nextcloud") {
+ metadata+: {
+ namespace: "ceph-waw3",
+ },
+ spec: {
+ store: "waw-hdd-redundant-3-object",
+ displayName: "nextcloud",
+ },
+ },
+
+ # nuke@hackerspace.pl's personal storage.
+ nukePersonalWaw3: kube.CephObjectStoreUser("nuke-personal") {
+ metadata+: {
+ namespace: "ceph-waw3",
+ },
+ spec: {
+ store: "waw-hdd-redundant-3-object",
+ displayName: "nuke-personal",
+ },
+ },
+
+ # patryk@hackerspace.pl's ArmA3 mod bucket.
+ cz2ArmaModsWaw3: kube.CephObjectStoreUser("cz2-arma3mods") {
+ metadata+: {
+ namespace: "ceph-waw3",
+ },
+ spec: {
+ store: "waw-hdd-redundant-3-object",
+ displayName: "cz2-arma3mods",
+ },
+ },
+ },
+ },
+
+
+ # These are policies allowing for Insecure pods in some namespaces.
+ # A lot of them are spurious and come from the fact that we deployed
+ # these namespaces before we deployed the draconian PodSecurityPolicy
+ # we have now. This should be fixed by setting up some more granular
+ # policies, or fixing the workloads to not need some of the permission
+ # bits they use, whatever those might be.
+ # TODO(q3k): fix this?
+ unnecessarilyInsecureNamespaces: [
+ policies.AllowNamespaceInsecure("ceph-waw2"),
+ policies.AllowNamespaceInsecure("ceph-waw3"),
+ policies.AllowNamespaceInsecure("matrix"),
+ policies.AllowNamespaceInsecure("registry"),
+ policies.AllowNamespaceInsecure("internet"),
+ # TODO(implr): restricted policy with CAP_NET_ADMIN and tuntap, but no full root
+ policies.AllowNamespaceInsecure("implr-vpn"),
+ ],
+ },
+}