cluster/kube: split up cluster.jsonnet
It was getting large and unwieldy (to the point where kubecfg was slow).
In this change, we:
- move the Cluster function to cluster.libsonnet
- move the Cluster instantiation into k0.libsonnet
- shuffle some fields around to make sure things are well split between
k0-specific and general cluster configs.
- add 'view' files that build on 'k0.libsonnet' to allow rendering
  either the entire k0 state or some subsets (for speed)
- update the documentation, plus some drive-by fixes and reindentation
Change-Id: I4b8d920b600df79100295267efe21b8c82699d5b
diff --git a/cluster/doc/admin.md b/cluster/doc/admin.md
index 27b30ca..1dfb50a 100644
--- a/cluster/doc/admin.md
+++ b/cluster/doc/admin.md
@@ -8,7 +8,9 @@
Persistent Storage (waw2)
-------------------------
-HDDs on bc01n0{1-3}. 3TB total capacity. Don't use this as this pool should go away soon (the disks are slow, the network is slow and the RAID controllers lie). Use ceph-waw3 instead.
+HDDs on bc01n0{1-3}. 3TB total capacity. Don't use this as this pool should go
+away soon (the disks are slow, the network is slow and the RAID controllers
+lie). Use ceph-waw3 instead.
The following storage classes use this cluster:
@@ -17,9 +19,12 @@
- `waw-hdd-yolo-1` - unreplicated (you _will_ lose your data)
- `waw-hdd-redundant-1-object` - erasure coded 2.1 object store
-Rados Gateway (S3) is available at https://object.ceph-waw2.hswaw.net/. To create a user, ask an admin.
+Rados Gateway (S3) is available at https://object.ceph-waw2.hswaw.net/. To
+create a user, ask an admin.
-PersistentVolumes currently bound to PersistentVolumeClaims get automatically backed up (hourly for the next 48 hours, then once every 4 weeks, then once every month for a year).
+PersistentVolumes currently bound to PersistentVolumeClaims get automatically
+backed up (hourly for the next 48 hours, then once every 4 weeks, then once
+every month for a year).
Persistent Storage (waw3)
-------------------------
@@ -32,9 +37,12 @@
- `waw-hdd-redundant-3` - 2 replicas
- `waw-hdd-redundant-3-object` - 2 replicas, object store
-Rados Gateway (S3) is available at https://object.ceph-waw3.hswaw.net/. To create a user, ask an admin.
+Rados Gateway (S3) is available at https://object.ceph-waw3.hswaw.net/. To
+create a user, ask an admin.
-PersistentVolumes currently bound to PVCs get automatically backed up (hourly for the next 48 hours, then once every 4 weeks, then once every month for a year).
+PersistentVolumes currently bound to PVCs get automatically backed up (hourly
+for the next 48 hours, then once every 4 weeks, then once every month for a
+year).
Administration
==============
@@ -42,25 +50,55 @@
Provisioning nodes
------------------
- - bring up a new node with nixos, the configuration doesn't matter and will be nuked anyway
+ - bring up a new node with NixOS; the configuration doesn't matter and will be
+   nuked anyway
- edit cluster/nix/defs-machines.nix
- `bazel run //cluster/clustercfg nodestrap bc01nXX.hswaw.net`
+Applying kubecfg state
+----------------------
+
+First, decrypt/sync all secrets:
+
+ secretstore sync cluster/secrets/
+
+Then, run kubecfg. There are multiple top-level 'view' files that you can run,
+all located in `//cluster/kube`. All of them use `k0.libsonnet` as the master
+state of the Kubernetes configuration, but expose only subsets of it to work
+around the fact that kubecfg gets slow when given a lot of resources.
+
+ - `k0.jsonnet`: everything that is defined for k0 in `//cluster/kube/...`.
+ - `k0-core.jsonnet`: definitions that are in common across all clusters
+   (networking, DNS, etc.), without Rook.
+ - `k0-registry.jsonnet`: just the docker registry on k0 (useful when changing
+ ACLs).
+ - `k0-ceph.jsonnet`: everything ceph/rook related on k0.
+
+When in doubt, run `k0.jsonnet`. There's no harm in doing so; it might just be
+slow. Running an individual view file without realizing that your change also
+affects something rendered by another file can lead to inconsistencies in
+production.
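+
+A typical review-and-apply run against the full k0 state might look like this
+(a sketch only: these are upstream kubecfg subcommands, and the exact flags or
+paths used in this repository may differ):
+
+    kubecfg diff cluster/kube/k0.jsonnet
+    kubecfg update cluster/kube/k0.jsonnet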
+
+Feel free to add more view files for typical administrative tasks.
+
Ceph - Debugging
-----------------
-We run Ceph via Rook. The Rook operator is running in the `ceph-rook-system` namespace. To debug Ceph issues, start by looking at its logs.
+We run Ceph via Rook. The Rook operator is running in the `ceph-rook-system`
+namespace. To debug Ceph issues, start by looking at its logs.
-A dashboard is available at https://ceph-waw2.hswaw.net/ and https://ceph-waw3.hswaw.net, to get the admin password run:
+A dashboard is available at https://ceph-waw2.hswaw.net/ and
+https://ceph-waw3.hswaw.net/. To get the admin password, run:
kubectl -n ceph-waw2 get secret rook-ceph-dashboard-password -o yaml | grep "password:" | awk '{print $2}' | base64 --decode ; echo
- kubectl -n ceph-waw2 get secret rook-ceph-dashboard-password -o yaml | grep "password:" | awk '{print $2}' | base64 --decode ; echo
+ kubectl -n ceph-waw3 get secret rook-ceph-dashboard-password -o yaml | grep "password:" | awk '{print $2}' | base64 --decode ; echo
Ceph - Backups
--------------
-Kubernetes PVs backed in Ceph RBDs get backed up using Benji. An hourly cronjob runs in every Ceph cluster. You can also manually trigger a run by doing:
+Kubernetes PVs backed by Ceph RBDs get backed up using Benji. An hourly cronjob
+runs in every Ceph cluster. You can also manually trigger a run by doing:
kubectl -n ceph-waw2 create job --from=cronjob/ceph-waw2-benji ceph-waw2-benji-manual-$(date +%s)
kubectl -n ceph-waw3 create job --from=cronjob/ceph-waw3-benji ceph-waw3-benji-manual-$(date +%s)
@@ -70,10 +108,12 @@
Ceph - Object Storage
---------------------
-To create an object store user consult rook.io manual (https://rook.io/docs/rook/v0.9/ceph-object-store-user-crd.html)
-User authentication secret is generated in ceph cluster namespace (`ceph-waw2`),
-thus may need to be manually copied into application namespace. (see `app/registry/prod.jsonnet` comment)
+To create an object store user, consult the rook.io manual
+(https://rook.io/docs/rook/v0.9/ceph-object-store-user-crd.html).
+The user authentication secret is generated in the Ceph cluster namespace
+(`ceph-waw{2,3}`) and thus may need to be manually copied into the application
+namespace (see the comment in `app/registry/prod.jsonnet`).
-`tools/rook-s3cmd-config` can be used to generate test configuration file for s3cmd.
-Remember to append `:default-placement` to your region name (ie. `waw-hdd-redundant-1-object:default-placement`)
-
+`tools/rook-s3cmd-config` can be used to generate a test configuration file for
+s3cmd. Remember to append `:default-placement` to your region name (e.g.
+`waw-hdd-redundant-3-object:default-placement`).
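+
+For instance, assuming the generated configuration was saved to `waw3.s3cfg`
+(a file name chosen here purely for illustration), buckets can then be listed
+with:
+
+    s3cmd -c waw3.s3cfg ls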
diff --git a/cluster/kube/cluster.jsonnet b/cluster/kube/cluster.jsonnet
deleted file mode 100644
index 02099cd..0000000
--- a/cluster/kube/cluster.jsonnet
+++ /dev/null
@@ -1,519 +0,0 @@
-# Top level cluster configuration.
-
-local kube = import "../../kube/kube.libsonnet";
-local policies = import "../../kube/policies.libsonnet";
-
-local calico = import "lib/calico.libsonnet";
-local certmanager = import "lib/cert-manager.libsonnet";
-local cockroachdb = import "lib/cockroachdb.libsonnet";
-local coredns = import "lib/coredns.libsonnet";
-local metallb = import "lib/metallb.libsonnet";
-local metrics = import "lib/metrics.libsonnet";
-local nginx = import "lib/nginx.libsonnet";
-local prodvider = import "lib/prodvider.libsonnet";
-local registry = import "lib/registry.libsonnet";
-local rook = import "lib/rook.libsonnet";
-local pki = import "lib/pki.libsonnet";
-
-local Cluster(short, realm) = {
- local cluster = self,
- local cfg = cluster.cfg,
-
- short:: short,
- realm:: realm,
- fqdn:: "%s.%s" % [cluster.short, cluster.realm],
-
- cfg:: {
- // Storage class used for internal services (like registry). This must
- // be set to a valid storage class. This can either be a cloud provider class
- // (when running on GKE &co) or a storage class created using rook.
- storageClassNameRedundant: error "storageClassNameRedundant must be set",
- },
-
- // These are required to let the API Server contact kubelets.
- crAPIServerToKubelet: kube.ClusterRole("system:kube-apiserver-to-kubelet") {
- metadata+: {
- annotations+: {
- "rbac.authorization.kubernetes.io/autoupdate": "true",
- },
- labels+: {
- "kubernetes.io/bootstrapping": "rbac-defaults",
- },
- },
- rules: [
- {
- apiGroups: [""],
- resources: ["nodes/%s" % r for r in [ "proxy", "stats", "log", "spec", "metrics" ]],
- verbs: ["*"],
- },
- ],
- },
- crbAPIServer: kube.ClusterRoleBinding("system:kube-apiserver") {
- roleRef: {
- apiGroup: "rbac.authorization.k8s.io",
- kind: "ClusterRole",
- name: cluster.crAPIServerToKubelet.metadata.name,
- },
- subjects: [
- {
- apiGroup: "rbac.authorization.k8s.io",
- kind: "User",
- # A cluster API Server authenticates with a certificate whose CN is == to the FQDN of the cluster.
- name: cluster.fqdn,
- },
- ],
- },
-
- // This ClusteRole is bound to all humans that log in via prodaccess/prodvider/SSO.
- // It should allow viewing of non-sensitive data for debugability and openness.
- crViewer: kube.ClusterRole("system:viewer") {
- rules: [
- {
- apiGroups: [""],
- resources: [
- "nodes",
- "namespaces",
- "pods",
- "configmaps",
- "services",
- ],
- verbs: ["list"],
- },
- {
- apiGroups: ["metrics.k8s.io"],
- resources: [
- "nodes",
- "pods",
- ],
- verbs: ["list"],
- },
- {
- apiGroups: ["apps"],
- resources: [
- "statefulsets",
- ],
- verbs: ["list"],
- },
- {
- apiGroups: ["extensions"],
- resources: [
- "deployments",
- "ingresses",
- ],
- verbs: ["list"],
- }
- ],
- },
- // This ClusterRole is applied (scoped to personal namespace) to all humans.
- crFullInNamespace: kube.ClusterRole("system:admin-namespace") {
- rules: [
- {
- apiGroups: ["", "extensions", "apps"],
- resources: ["*"],
- verbs: ["*"],
- },
- {
- apiGroups: ["batch"],
- resources: ["jobs", "cronjobs"],
- verbs: ["*"],
- },
- ],
- },
- // This ClusterRoleBindings allows root access to cluster admins.
- crbAdmins: kube.ClusterRoleBinding("system:admins") {
- roleRef: {
- apiGroup: "rbac.authorization.k8s.io",
- kind: "ClusterRole",
- name: "cluster-admin",
- },
- subjects: [
- {
- apiGroup: "rbac.authorization.k8s.io",
- kind: "User",
- name: user + "@hackerspace.pl",
- } for user in [
- "q3k",
- "implr",
- "informatic",
- ]
- ],
- },
-
- podSecurityPolicies: policies.Cluster {},
-
- allowInsecureNamespaces: [
- policies.AllowNamespaceInsecure("kube-system"),
- policies.AllowNamespaceInsecure("metallb-system"),
- # TODO(q3k): fix this?
- policies.AllowNamespaceInsecure("ceph-waw2"),
- policies.AllowNamespaceInsecure("ceph-waw3"),
- policies.AllowNamespaceInsecure("matrix"),
- policies.AllowNamespaceInsecure("registry"),
- policies.AllowNamespaceInsecure("internet"),
- # TODO(implr): restricted policy with CAP_NET_ADMIN and tuntap, but no full root
- policies.AllowNamespaceInsecure("implr-vpn"),
- ],
-
- // Allow all service accounts (thus all controllers) to create secure pods.
- crbAllowServiceAccountsSecure: kube.ClusterRoleBinding("policy:allow-all-secure") {
- roleRef_: cluster.podSecurityPolicies.secureRole,
- subjects: [
- {
- kind: "Group",
- apiGroup: "rbac.authorization.k8s.io",
- name: "system:serviceaccounts",
- }
- ],
- },
-
- // Calico network fabric
- calico: calico.Environment {},
- // CoreDNS for this cluster.
- dns: coredns.Environment {
- cfg+: {
- cluster_domains: [
- "cluster.local",
- cluster.fqdn,
- ],
- },
- },
- // Metrics Server
- metrics: metrics.Environment {},
- // Metal Load Balancer
- metallb: metallb.Environment {
- cfg+: {
- peers: [
- {
- "peer-address": "185.236.240.33",
- "peer-asn": 65001,
- "my-asn": 65002,
- },
- ],
- addressPools: [
- {
- name: "public-v4-1",
- protocol: "bgp",
- addresses: [
- "185.236.240.48/28",
- ],
- },
- {
- name: "public-v4-2",
- protocol: "bgp",
- addresses: [
- "185.236.240.112/28"
- ],
- },
- ],
- },
- },
- // Main nginx Ingress Controller
- nginx: nginx.Environment {},
- certmanager: certmanager.Environment {},
- issuer: kube.ClusterIssuer("letsencrypt-prod") {
- spec: {
- acme: {
- server: "https://acme-v02.api.letsencrypt.org/directory",
- email: "bofh@hackerspace.pl",
- privateKeySecretRef: {
- name: "letsencrypt-prod"
- },
- http01: {},
- },
- },
- },
-
- // Rook Ceph storage
- rook: rook.Operator {
- operator+: {
- spec+: {
- // TODO(q3k): Bring up the operator again when stability gets fixed
- // See: https://github.com/rook/rook/issues/3059#issuecomment-492378873
- replicas: 1,
- },
- },
- },
-
- // Docker registry
- registry: registry.Environment {
- cfg+: {
- domain: "registry.%s" % [cluster.fqdn],
- storageClassName: cfg.storageClassNameParanoid,
- objectStorageName: "waw-hdd-redundant-2-object",
- },
- },
-
- // TLS PKI machinery
- pki: pki.Environment(cluster.short, cluster.realm),
-
- // Prodvider
- prodvider: prodvider.Environment {
- cfg+: {
- apiEndpoint: "kubernetes.default.svc.%s" % [cluster.fqdn],
- },
- },
-};
-
-
-{
- k0: {
- local k0 = self,
- cluster: Cluster("k0", "hswaw.net") {
- cfg+: {
- storageClassNameParanoid: k0.ceph.waw2Pools.blockParanoid.name,
- },
- },
- cockroach: {
- waw2: cockroachdb.Cluster("crdb-waw1") {
- cfg+: {
- topology: [
- { name: "bc01n01", node: "bc01n01.hswaw.net" },
- { name: "bc01n02", node: "bc01n02.hswaw.net" },
- { name: "bc01n03", node: "bc01n03.hswaw.net" },
- ],
- hostPath: "/var/db/crdb-waw1",
- },
- },
- clients: {
- cccampix: k0.cockroach.waw2.Client("cccampix"),
- cccampixDev: k0.cockroach.waw2.Client("cccampix-dev"),
- buglessDev: k0.cockroach.waw2.Client("bugless-dev"),
- sso: k0.cockroach.waw2.Client("sso"),
- },
- },
- ceph: {
- // waw1 cluster - dead as of 2019/08/06, data corruption
- // waw2 cluster
- waw2: rook.Cluster(k0.cluster.rook, "ceph-waw2") {
- spec: {
- mon: {
- count: 3,
- allowMultiplePerNode: false,
- },
- storage: {
- useAllNodes: false,
- useAllDevices: false,
- config: {
- databaseSizeMB: "1024",
- journalSizeMB: "1024",
- },
- nodes: [
- {
- name: "bc01n01.hswaw.net",
- location: "rack=dcr01 chassis=bc01 host=bc01n01",
- devices: [ { name: "sda" } ],
- },
- {
- name: "bc01n02.hswaw.net",
- location: "rack=dcr01 chassis=bc01 host=bc01n02",
- devices: [ { name: "sda" } ],
- },
- {
- name: "bc01n03.hswaw.net",
- location: "rack=dcr01 chassis=bc01 host=bc01n03",
- devices: [ { name: "sda" } ],
- },
- ],
- },
- benji:: {
- metadataStorageClass: "waw-hdd-paranoid-2",
- encryptionPassword: std.split((importstr "../secrets/plain/k0-benji-encryption-password"), '\n')[0],
- pools: [
- "waw-hdd-redundant-2",
- "waw-hdd-redundant-2-metadata",
- "waw-hdd-paranoid-2",
- "waw-hdd-yolo-2",
- ],
- s3Configuration: {
- awsAccessKeyId: "RPYZIROFXNLQVU2WJ4R3",
- awsSecretAccessKey: std.split((importstr "../secrets/plain/k0-benji-secret-access-key"), '\n')[0],
- bucketName: "benji-k0-backups",
- endpointUrl: "https://s3.eu-central-1.wasabisys.com/",
- },
- }
- },
- },
- waw2Pools: {
- // redundant block storage
- blockRedundant: rook.ECBlockPool(k0.ceph.waw2, "waw-hdd-redundant-2") {
- spec: {
- failureDomain: "host",
- erasureCoded: {
- dataChunks: 2,
- codingChunks: 1,
- },
- },
- },
- // paranoid block storage (3 replicas)
- blockParanoid: rook.ReplicatedBlockPool(k0.ceph.waw2, "waw-hdd-paranoid-2") {
- spec: {
- failureDomain: "host",
- replicated: {
- size: 3,
- },
- },
- },
- // yolo block storage (no replicas!)
- blockYolo: rook.ReplicatedBlockPool(k0.ceph.waw2, "waw-hdd-yolo-2") {
- spec: {
- failureDomain: "host",
- replicated: {
- size: 1,
- },
- },
- },
- objectRedundant: rook.S3ObjectStore(k0.ceph.waw2, "waw-hdd-redundant-2-object") {
- spec: {
- metadataPool: {
- failureDomain: "host",
- replicated: { size: 3 },
- },
- dataPool: {
- failureDomain: "host",
- erasureCoded: {
- dataChunks: 2,
- codingChunks: 1,
- },
- },
- },
- },
- },
- waw3: rook.Cluster(k0.cluster.rook, "ceph-waw3") {
- spec: {
- mon: {
- count: 3,
- allowMultiplePerNode: false,
- },
- storage: {
- useAllNodes: false,
- useAllDevices: false,
- config: {
- databaseSizeMB: "1024",
- journalSizeMB: "1024",
- },
- nodes: [
- {
- name: "dcr01s22.hswaw.net",
- location: "rack=dcr01 host=dcr01s22",
- devices: [
- // https://github.com/rook/rook/issues/1228
- //{ name: "disk/by-id/wwan-0x" + wwan }
- //for wwan in [
- // "5000c5008508c433",
- // "5000c500850989cf",
- // "5000c5008508f843",
- // "5000c5008508baf7",
- //]
- { name: "sdn" },
- { name: "sda" },
- { name: "sdb" },
- { name: "sdc" },
- ],
- },
- {
- name: "dcr01s24.hswaw.net",
- location: "rack=dcr01 host=dcr01s22",
- devices: [
- // https://github.com/rook/rook/issues/1228
- //{ name: "disk/by-id/wwan-0x" + wwan }
- //for wwan in [
- // "5000c5008508ee03",
- // "5000c5008508c9ef",
- // "5000c5008508df33",
- // "5000c5008508dd3b",
- //]
- { name: "sdm" },
- { name: "sda" },
- { name: "sdb" },
- { name: "sdc" },
- ],
- },
- ],
- },
- benji:: {
- metadataStorageClass: "waw-hdd-redundant-3",
- encryptionPassword: std.split((importstr "../secrets/plain/k0-benji-encryption-password"), '\n')[0],
- pools: [
- "waw-hdd-redundant-3",
- "waw-hdd-redundant-3-metadata",
- "waw-hdd-yolo-3",
- ],
- s3Configuration: {
- awsAccessKeyId: "RPYZIROFXNLQVU2WJ4R3",
- awsSecretAccessKey: std.split((importstr "../secrets/plain/k0-benji-secret-access-key"), '\n')[0],
- bucketName: "benji-k0-backups-waw3",
- endpointUrl: "https://s3.eu-central-1.wasabisys.com/",
- },
- }
- },
- },
- waw3Pools: {
- // redundant block storage
- blockRedundant: rook.ECBlockPool(k0.ceph.waw3, "waw-hdd-redundant-3") {
- metadataReplicas: 2,
- spec: {
- failureDomain: "host",
- replicated: {
- size: 2,
- },
- },
- },
- // yolo block storage (low usage, no host redundancy)
- blockYolo: rook.ReplicatedBlockPool(k0.ceph.waw3, "waw-hdd-yolo-3") {
- spec: {
- failureDomain: "osd",
- erasureCoded: {
- dataChunks: 12,
- codingChunks: 4,
- },
- },
- },
- objectRedundant: rook.S3ObjectStore(k0.ceph.waw3, "waw-hdd-redundant-3-object") {
- spec: {
- metadataPool: {
- failureDomain: "host",
- replicated: { size: 2 },
- },
- dataPool: {
- failureDomain: "host",
- replicated: { size: 2 },
- },
- },
- },
- },
- },
-
- # Used for owncloud.hackerspace.pl, which for now lices on boston-packets.hackerspace.pl.
- nextcloudWaw3: kube.CephObjectStoreUser("nextcloud") {
- metadata+: {
- namespace: "ceph-waw3",
- },
- spec: {
- store: "waw-hdd-redundant-3-object",
- displayName: "nextcloud",
- },
- },
-
- # nuke@hackerspace.pl's personal storage.
- nukePersonalWaw3: kube.CephObjectStoreUser("nuke-personal") {
- metadata+: {
- namespace: "ceph-waw3",
- },
- spec: {
- store: "waw-hdd-redundant-3-object",
- displayName: "nuke-personal",
- },
- },
-
- # patryk@hackerspace.pl's ArmA3 mod bucket.
- cz2ArmaModsWaw3: kube.CephObjectStoreUser("cz2-arma3mods") {
- metadata+: {
- namespace: "ceph-waw3",
- },
- spec: {
- store: "waw-hdd-redundant-3-object",
- displayName: "cz2-arma3mods",
- },
- },
- },
-}
diff --git a/cluster/kube/cluster.libsonnet b/cluster/kube/cluster.libsonnet
new file mode 100644
index 0000000..c42ee8a
--- /dev/null
+++ b/cluster/kube/cluster.libsonnet
@@ -0,0 +1,221 @@
+# Common cluster configuration.
+# This defines what Kubernetes resources are required to turn a bare k8s
+# deployment into a fully working cluster.
+# These assume that you're running on bare metal, and using the corresponding
+# NixOS deployment that we do.
+
+local kube = import "../../kube/kube.libsonnet";
+local policies = import "../../kube/policies.libsonnet";
+
+local calico = import "lib/calico.libsonnet";
+local certmanager = import "lib/cert-manager.libsonnet";
+local coredns = import "lib/coredns.libsonnet";
+local metallb = import "lib/metallb.libsonnet";
+local metrics = import "lib/metrics.libsonnet";
+local nginx = import "lib/nginx.libsonnet";
+local prodvider = import "lib/prodvider.libsonnet";
+local rook = import "lib/rook.libsonnet";
+local pki = import "lib/pki.libsonnet";
+
+{
+ Cluster(short, realm):: {
+ local cluster = self,
+ local cfg = cluster.cfg,
+
+ short:: short,
+ realm:: realm,
+ fqdn:: "%s.%s" % [cluster.short, cluster.realm],
+
+ cfg:: {
+ // Storage class used for internal services (like registry). This must
+ // be set to a valid storage class. This can either be a cloud provider class
+ // (when running on GKE &co) or a storage class created using rook.
+ storageClassNameRedundant: error "storageClassNameRedundant must be set",
+ },
+
+ // These are required to let the API Server contact kubelets.
+ crAPIServerToKubelet: kube.ClusterRole("system:kube-apiserver-to-kubelet") {
+ metadata+: {
+ annotations+: {
+ "rbac.authorization.kubernetes.io/autoupdate": "true",
+ },
+ labels+: {
+ "kubernetes.io/bootstrapping": "rbac-defaults",
+ },
+ },
+ rules: [
+ {
+ apiGroups: [""],
+ resources: ["nodes/%s" % r for r in [ "proxy", "stats", "log", "spec", "metrics" ]],
+ verbs: ["*"],
+ },
+ ],
+ },
+ crbAPIServer: kube.ClusterRoleBinding("system:kube-apiserver") {
+ roleRef: {
+ apiGroup: "rbac.authorization.k8s.io",
+ kind: "ClusterRole",
+ name: cluster.crAPIServerToKubelet.metadata.name,
+ },
+ subjects: [
+ {
+ apiGroup: "rbac.authorization.k8s.io",
+ kind: "User",
+ # A cluster API Server authenticates with a certificate whose CN is == to the FQDN of the cluster.
+ name: cluster.fqdn,
+ },
+ ],
+ },
+
+ // This ClusterRole is bound to all humans that log in via prodaccess/prodvider/SSO.
+ // It should allow viewing of non-sensitive data for debuggability and openness.
+ crViewer: kube.ClusterRole("system:viewer") {
+ rules: [
+ {
+ apiGroups: [""],
+ resources: [
+ "nodes",
+ "namespaces",
+ "pods",
+ "configmaps",
+ "services",
+ ],
+ verbs: ["list"],
+ },
+ {
+ apiGroups: ["metrics.k8s.io"],
+ resources: [
+ "nodes",
+ "pods",
+ ],
+ verbs: ["list"],
+ },
+ {
+ apiGroups: ["apps"],
+ resources: [
+ "statefulsets",
+ ],
+ verbs: ["list"],
+ },
+ {
+ apiGroups: ["extensions"],
+ resources: [
+ "deployments",
+ "ingresses",
+ ],
+ verbs: ["list"],
+ }
+ ],
+ },
+ // This ClusterRole is applied (scoped to personal namespace) to all humans.
+ crFullInNamespace: kube.ClusterRole("system:admin-namespace") {
+ rules: [
+ {
+ apiGroups: ["", "extensions", "apps"],
+ resources: ["*"],
+ verbs: ["*"],
+ },
+ {
+ apiGroups: ["batch"],
+ resources: ["jobs", "cronjobs"],
+ verbs: ["*"],
+ },
+ ],
+ },
+ // This ClusterRoleBinding allows root access to cluster admins.
+ crbAdmins: kube.ClusterRoleBinding("system:admins") {
+ roleRef: {
+ apiGroup: "rbac.authorization.k8s.io",
+ kind: "ClusterRole",
+ name: "cluster-admin",
+ },
+ subjects: [
+ {
+ apiGroup: "rbac.authorization.k8s.io",
+ kind: "User",
+ name: user + "@hackerspace.pl",
+ } for user in [
+ "q3k",
+ "implr",
+ "informatic",
+ ]
+ ],
+ },
+
+ podSecurityPolicies: policies.Cluster {},
+
+ allowInsecureNamespaces: [
+ policies.AllowNamespaceInsecure("kube-system"),
+ policies.AllowNamespaceInsecure("metallb-system"),
+ ],
+
+ // Allow all service accounts (thus all controllers) to create secure pods.
+ crbAllowServiceAccountsSecure: kube.ClusterRoleBinding("policy:allow-all-secure") {
+ roleRef_: cluster.podSecurityPolicies.secureRole,
+ subjects: [
+ {
+ kind: "Group",
+ apiGroup: "rbac.authorization.k8s.io",
+ name: "system:serviceaccounts",
+ }
+ ],
+ },
+
+ // Calico network fabric
+ calico: calico.Environment {},
+
+ // CoreDNS for this cluster.
+ dns: coredns.Environment {
+ cfg+: {
+ cluster_domains: [
+ "cluster.local",
+ cluster.fqdn,
+ ],
+ },
+ },
+
+ // Metrics Server
+ metrics: metrics.Environment {},
+
+ // Metal Load Balancer
+ metallb: metallb.Environment {},
+
+ // Main nginx Ingress Controller
+ nginx: nginx.Environment {},
+
+ // Cert-manager (Let's Encrypt, CA, ...)
+ certmanager: certmanager.Environment {},
+
+ issuer: kube.ClusterIssuer("letsencrypt-prod") {
+ spec: {
+ acme: {
+ server: "https://acme-v02.api.letsencrypt.org/directory",
+ email: "bofh@hackerspace.pl",
+ privateKeySecretRef: {
+ name: "letsencrypt-prod"
+ },
+ http01: {},
+ },
+ },
+ },
+
+ // Rook Ceph storage operator.
+ rook: rook.Operator {
+ operator+: {
+ spec+: {
+ replicas: 1,
+ },
+ },
+ },
+
+ // TLS PKI machinery (compatibility with mirko)
+ pki: pki.Environment(cluster.short, cluster.realm),
+
+ // Prodvider
+ prodvider: prodvider.Environment {
+ cfg+: {
+ apiEndpoint: "kubernetes.default.svc.%s" % [cluster.fqdn],
+ },
+ },
+ },
+}
diff --git a/cluster/kube/k0-ceph.jsonnet b/cluster/kube/k0-ceph.jsonnet
new file mode 100644
index 0000000..bc025d4
--- /dev/null
+++ b/cluster/kube/k0-ceph.jsonnet
@@ -0,0 +1,8 @@
+// Ceph operator (rook), pools, users.
+
+local k0 = (import "k0.libsonnet").k0;
+
+{
+ rook: k0.cluster.rook,
+ ceph: k0.ceph,
+}
diff --git a/cluster/kube/k0-core.jsonnet b/cluster/kube/k0-core.jsonnet
new file mode 100644
index 0000000..06c282e
--- /dev/null
+++ b/cluster/kube/k0-core.jsonnet
@@ -0,0 +1,6 @@
+// Only the 'core' cluster resources - i.e., resources not specific to k0 in particular.
+// Without Rook, to speed things up.
+
+(import "k0.libsonnet").k0.cluster {
+ rook+:: {},
+}
diff --git a/cluster/kube/k0-registry.jsonnet b/cluster/kube/k0-registry.jsonnet
new file mode 100644
index 0000000..a2a6061
--- /dev/null
+++ b/cluster/kube/k0-registry.jsonnet
@@ -0,0 +1,3 @@
+// Only the registry running in k0.
+
+(import "k0.libsonnet").k0.registry
diff --git a/cluster/kube/k0.jsonnet b/cluster/kube/k0.jsonnet
new file mode 100644
index 0000000..9658830
--- /dev/null
+++ b/cluster/kube/k0.jsonnet
@@ -0,0 +1,3 @@
+// Everything in the k0 cluster definition.
+
+(import "k0.libsonnet").k0
diff --git a/cluster/kube/k0.libsonnet b/cluster/kube/k0.libsonnet
new file mode 100644
index 0000000..d4c7256
--- /dev/null
+++ b/cluster/kube/k0.libsonnet
@@ -0,0 +1,338 @@
+// k0.hswaw.net kubernetes cluster
+// This defines the cluster as a single object.
+// Use the sibling k0*.jsonnet 'view' files to actually apply the configuration.
+
+local kube = import "../../kube/kube.libsonnet";
+local policies = import "../../kube/policies.libsonnet";
+
+local cluster = import "cluster.libsonnet";
+
+local cockroachdb = import "lib/cockroachdb.libsonnet";
+local registry = import "lib/registry.libsonnet";
+local rook = import "lib/rook.libsonnet";
+
+{
+ k0: {
+ local k0 = self,
+ cluster: cluster.Cluster("k0", "hswaw.net") {
+ cfg+: {
+ storageClassNameParanoid: k0.ceph.waw2Pools.blockParanoid.name,
+ },
+ metallb+: {
+ cfg+: {
+ peers: [
+ {
+ "peer-address": "185.236.240.33",
+ "peer-asn": 65001,
+ "my-asn": 65002,
+ },
+ ],
+ addressPools: [
+ {
+ name: "public-v4-1",
+ protocol: "bgp",
+ addresses: [
+ "185.236.240.48/28",
+ ],
+ },
+ {
+ name: "public-v4-2",
+ protocol: "bgp",
+ addresses: [
+ "185.236.240.112/28"
+ ],
+ },
+ ],
+ },
+ },
+ },
+
+ // Docker registry
+ registry: registry.Environment {
+ cfg+: {
+ domain: "registry.%s" % [k0.cluster.fqdn],
+ storageClassName: k0.cluster.cfg.storageClassNameParanoid,
+ objectStorageName: "waw-hdd-redundant-2-object",
+ },
+ },
+
+ // CockroachDB, running on bc01n{01,02,03}.
+ cockroach: {
+ waw2: cockroachdb.Cluster("crdb-waw1") {
+ cfg+: {
+ topology: [
+ { name: "bc01n01", node: "bc01n01.hswaw.net" },
+ { name: "bc01n02", node: "bc01n02.hswaw.net" },
+ { name: "bc01n03", node: "bc01n03.hswaw.net" },
+ ],
+ // Host path on SSD.
+ hostPath: "/var/db/crdb-waw1",
+ },
+ },
+ clients: {
+ cccampix: k0.cockroach.waw2.Client("cccampix"),
+ cccampixDev: k0.cockroach.waw2.Client("cccampix-dev"),
+ buglessDev: k0.cockroach.waw2.Client("bugless-dev"),
+ sso: k0.cockroach.waw2.Client("sso"),
+ },
+ },
+
+ ceph: {
+ // waw1 cluster - dead as of 2019/08/06, data corruption
+ // waw2 cluster: shitty 7200RPM 2.5" HDDs
+ waw2: rook.Cluster(k0.cluster.rook, "ceph-waw2") {
+ spec: {
+ mon: {
+ count: 3,
+ allowMultiplePerNode: false,
+ },
+ storage: {
+ useAllNodes: false,
+ useAllDevices: false,
+ config: {
+ databaseSizeMB: "1024",
+ journalSizeMB: "1024",
+ },
+ nodes: [
+ {
+ name: "bc01n01.hswaw.net",
+ location: "rack=dcr01 chassis=bc01 host=bc01n01",
+ devices: [ { name: "sda" } ],
+ },
+ {
+ name: "bc01n02.hswaw.net",
+ location: "rack=dcr01 chassis=bc01 host=bc01n02",
+ devices: [ { name: "sda" } ],
+ },
+ {
+ name: "bc01n03.hswaw.net",
+ location: "rack=dcr01 chassis=bc01 host=bc01n03",
+ devices: [ { name: "sda" } ],
+ },
+ ],
+ },
+ benji:: {
+ metadataStorageClass: "waw-hdd-paranoid-2",
+ encryptionPassword: std.split((importstr "../secrets/plain/k0-benji-encryption-password"), '\n')[0],
+ pools: [
+ "waw-hdd-redundant-2",
+ "waw-hdd-redundant-2-metadata",
+ "waw-hdd-paranoid-2",
+ "waw-hdd-yolo-2",
+ ],
+ s3Configuration: {
+ awsAccessKeyId: "RPYZIROFXNLQVU2WJ4R3",
+ awsSecretAccessKey: std.split((importstr "../secrets/plain/k0-benji-secret-access-key"), '\n')[0],
+ bucketName: "benji-k0-backups",
+ endpointUrl: "https://s3.eu-central-1.wasabisys.com/",
+ },
+ }
+ },
+ },
+ waw2Pools: {
+ // redundant block storage
+ blockRedundant: rook.ECBlockPool(k0.ceph.waw2, "waw-hdd-redundant-2") {
+ spec: {
+ failureDomain: "host",
+ erasureCoded: {
+ dataChunks: 2,
+ codingChunks: 1,
+ },
+ },
+ },
+ // paranoid block storage (3 replicas)
+ blockParanoid: rook.ReplicatedBlockPool(k0.ceph.waw2, "waw-hdd-paranoid-2") {
+ spec: {
+ failureDomain: "host",
+ replicated: {
+ size: 3,
+ },
+ },
+ },
+ // yolo block storage (no replicas!)
+ blockYolo: rook.ReplicatedBlockPool(k0.ceph.waw2, "waw-hdd-yolo-2") {
+ spec: {
+ failureDomain: "host",
+ replicated: {
+ size: 1,
+ },
+ },
+ },
+ objectRedundant: rook.S3ObjectStore(k0.ceph.waw2, "waw-hdd-redundant-2-object") {
+ spec: {
+ metadataPool: {
+ failureDomain: "host",
+ replicated: { size: 3 },
+ },
+ dataPool: {
+ failureDomain: "host",
+ erasureCoded: {
+ dataChunks: 2,
+ codingChunks: 1,
+ },
+ },
+ },
+ },
+ },
+
+ // waw3: 6TB SAS 3.5" HDDs
+ waw3: rook.Cluster(k0.cluster.rook, "ceph-waw3") {
+ spec: {
+ mon: {
+ count: 3,
+ allowMultiplePerNode: false,
+ },
+ storage: {
+ useAllNodes: false,
+ useAllDevices: false,
+ config: {
+ databaseSizeMB: "1024",
+ journalSizeMB: "1024",
+ },
+ nodes: [
+ {
+ name: "dcr01s22.hswaw.net",
+ location: "rack=dcr01 host=dcr01s22",
+ devices: [
+ // https://github.com/rook/rook/issues/1228
+ //{ name: "disk/by-id/wwan-0x" + wwan }
+ //for wwan in [
+ // "5000c5008508c433",
+ // "5000c500850989cf",
+ // "5000c5008508f843",
+ // "5000c5008508baf7",
+ //]
+ { name: "sdn" },
+ { name: "sda" },
+ { name: "sdb" },
+ { name: "sdc" },
+ ],
+ },
+ {
+ name: "dcr01s24.hswaw.net",
+ location: "rack=dcr01 host=dcr01s22",
+ devices: [
+ // https://github.com/rook/rook/issues/1228
+ //{ name: "disk/by-id/wwan-0x" + wwan }
+ //for wwan in [
+ // "5000c5008508ee03",
+ // "5000c5008508c9ef",
+ // "5000c5008508df33",
+ // "5000c5008508dd3b",
+ //]
+ { name: "sdm" },
+ { name: "sda" },
+ { name: "sdb" },
+ { name: "sdc" },
+ ],
+ },
+ ],
+ },
+ benji:: {
+ metadataStorageClass: "waw-hdd-redundant-3",
+ encryptionPassword: std.split((importstr "../secrets/plain/k0-benji-encryption-password"), '\n')[0],
+ pools: [
+ "waw-hdd-redundant-3",
+ "waw-hdd-redundant-3-metadata",
+ "waw-hdd-yolo-3",
+ ],
+ s3Configuration: {
+ awsAccessKeyId: "RPYZIROFXNLQVU2WJ4R3",
+ awsSecretAccessKey: std.split((importstr "../secrets/plain/k0-benji-secret-access-key"), '\n')[0],
+ bucketName: "benji-k0-backups-waw3",
+ endpointUrl: "https://s3.eu-central-1.wasabisys.com/",
+ },
+ }
+ },
+ },
+ waw3Pools: {
+ // redundant block storage
+ blockRedundant: rook.ECBlockPool(k0.ceph.waw3, "waw-hdd-redundant-3") {
+ metadataReplicas: 2,
+ spec: {
+ failureDomain: "host",
+ replicated: {
+ size: 2,
+ },
+ },
+ },
+ // yolo block storage (low usage, no host redundancy)
+ blockYolo: rook.ReplicatedBlockPool(k0.ceph.waw3, "waw-hdd-yolo-3") {
+ spec: {
+ failureDomain: "osd",
+ erasureCoded: {
+ dataChunks: 12,
+ codingChunks: 4,
+ },
+ },
+ },
+ objectRedundant: rook.S3ObjectStore(k0.ceph.waw3, "waw-hdd-redundant-3-object") {
+ spec: {
+ metadataPool: {
+ failureDomain: "host",
+ replicated: { size: 2 },
+ },
+ dataPool: {
+ failureDomain: "host",
+ replicated: { size: 2 },
+ },
+ },
+ },
+ },
+
+ // Clients for S3/radosgw storage.
+ clients: {
+ # Used for owncloud.hackerspace.pl, which for now lives on boston-packets.hackerspace.pl.
+ nextcloudWaw3: kube.CephObjectStoreUser("nextcloud") {
+ metadata+: {
+ namespace: "ceph-waw3",
+ },
+ spec: {
+ store: "waw-hdd-redundant-3-object",
+ displayName: "nextcloud",
+ },
+ },
+
+ # nuke@hackerspace.pl's personal storage.
+ nukePersonalWaw3: kube.CephObjectStoreUser("nuke-personal") {
+ metadata+: {
+ namespace: "ceph-waw3",
+ },
+ spec: {
+ store: "waw-hdd-redundant-3-object",
+ displayName: "nuke-personal",
+ },
+ },
+
+ # patryk@hackerspace.pl's ArmA3 mod bucket.
+ cz2ArmaModsWaw3: kube.CephObjectStoreUser("cz2-arma3mods") {
+ metadata+: {
+ namespace: "ceph-waw3",
+ },
+ spec: {
+ store: "waw-hdd-redundant-3-object",
+ displayName: "cz2-arma3mods",
+ },
+ },
+ },
+ },
+
+
+ # These are policies allowing for Insecure pods in some namespaces.
+ # A lot of them are spurious and come from the fact that we deployed
+ # these namespaces before we deployed the draconian PodSecurityPolicy
+ # we have now. This should be fixed by setting up some more granular
+ # policies, or fixing the workloads to not need some of the permission
+ # bits they use, whatever those might be.
+ # TODO(q3k): fix this?
+ unnecessarilyInsecureNamespaces: [
+ policies.AllowNamespaceInsecure("ceph-waw2"),
+ policies.AllowNamespaceInsecure("ceph-waw3"),
+ policies.AllowNamespaceInsecure("matrix"),
+ policies.AllowNamespaceInsecure("registry"),
+ policies.AllowNamespaceInsecure("internet"),
+ # TODO(implr): restricted policy with CAP_NET_ADMIN and tuntap, but no full root
+ policies.AllowNamespaceInsecure("implr-vpn"),
+ ],
+ },
+}