Building a Self-Hosted Multi-Region API Probing and Monitoring System

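I. Architecture Design and Implementation of the Probing System

    Without further ado, here is the architecture diagram:

[Figure: architecture diagram]

    As the diagram shows, the API probing system is built on blackbox_exporter + Prometheus. A blackbox_exporter instance is deployed in each region to act as a probe point. etcd + confd then render the configuration, push it out, and make it take effect automatically, so all monitored items are configured in one place without manual steps.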

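II. Configuration and Deployment

    1) Auto-discovery configuration with etcd + confd

    confd reads the data stored in etcd and renders the configuration files from templates (installing etcd and confd is not covered here):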

$ tree
.
├── blackbox_prom.yaml      // generated job definitions for Prometheus
├── blackbox_config.yaml    // generated blackbox_exporter configuration
├── conf.d                  // confd auto-discovery configuration
│   ├── blackbox_prom.toml
│   └── blackbox_config.toml
└── templates               // confd template files
    ├── blackbox_prom.tmpl
    └── blackbox_config.tmpl

    2) blackbox_exporter template and auto-discovery configuration

$ cat templates/blackbox_config.tmpl
modules:
{{- range $index, $info := getvs "/prom/local/blackbox/prod/*" -}}
{{- $data := json $info -}}
{{- if ne $index 0 }}{{- end }}
  {{ $data.name }}:
    prober: http
    timeout: 15s
    http:
      valid_status_codes: [200,301,302]
      preferred_ip_protocol: "ip4"
      method: {{ $data.method}}
{{- if $data.headers }}
      headers:
{{- range $data.headers }}
        {{.key}}: {{.val}}
{{- end }}
{{- end }}
{{- if $data.body}}
      body: '{{ $data.body }}'
{{- end }}
{{- if $data.basic_auth}}
      basic_auth:
{{- range $data.basic_auth }}
        {{.key}}: "{{.val}}"
{{- end }}
{{- end }}{{- end }}
  icmp:
    prober: icmp
    timeout: 5s

  tcp_connect:
    prober: tcp

  ssh_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^SSH-2.0-"
      - send: "SSH-2.0-blackbox-ssh-check"

$ cat conf.d/blackbox_config.toml
[template]
src = "blackbox_config.tmpl"
dest = "/etc/confd/config.yaml"
mode = "0777"
keys = [
  "/prom/local/blackbox/prod",
]
reload_cmd = "kubectl create configmap blackbox-exporter-config --from-file=/etc/confd/config.yaml --dry-run=client -o yaml | kubectl apply -f - -n monitoring;kubectl -n monitoring set env deploy/blackbox-exporter randomstr=$(date +%s | sha256sum | base64 | head -c 8 ; echo)"
# A small trick is used here: setting an environment variable to a new random value triggers a rolling update of the blackbox_exporter instances.

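    Go templates full of whitespace trimming are hard to read, so here is a minimal Python sketch of the mapping that blackbox_config.tmpl appears to perform: each JSON value under /prom/local/blackbox/prod becomes one HTTP module, followed by the static icmp and tcp_connect modules (ssh_banner is left out for brevity). The function names build_module and build_modules are my own and hypothetical. This is not how confd works internally, only an illustration of the resulting structure.

```python
import json


def build_module(record):
    """Turn one probe definition (a dict) into a blackbox_exporter HTTP module."""
    http = {
        "valid_status_codes": [200, 301, 302],
        "preferred_ip_protocol": "ip4",
        "method": record["method"],
    }
    # Optional fields are only emitted when present, as in the template.
    if record.get("headers"):
        http["headers"] = {h["key"]: h["val"] for h in record["headers"]}
    if record.get("body"):
        http["body"] = record["body"]
    if record.get("basic_auth"):
        http["basic_auth"] = {a["key"]: a["val"] for a in record["basic_auth"]}
    return {"prober": "http", "timeout": "15s", "http": http}


def build_modules(raw_values):
    """raw_values: the JSON strings stored under /prom/local/blackbox/prod/*."""
    modules = {}
    for raw in raw_values:
        record = json.loads(raw)
        modules[record["name"]] = build_module(record)
    # Static modules that the template always appends.
    modules["icmp"] = {"prober": "icmp", "timeout": "5s"}
    modules["tcp_connect"] = {"prober": "tcp"}
    return {"modules": modules}


if __name__ == "__main__":
    sample = ['{"target":["http://airflow.xxx.xxx"],"name":"http_2xx","method":"GET"}']
    print(json.dumps(build_modules(sample), indent=2))
```

    Note that target is not used here at all: the module only describes how to probe, while the list of targets ends up in the Prometheus job.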
    3) Prometheus template and auto-discovery configuration

$ cat templates/blackbox_prom.tmpl
{{- range $index1, $info1 := getvs "/prom/local/blackbox/prod/*" -}}
{{- $data1 := json $info1 -}}
{{- if ne $index1 0 }}{{- end }}
{{- range $index2, $info2 := getvs "/prom/local/blackbox_instance/prod/*"}}
{{- $data2 := json $info2 -}}
{{- if ne $index2 0 }}{{- end }}
- job_name: {{ $data2.zone }}_{{ $data1.name }}
  scrape_interval: 45s
  metrics_path: /probe
  params:
    module: [{{ $data1.name }}]
  static_configs:
  - targets:
{{- range $data1.target }}
    - {{.}}
{{- end }}
    labels:
{{- if $data2.labels -}}
{{- range $data2.labels }}
      {{.key}}: {{.val}}
{{- end }}
{{- end }}
  relabel_configs:
  - source_labels: [__address__]
    target_label: __param_target
  - source_labels: [__param_target]
    target_label: instance
  - target_label: __address__
    replacement: {{ $data2.address }}
{{- end}}{{- end}}

$ cat conf.d/blackbox_prom.toml
[template]
src = "blackbox_prom.tmpl"
dest = "/data/monitoring/blackbox_prom.yaml"
mode = "0777"
keys = [
  "/prom/local/blackbox/prod",
  "/prom/local/blackbox_instance/prod",
]
reload_cmd = "cat /data/monitoring/*.yaml > /data/monitoring/all/prometheus-additional.yaml;kubectl create secret generic additional-scrape-configs --from-file=/data/monitoring/all/prometheus-additional.yaml --dry-run=client -o yaml | kubectl apply -f - -n monitoring"
# Generates the additional scrape-config secret that prometheus-operator picks up.
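
    The Prometheus template is a nested loop: every probe definition is combined with every blackbox_exporter instance, producing one job per pair, named zone_name. The relabel rules then move the original target into the target query parameter and point the scrape address at the exporter. A minimal Python sketch of that logic follows. build_jobs and probe_url are hypothetical helper names of mine, and the resulting URL assumes Prometheus's default http scheme.

```python
from urllib.parse import urlencode


def build_jobs(probes, instances):
    """One job per (probe definition, exporter instance) pair, as in the template."""
    jobs = []
    for probe in probes:
        for inst in instances:
            jobs.append({
                "job_name": f'{inst["zone"]}_{probe["name"]}',
                "module": probe["name"],
                "targets": list(probe["target"]),
                # After relabeling, __address__ is replaced by this exporter address.
                "exporter": inst["address"],
                "labels": {l["key"]: l["val"] for l in inst.get("labels", [])},
            })
    return jobs


def probe_url(exporter, module, target):
    """The request the exporter effectively receives for one target."""
    query = urlencode({"module": module, "target": target})
    return f"http://{exporter}/probe?{query}"


if __name__ == "__main__":
    probes = [{"name": "http_2xx", "method": "GET",
               "target": ["https://rancher.xxx.xxx/"]}]
    instances = [
        {"zone": "gz_local", "address": "10.199.161.172:9115"},
        {"zone": "shanghai", "address": "192.168.48.129:9115"},
    ]
    for job in build_jobs(probes, instances):
        for target in job["targets"]:
            print(job["job_name"], probe_url(job["exporter"], job["module"], target))
```

    This also makes the cost visible: the number of jobs is the number of probe definitions times the number of probe points, which is why the generated Prometheus file grows quickly.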

    4) Deploying blackbox_exporter

    blackbox_exporter runs in the Kubernetes cluster of each region. The Deployment and Service are as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: blackbox-exporter
  namespace: monitoring
  labels:
    app: blackbox-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: blackbox-exporter
  template:
    metadata:
      labels:
        app: blackbox-exporter
    spec:
      containers:
      - name: blackbox-exporter
        env:
        - name: randomstr
          value: "12345678"
        image: harbor.xxx.local/library/blackbox_exporter:v0.21.1
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9115
        resources:
          limits:
            cpu: 1
            memory: 2Gi
          requests:
            cpu: 100m
            memory: 500Mi
        volumeMounts:
        - name: time-zone
          mountPath: /etc/localtime
        - name: certs
          mountPath: /etc/ssl/certs
        - name: config
          mountPath: /etc/blackbox_exporter/config.yaml
          subPath: config.yaml
          readOnly: true
      volumes:
      - name: time-zone
        hostPath:
          path: /etc/localtime
      - name: certs
        hostPath:
          path: /etc/ssl/certs
      - name: config
        configMap:
          name: blackbox-exporter-config
          defaultMode: 0755
---
apiVersion: v1
kind: Service
metadata:
  name: blackbox-exporter-svc
  namespace: monitoring
spec:
  selector:
    app: blackbox-exporter
  ports:
  - port: 9115
    targetPort: 9115
  type: ClusterIP

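        Note: once blackbox_exporter runs in a container, HTTPS probes fail if the system's default CA certificates are not available inside the container. Mounting the host's /etc/ssl/certs directory into the container solves this.

        With that, the deployment is complete.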

III. Testing

    1) Publishing the endpoints to probe

    The following etcd key prefixes are used:

  • /prom/local/blackbox/prod
  • /prom/local/blackbox_instance/prod
  • /prom/shanghai/blackbox_instance/prod

// Add two GET probe endpoints with no authentication and no headers:
$ etcdctl --user user:xxx --endpoints=xxx.xxx.xxx.xxx:2379 put /prom/local/blackbox/prod/http_2xx '{"target":["http://airflow.xxx.xxx","https://rancher.xxx.xxx/"],"name":"http_2xx","method":"GET"}'

// Add two GET probe endpoints with headers:
$ etcdctl --user user:xxx --endpoints=xxx.xxx.xxx.xxx:2379 put /prom/local/blackbox/prod/http_2xx_02 '{"target":["https://www.baidu.com/","127.0.0.1:9115"],"name":"http_2xx_02","method":"GET","headers":[{"key":"Content-Type","val":"application/json"},{"key":"Authorization","val":"eyJ0eXAxxxbGciOiJIUzI1NiJ9.eyJ1aWQiOiJ4YWxlcnQtcGxhdGZvcm0iLCJleHAiOjMyMzMxMzM3NDMsImlhdCI6MTY1NjMzMzc0M30.qu4C-6LHrDGGFNbNjLvxOgaOMp3c_50L_rxMKHjzHMM"}]}'

// Add two POST probe endpoints with header authentication and a body (01):
$ etcdctl --user user:xxx --endpoints=xxx.xxx.xxx.xxx:2379 put /prom/local/blackbox/prod/http_post_2xx_demo1 '{"target":["http://xxx.xxx.com/api/feishu/budget/1656408509608839/approve","http://xxx.xxx.com/api/feishu/budget/1656408509608849/approv123"],"name":"http_post_2xx_demo1","method":"POST","headers":[{"key":"Content-Type","val":"application/json"},{"key":"Authorization","val":"eyJ0eXAiOiJqd3QiLxxxiJIUzI1NiJ9.eyJ1aWQiOiJ4YWxlcnQtcGxhdGZvcm0iLCJleHAiOjMyMzMxMzM3NDMsImlhdCI6MTY1NjMzMzc0M30.qu4C-6LHrDGGFNbNjLvxOgaOMp3c_50L_rxMKHjzHMM"}],"body":"{\"uid\":\"huangj19\"}"}'

// Add two POST probe endpoints with header authentication and a body (02):
$ etcdctl --user user:xxx --endpoints=xxx.xxx.xxx.xxx:2379 put /prom/local/blackbox/prod/http_post_2xx_demo2 '{"target":["http://xxx.xxx.com/api/feishu/budget/1656408509608839/approve","http://xxx.xxx.com/api/feishu/budget/1656408509608849/approv123"],"name":"http_post_2xx_demo2","method":"POST","headers":[{"key":"Content-Type","val":"application/json"},{"key":"Authorization","val":"eyJ0eXAiOiJqd3QiLCxxxI1NiJ9.eyJ1aWQiOiJ4YWxlcnQtcGxhdGZvcm0iLCJleHAiOjMyMzMxMzM3NDMsImlhdCI6MTY1NjMzMzc0M30.qu4C-6LHrDGGFNbNjLvxOgaOMp3c_50L_rxMKHjzHMM"}],"body":"{\"uid\":\"huangj19\", \"reason\": \"demo\"}"}'

// Add two POST probe endpoints with headers and basic_auth authentication:
$ etcdctl --user user:xxx --endpoints=xxx.xxx.xxx.xxx:2379 put /prom/local/blackbox/prod/http_post_2xx_demo3 '{"target":["http://xxx.xxx.com/api/feishu/budget/1656408509608839/approve","http://xxx.xxx.com/api/feishu/budget/1656408509608849/approv123"],"name":"http_post_2xx_demo3","method":"POST","headers":[{"key":"Host","val":"login.example.com"}], "basic_auth":[{"key":"username","val":"username"},{"key":"password","val":"password"}]}'

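// Field reference:
//   [required] target: list; the URLs of the endpoints to probe
//   [required] name: a unique custom name; it is also used as the blackbox module name
//   [required] method: HTTP request method, GET or POST
//   [optional] headers: list of key/val objects; HTTP request headers
//   [optional] body: JSON key/value data passed as a string; HTTP request body
//   [optional] basic_auth: username and password for HTTP basic authentication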
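
    Writing these JSON values by hand is error-prone, especially body: it is a JSON document stored inside a JSON string, so its inner quotes have to be escaped or the whole value stops being valid JSON. A small Python sketch that builds and checks a value before it goes to etcdctl might look like this. make_probe_value and validate_probe_value are hypothetical helpers of mine, and the checks simply follow the field list above.

```python
import json

REQUIRED_FIELDS = ("target", "name", "method")


def make_probe_value(name, targets, method, headers=None, body=None, basic_auth=None):
    """Build the JSON string to store under /prom/local/blackbox/prod/<name>."""
    record = {"target": list(targets), "name": name, "method": method}
    if headers:
        record["headers"] = [{"key": k, "val": v} for k, v in headers.items()]
    if body is not None:
        # The body is kept as a string; the outer json.dumps escapes its quotes.
        record["body"] = json.dumps(body, separators=(",", ":"))
    if basic_auth:
        username, password = basic_auth
        record["basic_auth"] = [
            {"key": "username", "val": username},
            {"key": "password", "val": password},
        ]
    return json.dumps(record)


def validate_probe_value(raw):
    """Return a list of problems; an empty list means the value looks usable."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    errors = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in record]
    if "target" in record and not isinstance(record["target"], list):
        errors.append("target must be a list")
    if "method" in record and record["method"] not in ("GET", "POST"):
        errors.append("method must be GET or POST")
    return errors


if __name__ == "__main__":
    value = make_probe_value(
        "http_post_2xx_demo1",
        ["http://xxx.xxx.com/api/feishu/budget/1656408509608839/approve"],
        "POST",
        headers={"Content-Type": "application/json"},
        body={"uid": "huangj19"},
    )
    print(value)
    print(validate_probe_value(value))
```

    The printed string can be passed as the single-quoted argument of etcdctl put.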

    2) Registering the blackbox_exporter probe-point instances:

// Register the blackbox_exporter instances of two zones:
$ etcdctl --user user:xxx --endpoints=xxx.xxx.xxx.xxx:2379 put /prom/local/blackbox_instance/prod/10.199.161.172:9115 '{"zone":"gz_local","address":"10.199.161.172:9115","labels":[{"key":"zone","val":"gz_local"},{"key":"service","val":"blackbox_exporter"}]}'
$ etcdctl --user user:xxx --endpoints=xxx.xxx.xxx.xxx:2379 put /prom/shanghai/blackbox_instance/prod/192.168.48.129:9115 '{"zone":"shanghai","address":"192.168.48.129:9115","labels":[{"key":"zone","val":"shanghai"},{"key":"service","val":"blackbox_exporter"}]}'

// Field reference:
//   [required] zone: the zone (region) of the blackbox instance
//   [required] address: the ip:port of the blackbox instance
//   [optional] labels: custom labels for the blackbox instance
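
    The instance entries follow a fixed pattern: the key ends with the instance's ip:port, and the value repeats that address along with the zone. As a rough sketch, the key/value pair could be generated instead of typed. make_instance_entry and split_address are hypothetical names, and the key layout /prom/<region>/blackbox_instance/prod/<address> is inferred from the two commands above.

```python
import json


def split_address(address):
    """Split 'host:port' and make sure the port is a valid number."""
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit() or not 0 < int(port) < 65536:
        raise ValueError(f"expected ip:port, got {address!r}")
    return host, int(port)


def make_instance_entry(region, zone, address, labels=None):
    """Return the (etcd key, JSON value) pair for one blackbox_exporter instance."""
    split_address(address)  # raises ValueError on a malformed address
    key = f"/prom/{region}/blackbox_instance/prod/{address}"
    record = {"zone": zone, "address": address}
    if labels:
        record["labels"] = [{"key": k, "val": v} for k, v in labels.items()]
    return key, json.dumps(record)


if __name__ == "__main__":
    key, value = make_instance_entry(
        "local", "gz_local", "10.199.161.172:9115",
        labels={"zone": "gz_local", "service": "blackbox_exporter"},
    )
    print(key)
    print(value)
```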

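IV. Dashboards and Alerting

[Figure: monitoring dashboard]

    As the figure shows, the dashboard gives a fairly clear view of link quality and availability for each endpoint. Once alerting rules are configured, alert notifications follow. The details are omitted here.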

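V. Integrating with Business Teams Requires a Front End

    So what to use here? I had a look around GitHub and found a very nice open-source project: https://github.com/starsliao/ConsulManager

    This open-source project manages everything through Consul. I trimmed its feature set and did some light customization on top of it. The result lets you test a business endpoint on a web page and then add it as a monitored item, so the configuration is pushed out automatically: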

[Figures: screenshots of the customized front end]

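    Of course, the Grafana dashboard can also be embedded in the page, which makes it convenient to view.

[Figure: Grafana dashboard embedded in the page]

    Since that open-source project is based on Consul, I am considering replacing the etcd + confd auto-discovery with Consul + confd. Prometheus natively supports Consul for service discovery, so this also takes care of the overly long Prometheus configuration file. Neat!

    I won't show the Consul + confd setup or the front-end and back-end modifications here, as they still need some patching up. Even so, I think this approach makes for a perfectly usable probing system, and I recommend it for your reference.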

Disclaimer: the views expressed in this article do not represent the position of this site. Permalink: https://eyangzhen.com/20852.html
