Kubernetes APM Distributed Tracing with SkyWalking

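With the growth of RPC frameworks, microservices, cloud computing, and big data, businesses have grown in both scale and depth. A single business flow may span many modules, services, and containers and depend on a growing number of middleware components. A failure at any one node can cause fluctuations or errors in the business, which makes service-quality monitoring and fault diagnosis and localization very complex. This gave rise to a new monitoring model: call-chain tracing systems, known as APM (application performance monitoring).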

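Among the many excellent open-source APM products, SkyWalking and Pinpoint stand out. Both instrument applications through bytecode injection, so they require no changes to application code. A comparison of the two follows: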

An earlier post covered deploying Pinpoint with plain Docker (docker-compose), which can serve as a reference. This section covers deploying SkyWalking on Kubernetes.

1. Helm 3

curl -LO https://get.helm.sh/helm-v3.2.4-linux-amd64.tar.gz
tar -zxf helm-v3.2.4-linux-amd64.tar.gz
cp linux-amd64/helm /usr/local/bin/helm3
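
As a quick sanity check after the copy, a small sketch like the one below confirms that the renamed binary is on the PATH and reports its version. The function name check_helm3 is made up for this example; `version --short` is the regular Helm 3 subcommand, run through the helm3 binary installed above.

```shell
#!/bin/sh
# Hypothetical helper: verify that the helm3 binary copied above is usable.
check_helm3() {
  if ! command -v helm3 >/dev/null 2>&1; then
    echo "helm3 not found on PATH" >&2
    return 1
  fi
  # "helm version --short" prints a one-line version string.
  echo "helm3 OK: $(helm3 version --short)"
}
```

Calling `check_helm3` right after the installation should print a single line with the installed version.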

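2. Server side

SkyWalking uses the ES cluster of the EFK logging system as its backend storage. Note that its indices get a prefix so they can be told apart from the log indices.
For a detailed Elasticsearch cluster deployment, see the post on the Kubernetes EFK logging system.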

cd ~/k8s/helm/charts
git clone https://github.com/apache/skywalking-kubernetes.git
cd skywalking-kubernetes/chart
helm3 dep up skywalking
# Create the namespace
kubectl create ns skywalking
# Prepare the values file (see Helm Values)
vim skywalking/values.yaml
# Install the release, then list it
helm3 install skywalking skywalking -n skywalking --values ./skywalking/values.yaml
helm3 -n skywalking list
# Upgrade after changing values (the chart path is a required argument)
helm3 -n skywalking upgrade skywalking skywalking --values ./skywalking/values.yaml
# Uninstall the release when it is no longer needed
helm3 -n skywalking delete skywalking
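
After the install it helps to wait until the OAP and UI workloads are actually available. The sketch below assumes the release name and namespace used above (both "skywalking") and assumes the chart names its Deployments skywalking-oap and skywalking-ui; the function name sw_wait_ready is hypothetical. Adjust the names if `kubectl -n skywalking get deploy` shows something different.

```shell
#!/bin/sh
# Hypothetical helper: block until the SkyWalking Deployments are rolled out.
# Assumption: Deployments are named <release>-oap and <release>-ui.
sw_wait_ready() {
  ns="${1:-skywalking}"
  release="${2:-skywalking}"
  for component in oap ui; do
    kubectl -n "$ns" rollout status "deployment/${release}-${component}" --timeout=300s || return 1
  done
  # Show where the pods landed once everything is available.
  kubectl -n "$ns" get pods -o wide
}
```

Run `sw_wait_ready` with no arguments for the defaults, or pass a namespace and release name as the first and second argument.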

Helm Values

oap:
  name: oap
  dynamicConfigEnabled: false
  image:
    repository: apache/skywalking-oap-server
    tag: 8.1.0-es7
    pullPolicy: IfNotPresent
  storageType: elasticsearch7 # storage type: ES 7
  ports:
    grpc: 11800
    rest: 12800
  replicas: 2
  service:
    type: ClusterIP
  javaOpts: -Xmx2g -Xms2g
  antiAffinity: "soft"
  nodeAffinity: {}
  nodeSelector: {}
  tolerations: []
  resources: {}
  env:
    SW_NAMESPACE: "skywalking" # ES index prefix; the trailing underscore is added automatically, giving skywalking_
ui:
  name: ui
  replicas: 1
  image:
    repository: apache/skywalking-ui
    tag: 8.1.0
    pullPolicy: IfNotPresent
  ingress:
    enabled: true
    annotations: {}
    path: /
    hosts:
      - skywalking.boer.xyz # ingress host
    tls: []
elasticsearch:
  enabled: false # disable the bundled ES; we use the ES cluster of the EFK logging system
  config:
    port:
      http: 9200
    host: "elasticsearch-logging.logging.svc" # address of the logging system's ES
    user: "elastic"
    password: "<your-es-password>"
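
Before installing, the values can be checked with a dry render. The following is a minimal sketch, assuming it is run from skywalking-kubernetes/chart (the directory used in the install commands); the function name sw_render is hypothetical, while `template` and `--values` are standard Helm 3 options.

```shell
#!/bin/sh
# Hypothetical helper: render the chart locally without touching the cluster.
# Assumption: the current directory is skywalking-kubernetes/chart.
sw_render() {
  values_file="${1:-./skywalking/values.yaml}"
  helm3 template skywalking skywalking -n skywalking --values "$values_file"
}

# Example: confirm that the index prefix and the external ES host made it
# into the rendered manifests.
#   sw_render | grep -n -E 'SW_NAMESPACE|elasticsearch-logging'
```

If the grep shows neither string, the values file is probably mis-indented or not the one being passed in.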

3. Client side

Build the skywalking-agent image

cd ~/k8s/apps/skywalking-agent
tar -zxf apache-skywalking-apm-es7-8.1.0.tar.gz
# -r is needed because agent is a directory
cp -r apache-skywalking-apm-bin-es7/agent agent
vim Dockerfile # Prepare the Dockerfile (see Dockerfile)
docker build -t registry.boer.xyz/public/skywalking-agent:8.1.0 .
docker push registry.boer.xyz/public/skywalking-agent:8.1.0

Dockerfile

FROM busybox:latest
ENV LANG=C.UTF-8
WORKDIR /usr/skywalking/agent
COPY agent/ .

skywalking-agent configuration

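# vim agent/config/agent.config
# Comments sit on their own lines: in a .properties file a trailing "#" becomes part of the value.
# Service name: distinguishes services; set through an environment variable
agent.service_name=${SW_AGENT_NAME:Your_ApplicationName}
# Instance name: distinguishes replicas of one service; uses the Pod hostname
agent.instance_name=${HOSTNAME}
# Address of the OAP server
collector.backend_service=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:skywalking-oap.skywalking.svc:11800}
logging.file_name=${SW_LOGGING_FILE_NAME:skywalking-api.log}
logging.level=${SW_LOGGING_LEVEL:INFO}
logging.max_file_size=${SW_LOGGING_MAX_FILE_SIZE:31457280}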
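
The `${NAME:default}` placeholders above take the value of the environment variable NAME when it is set and fall back to the text after the colon otherwise. The sketch below only illustrates that lookup in shell; it is not the agent's implementation, and the function name resolve is made up for this example.

```shell
#!/bin/sh
# Illustration only: mimic the ${NAME:default} lookup used in agent.config.
# resolve NAME DEFAULT -> prints $NAME if the variable is set, else DEFAULT.
# NAME must be a plain variable name, since it is passed through eval.
resolve() {
  if eval "[ \"\${$1+set}\" = set ]"; then
    eval "printf '%s\n' \"\${$1}\""
  else
    printf '%s\n' "$2"
  fi
}

# Without the variable, the default from the config file is used.
unset SW_AGENT_NAME
resolve SW_AGENT_NAME Your_ApplicationName

# With the variable set (as a Deployment's env section would do), it wins.
SW_AGENT_NAME=springboot-produce
resolve SW_AGENT_NAME Your_ApplicationName
```

This is why a single agent image can serve every application: each Deployment only needs to set SW_AGENT_NAME, and the backend address keeps its default unless SW_AGENT_COLLECTOR_BACKEND_SERVICES is provided.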

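4. Usage example

There are two common ways to use skywalking-agent:

  • Build the agent package into an existing base image
  • Copy the agent in through an initContainer

The initContainer approach copies skywalking-agent into the application Pod without changing the base JVM image, so it is the recommended one: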

apiVersion: apps/v1
kind: Deployment
metadata:
  name: produce-deployment
  annotations:
    kubernetes.io/change-cause: <CHANGE_CAUSE>
spec:
  selector:
    matchLabels:
      app: produce
  replicas: 2
  template:
    metadata:
      labels:
        app: produce
    spec:
      initContainers:
      # Copies the agent from the agent image into the shared emptyDir volume
      - image: registry.boer.xyz/public/skywalking-agent:8.1.0
        name: skywalking-agent
        imagePullPolicy: IfNotPresent
        command: ['sh']
        args: ['-c','cp -r /usr/skywalking/agent/* /skywalking/agent']
        volumeMounts:
        - mountPath: /skywalking/agent
          name: skywalking-agent
      containers:
      - name: produce
        image: <IMAGE>:<IMAGE_TAG>
        imagePullPolicy: IfNotPresent
        volumeMounts:
        # Same volume, mounted where -javaagent expects the jar
        - mountPath: /usr/skywalking/agent
          name: skywalking-agent
        ports:
        - containerPort: 10080
        resources:
          requests:
            memory: "512Mi"
            cpu: "200m"
          limits:
            memory: "1Gi"
            cpu: "600m"
        env:
        - name: ENVIRONMENT
          value: "pro"
        - name: SW_AGENT_NAME # SkyWalking service name
          value: "springboot-produce"
        - name: JVM_OPTS
          value: "-Xms512m -Xmx512m -javaagent:/usr/skywalking/agent/skywalking-agent.jar"
        livenessProbe:
          httpGet:
            path: /actuator/health
            port: 10080
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 5
        readinessProbe:
          httpGet:
            path: /actuator/health
            port: 10080
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 5
        lifecycle:
          preStop:
            exec:
              command:
              - "curl"
              - "-XPOST"
              - "http://127.0.0.1:10080/actuator/shutdown"
      imagePullSecrets:
      - name: regcred
      volumes:
      - name: skywalking-agent
        emptyDir: {}
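
Once the Deployment is applied, a check along the following lines can confirm that the initContainer finished and that the jar is visible inside the application container. It is a sketch under assumptions: the function name sw_check_agent is hypothetical, the container name produce and the jar path come from the manifest above, and the namespace is whatever the Deployment was applied to.

```shell
#!/bin/sh
# Hypothetical helper: verify the agent copy for one Pod.
# Usage: sw_check_agent <namespace> <pod-name>
sw_check_agent() {
  ns="$1"
  pod="$2"
  # The jar should be visible in the app container through the shared emptyDir.
  kubectl -n "$ns" exec "$pod" -c produce -- ls /usr/skywalking/agent/skywalking-agent.jar || return 1
  # Prints the termination reason of the first (and only) initContainer.
  kubectl -n "$ns" get pod "$pod" \
    -o jsonpath='{.status.initContainerStatuses[0].state.terminated.reason}'
}

# Example: pick the first Pod of the Deployment in the default namespace.
#   pod=$(kubectl -n default get pod -l app=produce -o jsonpath='{.items[0].metadata.name}')
#   sw_check_agent default "$pod"
```

If the jar is present but no traces show up, the application logs and the agent's own log file (skywalking-api.log, as configured earlier) are the next places to look. Note that JVM_OPTS only takes effect if the image's start command passes it to java.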

5. Index management for SkyWalking's ES storage

For details on ILM (index lifecycle management), see the post on the Kubernetes EFK logging system.

PUT _ilm/policy/skywalking-policy
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "2d",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      },
      "delete": {
        "min_age": "3d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

# index_patterns must match the SkyWalking index prefix, i.e. SW_NAMESPACE
PUT _template/skywalking-template
{
  "index_patterns": ["skywalking_*"],
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 0,
    "index.lifecycle.name": "skywalking-policy",
    "index.refresh_interval": "30s",
    "index.translog.durability": "async",
    "index.translog.sync_interval": "60s"
  }
}
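
To see whether the prefix and the lifecycle policy took effect, the indices can be queried with curl. This is a sketch with assumptions: the function names are hypothetical, the ES address and user are the ones from the Helm values, the password is expected in ES_PASSWORD, and the service name only resolves from inside the cluster (use a port-forward or an in-cluster Pod otherwise).

```shell
#!/bin/sh
# Assumptions: ES address and user as in the Helm values; password in ES_PASSWORD.
ES_URL="${ES_URL:-http://elasticsearch-logging.logging.svc:9200}"
ES_USER="${ES_USER:-elastic}"

# List all indices carrying the SkyWalking prefix, sorted by name.
sw_list_indices() {
  prefix="${1:-skywalking}"
  curl -s -u "${ES_USER}:${ES_PASSWORD:-}" "${ES_URL}/_cat/indices/${prefix}_*?v&s=index"
}

# Show which ILM policy and phase each prefixed index is in.
sw_ilm_explain() {
  prefix="${1:-skywalking}"
  curl -s -u "${ES_USER}:${ES_PASSWORD:-}" "${ES_URL}/${prefix}_*/_ilm/explain?pretty"
}
```

Keep in mind that an index template only applies to indices created after the template exists, so indices that SkyWalking created earlier will not show the policy in the explain output.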

6. The show



Source: https://www.boer.xyz/2020/08/16/k8s-apm-skywalking/
Author: boer
Published: August 16, 2020