Improve data ingestion reliability and functionality 35/63435/1
author earrage <eddie.arrage@huawei.com>
Wed, 10 Oct 2018 18:54:51 +0000 (11:54 -0700)
committer Eddie Arrage <eddie.arrage@huawei.com>
Fri, 12 Oct 2018 01:07:26 +0000 (01:07 +0000)
commit 29234dd20c49fe62734b723f1961c70ac6f1db08
tree 8640c5b37db1e88c82e828f250ccfcdc04b8f26f
parent ee2169ee4b8fb3539ad173fbc1557b54b2f2216f
Improve data ingestion reliability and functionality

- Change the deployment namespace to clover-system and account
for cassandra moving to the clover-system namespace
- Increase the k8s compute resources assigned to cassandra to
address performance issues
- Add fields (user-agent, request/response size, status codes)
to the span schema definition and modify primary keys
- Improve exception handling to prevent the collect process from
crashing
- Minor changes to support tracing/monitoring with Istio 1.0
- Suppress logging of debug messages
- Increase the look-back time and the number of traces fetched in
each sampling interval, because the Jaeger REST interface
returns trace data out of order under load
(tested to 300 conn/sec; 12K connections currently)
- Insert traces into cassandra in batch mode
- Read the visibility services to analyze from redis rather than
using defaults (cloverctl, UI or clover-controller REST will set)
- Remove local directory copies in the docker build, as the image
is based on the base clover container
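
Illustrative sketch only, not the code in collect.py: one way the
collection changes listed above (exception handling around each sampling
interval, a longer look-back window for out-of-order traces, and batched
inserts) could fit together. All names here (chunked, collect_once, run,
fetch, insert_batch, the traceID key) are hypothetical, and de-duplicating
on trace ID is an assumption about how an overlapping look-back window
would avoid re-inserting traces. The fetch and insert_batch callables stand
in for the Jaeger REST query and the cassandra batch write.

```python
import logging
from typing import Callable, Iterable, Iterator, List, Set

logging.basicConfig(level=logging.INFO)  # debug messages are not emitted
log = logging.getLogger("collect_sketch")

Trace = dict
Fetch = Callable[[float, float, int], List[Trace]]  # (start_s, end_s, limit)
InsertBatch = Callable[[List[Trace]], None]


def chunked(rows: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def collect_once(fetch: Fetch, insert_batch: InsertBatch, seen: Set[str],
                 now_s: float, lookback_s: float = 60.0,
                 limit: int = 1000, batch_size: int = 100) -> int:
    """Run one sampling interval; return the number of new traces stored."""
    # Query a window that overlaps earlier intervals, so traces that the
    # backend returns late or out of order are still picked up.
    traces = fetch(now_s - lookback_s, now_s, limit)
    # Keep only traces not stored before (and drop repeats within this fetch).
    fresh = {t["traceID"]: t for t in traces if t["traceID"] not in seen}
    stored = 0
    for batch in chunked(list(fresh.values()), batch_size):
        insert_batch(batch)  # one write per batch instead of one per trace
        seen.update(t["traceID"] for t in batch)
        stored += len(batch)
    return stored


def run(intervals: Iterable[float], fetch: Fetch, insert_batch: InsertBatch,
        lookback_s: float = 60.0, limit: int = 1000,
        batch_size: int = 100) -> int:
    """Process each sampling time; a failed interval is logged, not fatal."""
    seen: Set[str] = set()  # grows without bound in this sketch
    total = 0
    for now_s in intervals:
        try:
            total += collect_once(fetch, insert_batch, seen, now_s,
                                  lookback_s, limit, batch_size)
        except Exception:
            # Log the traceback and continue with the next interval.
            log.exception("collect interval at %s failed; continuing", now_s)
    return total
```

In this sketch a trace is marked as seen only after its batch has been
written, so a batch that fails is retried in a later interval as long as
the trace still falls inside the look-back window.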

Change-Id: Ibae98ef5057e52a6eeddd9ebbcfaeb644caec36c
Signed-off-by: earrage <eddie.arrage@huawei.com>
clover/collector/db/cassops.py
clover/collector/db/redisops.py
clover/collector/docker/Dockerfile
clover/collector/grpc/collector_client.py
clover/collector/grpc/collector_server.py
clover/collector/process/collect.py
clover/collector/yaml/manifest.template
clover/tools/yaml/cassandra.yaml