Define ILM policy for Kibana APM data #124147

Open
lizozom opened this issue Jan 31, 2022 · 4 comments
@lizozom
Contributor

lizozom commented Jan 31, 2022

We recently started collecting APM data for Kibana and the Kibana front end.
At this stage we collect it for a subset of our monitoring deployments, but the longer-term goal is to sample APM stats for all customer deployments, so that we can monitor them better and troubleshoot performance issues in production.

Kibana APM data size

  • We are currently sampling 10% of kibana-frontend transactions
  • kibana transactions are not sampled (100% reported)
  • Since we are running pre-8.x, unsampled transactions are not dropped (we should change that)

In the us-east-1 region, kibana generates ~5m records a day. kibana-frontend generates a negligible number of records (usage is low for these clusters). Given an average document size of 1.5 KB, the Kibana APM data comes to about 7.5 GB per day for a single region.
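
The estimate above is simple arithmetic and can be reproduced with a minimal sketch. The record count and document size are the figures quoted in this issue; the function name and the use of decimal units (1 GB = 1,000,000 KB) are assumptions made for illustration.

```python
# Back-of-the-envelope estimate using the figures quoted in this issue.
RECORDS_PER_DAY = 5_000_000  # ~5m kibana APM records per day (us-east-1)
AVG_DOC_SIZE_KB = 1.5        # average document size


def daily_volume_gb(records_per_day, avg_doc_kb):
    """Estimated daily volume in GB, assuming decimal units (1 GB = 1,000,000 KB)."""
    return records_per_day * avg_doc_kb / 1_000_000


kibana_gb_per_day = daily_volume_gb(RECORDS_PER_DAY, AVG_DOC_SIZE_KB)
print(f"{kibana_gb_per_day:.1f} GB/day")  # 7.5 GB/day
```

Changing the sampling rate or rolling this out to more regions scales the result linearly, so the same helper can be used to project the volume for customer deployments.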

For reference, the allocator generates ~680m records a day (>100 GB a day) in the same region, so the Kibana data is negligible in size compared to the rest of the data in these indices.

ILM Policy

While it's important to set up an ILM policy for Kibana APM data, its size is negligible in comparison to other services, so we can ignore this for now. In the longer term, the cloud observability team plans to move all data older than 7 days into searchable snapshots.
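
For discussion, here is a rough sketch of what a policy along those lines could look like, written as a Python helper that builds the JSON body for Elasticsearch's `PUT _ilm/policy/<name>` API. Nothing here is an agreed policy: the function name, the daily rollover in the hot phase, and the repository name `example-repo` are all hypothetical. Only the 7-day threshold comes from the plan mentioned above.

```python
import json


def build_ilm_policy(snapshot_repository, cold_after="7d"):
    """Build an ILM policy body that moves older data into searchable snapshots.

    Assumptions (not decided in this issue): the daily rollover in the hot
    phase and the repository name passed in by the caller.
    """
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Hypothetical rollover condition, for illustration only.
                        "rollover": {"max_age": "1d"}
                    }
                },
                "cold": {
                    # With rollover enabled, min_age is counted from rollover time.
                    "min_age": cold_after,
                    "actions": {
                        "searchable_snapshot": {
                            "snapshot_repository": snapshot_repository
                        }
                    },
                },
            }
        }
    }


# "example-repo" is a placeholder repository name.
policy = build_ilm_policy("example-repo")
print(json.dumps(policy, indent=2))
```

The printed body could then be sent with the client of your choice. Keeping the threshold as a parameter makes it easy to try a different retention per service if we end up with per-service policies.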

Some interesting questions to consider

Can we define an ILM policy per service?

Once we upgrade to 8.x and use data streams, each service will have its own stream. We would then be able to control each stream's ILM policy separately, if we choose to.

What should the ILM policy for the Kibana data be? Who owns it?

Need to identify owners

How do we make sure that this policy is defined on all monitoring APM servers?

How are those deployed across the monitoring clusters?

@lizozom self-assigned this on Jan 31, 2022
@botelastic (bot) added the needs-team label on Jan 31, 2022
@lizozom
Contributor Author

lizozom commented Feb 2, 2022

@nikulinivan Do you know who was involved in defining the ILM policy for APM data?

@simitt
Contributor

simitt commented Feb 10, 2022

> Once we upgrade to 8.x and use data streams, each service will have its own stream.

Data from all services will generally end up in the same data streams for traces* and logs (errors). The exceptions are metrics, which are sent to per-service data streams, and trace events collected by RUM clients, which are stored in traces-apm.rum_traces*.

See apm-data-streams for more details.

@lizozom
Contributor Author

lizozom commented Feb 14, 2022

Thanks for the input.

As we start collecting data from customer deployments, I think we'll need to be able to customize this, but at the moment it's not a high priority. 🙏🏻

@cjcenizal added the Team:APM label (marked "DEPRECATED Use Team:obs-ux-infra_services.") on Feb 14, 2022
@elasticmachine
Contributor

Pinging @elastic/apm-ui (Team:apm)

@botelastic (bot) removed the needs-team label on Feb 14, 2022