How to create fast LogQL queries to filter terabytes of logs per second with FusionReactor


Performance matters when you retrieve or analyze data. The ability to create fast queries that filter terabytes of logs is therefore critical: retrieval is only as fast as the queries behind it.

When queries are optimized for performance, they save time for developers and database administrators alike.

Many query languages can scan logs and retrieve data from a database. However, when it comes to logging, LogQL is the preferred query language for FusionReactor.

FusionReactor with LogQL makes it easy to create fast queries for filtering terabytes of logs per second. This article will detail the key concepts and give you simple tips for creating quick queries in seconds.

There are three main types of filters that you can use to create quick queries in FusionReactor:

  • Label matchers
  • Line filters
  • Label filters

The main function of a label matcher is to restrict a query's scan to a specific set of log streams. Workloads such as cluster, container, and namespace are identified by labels, which makes it easy to slice the data along several dimensions.

For example, workloads can be scoped to:

  • An application within all clusters
  • An application within the developers’ namespace across all clusters
  • The production namespace
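As a sketch, scoping a query to such a workload is done with label matchers in a stream selector; the label values below (us-central1, dev, myapp) are illustrative, not taken from a real deployment:

```logql
{cluster="us-central1", namespace="dev", container="myapp"}
```

Every query starts from a selector like this; the more selective the matchers, the less data the rest of the query has to touch.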

Furthermore, it is important to know that an equality matcher helps reduce the set of logs to just the streams you intend to search. You can think of label matchers as your first line of defense when creating quick queries that filter large volumes of logs per second. Good label hygiene in your application is therefore very important when using FusionReactor.

Also, to avoid pulling all of the index data into your query, you will need at least one equality matcher, such as cluster="us-central1". For cases where a single equality matcher is not enough, a single regex matcher such as container=~"promtail" can complete the task in FusionReactor.

Let’s take a look at some bad and good examples of matchers:

Bad:

job=~".*/queue"

namespace!~"dev-.*"

Good:

cluster="us-central1"

container="istio"

cluster="us-central1", container=~"agent"

When creating ultra-fast queries that filter terabytes of logs per second, line filters do the heavy lifting. They can be considered the second line of defense, though they are not strictly necessary. If a log line contains (|=) or does not contain (!=) a given string, a line filter does the job, and you can also use RE2 regexes to match (|~) or exclude (!~) a pattern.
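To summarize the four line-filter operators, here is a sketch of each attached to a hypothetical stream selector (each line is a separate query):

```logql
{cluster="us-central1"} |= "error"
{cluster="us-central1"} != "timeout"
{cluster="us-central1"} |~ "error|fatal"
{cluster="us-central1"} !~ "debug.*"
```

|= keeps lines containing the string and != drops them; |~ keeps lines matching the RE2 regex and !~ drops them.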

Order matters when using line filters. You can quickly and easily chain multiple line filters in a query: put the most selective filters at the front, and apply the slower regex filters afterwards. String equality and inequality filters are cheaper than regexes, so use them on as many lines as possible.

As a rule of thumb, some regexes turn out to be unnecessary: FusionReactor can simplify a pattern like |~ "error|fatal" into plain equality checks on "error" and "fatal", so that no regex is executed at all.

Once you know what you are looking for, a good way to start is to add a line filter that matches your search, for example |= "error". You can then add inequality filters to remove unwanted lines, until you end up with something like |= "error" != "timeout" != "canceled" |~ "failed.*" != "memcached".

Don’t be surprised if most of your errors turn out to be cache-related; if so, move that filter to the front: != "memcached" |= "error" != "timeout" != "canceled" |~ "failed.*". This solves the problem by running the subsequent filters less often.

Another useful feature of line filters is that you can search for trace IDs, IP addresses, and identifiers in general. Consider this query: namespace="prod" |= "traceID=2e2er8923100", which is a good query. We can build on it by adding |~ "/api/v.+/query" after the ID filter, so that the regex does not execute against every line in the production namespace.
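Put together as a complete query, with the stream selector written in braces, that example looks like this:

```logql
{namespace="prod"} |= "traceID=2e2er8923100" |~ "/api/v.+/query"
```

Because the cheap trace-ID filter runs first, the regex only executes on the few lines that survive it.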

Label filters are usually the slowest of the three filter types. It is recommended to use them when the other two filters cannot help, or when you are building complex comparisons that require extracting label values and converting them to another format.

Moreover, with parsers like | json or | logfmt, you can use label filters on labels that are extracted at query time rather than stored in the index. For example, job="ingress/nginx" | json | status_code >= 400 and cluster="us-central2" works fine. In our previous post, we explained that among parsers, | json and | logfmt are fast, while the regex parser is slow. It is therefore important, when using a parser, to pre-filter your lines with a line filter to speed up the query and save time. Let’s take a look at this example from our Go application with FusionReactor; we can track all the logs emitted at this file and line number – caller=metrics.go:83

level=info ts=2020-12-07T21:03:22.885781801Z caller=metrics.go:83 org_id=29 traceID=4078dafcbc079822 latency=slow query="cluster="ops-tools1",job="loki-ops/querier" != "logging.go" != "metrics.go" |= "recover"" query_type=filter range_type=range length=168h0m1s step=5m0s duration=54.82511258s status=200 throughput=8.3GB total_bytes=454GB

It’s a little different when you want to filter slow requests. Since we know the file and line that log the latency, using them as a line filter before parsing is the best option:

{namespace="loki-ops",container="query-frontend"} |= "caller=metrics.go:83" | logfmt | throughput > 1GB and duration > 10s and org_id=29
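The same pattern generalizes to your own applications: select streams with label matchers, pre-filter with a cheap line filter, parse, then apply label filters. A sketch, using hypothetical label values and a hypothetical log call site (handler.go:42):

```logql
{namespace="myapp", container="api"} |= "caller=handler.go:42" | logfmt | duration > 5s
```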


