document how to use Kiali diagnostics for measuring performance · Issue #8449 · kiali/kiali · GitHub

Open
jmazzitelli opened this issue May 22, 2025 · 8 comments · May be fixed by kiali/kiali.io#881

@jmazzitelli
Collaborator

We should have a page on kiali.io that explains how a user can use Kiali diagnostics to help figure out performance issues. I'm thinking we document things like:

  1. Instructions on how to enable Kiali trace logging.
  2. Instructions on how to enable Kiali logging in json format for easier querying and filtering (via jq or things like that).
  3. Some helpful jq queries (for json logs) and grep expressions (for text logs) that can find different things in the logs (like metric timings and API request times).
  4. Some helpful Prometheus queries to query Kiali metrics.
  5. The pprof stuff (we have some documentation somewhere, I just don't remember where).
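For items 1 and 2, a minimal sketch of what the instructions might boil down to when Kiali is installed through the operator. It assumes the logger settings live in the Kiali CR under `spec.deployment.logger` as `log_level` and `log_format`; that layout, the helper name `kiali_logger_patch`, and the CR name and namespace in the comment are assumptions to check against the Debugging Kiali page for your version.

```shell
# Build a JSON merge patch that sets the Kiali logger level and format.
# ASSUMPTION: the keys are spec.deployment.logger.log_level and
# spec.deployment.logger.log_format; verify them for your Kiali version.
kiali_logger_patch() {
  level=$1   # e.g. trace
  format=$2  # e.g. json or text
  printf '{"spec":{"deployment":{"logger":{"log_level":"%s","log_format":"%s"}}}}\n' \
    "$level" "$format"
}

# Print the patch so it can be reviewed before it is applied.
kiali_logger_patch trace json

# Applying it would look roughly like this (CR name and namespace are
# hypothetical and depend on the installation):
#   kubectl patch kiali kiali -n istio-system --type merge \
#     -p "$(kiali_logger_patch trace json)"
```

Printing the patch first keeps the sketch runnable without a cluster and makes it easy to adapt if the keys differ.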

There might be other stuff - comments welcome on what we should have in these docs.

Not sure what the title of this doc page should be or where it should be under kiali.io. Suggestions welcome.

@nrfox
Contributor
nrfox commented May 22, 2025

Here's what I think would be very helpful to have. For a given request, like /api/namespaces/graph, how long does it take prom queries to run and how long did graph generation take. Even if Kiali only clearly logged those two things we could rule out whether slowdowns were happening in prom or slowdowns were happening in Kiali itself.

kubectl logs <kiali-pod> | grep request-id=d0n7gecvl4ec739vmneg

should show this.

@jmazzitelli
Collaborator Author

Here's what I think would be very helpful to have. For a given request, like /api/namespaces/graph, how long does it take prom queries to run and how long did graph generation take. Even if Kiali only clearly logged those two things we could rule out whether slowdowns were happening in prom or slowdowns were happening in Kiali itself.

kubectl logs <kiali-pod> | grep request-id=d0n7gecvl4ec739vmneg

should show this.

We'd want to also document for the user what this request-id is and, more importantly, how to get one (they would need to look at the logs, find the request-ids somehow and pick an interesting one - one related to the graph generation, for example). So we'd want to document that portion too - it won't be enough just to say "grep for a request-id" because the first question they will ask is, "what request-id do I search for?"

One way I am thinking of documenting this is to have them look for route=GraphNamespaces (e.g. if they care about the graph generation performance) and in the results you can see all the logs for that route, and all the request-ids for them. From that list of request-ids, they can pick one. For example, if the logs are in json:

kubectl logs -n istio-system deployments/kiali | jq -R 'fromjson? | select(.route == "GraphNamespaces") | .["request-id"]' | sort -u

will output all the request-ids that requested a graph.

If the logs are in text (which is our default), then this does the same thing:

kubectl logs -n istio-system deployments/kiali | grep 'route=GraphNamespaces' | sed -n 's/.*request-id=\([^ ]*\).*/\1/p' | sort -u

Both return a list like this (the jq variant prints the IDs as quoted JSON strings; the grep/sed variant prints them bare):

"d0n8a6sa8p9s73d224sg"
"d0n8a94a8p9s73d225g0"
"d0n8a9ca8p9s73d225lg"

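Once a request-id has been picked from a list like the one above, a small filter can reduce that request's log lines to just the timers. This is only a sketch for text-format logs: it assumes trace lines carry `timer=` and `duration=` keys, as in the trace output quoted later in this thread, and the function name and abridged sample lines are made up for illustration.

```shell
# Print "<timer> <duration>" for every text-format log line of one request.
# Reads logs on stdin; $1 is the request-id to look for.
timers_for_request() {
  awk -v want="request-id=$1" '
    {
      hit = 0; timer = ""; dur = ""
      for (i = 1; i <= NF; i++) {
        if ($i == want)             hit = 1
        else if ($i ~ /^timer=/)    timer = substr($i, 7)
        else if ($i ~ /^duration=/) dur = substr($i, 10)
      }
      gsub(/"/, "", dur)   # some durations are logged inside quotes
      if (hit && timer != "") print timer, dur
    }'
}

# Two abridged sample lines standing in for real "kubectl logs" output.
sample_logs() {
  cat <<'EOF'
2025-05-22T18:43:17Z TRC Node graph generation time duration=14.553485ms group=graph request-id=d0nn0hdr7vqs73clfj40 route=GraphService timer=GraphGenerationTime
2025-05-22T18:44:35Z TRC Namespace graph appender time appender=workloadEntry duration="300.624µs" group=graph namespace=bookinfo request-id=d0nn14tr7vqs73clfja0 route=GraphNamespaces timer=GraphAppenderTime
EOF
}

sample_logs | timers_for_request d0nn0hdr7vqs73clfj40
# prints: GraphGenerationTime 14.553485ms
```

Against a live cluster the same function would sit at the end of the `kubectl logs -n istio-system deployments/kiali` pipeline. Comparing whole fields rather than grepping for a substring avoids matching an ID that merely starts with the one requested.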
Then people are going to ask "what are the different routes I can look at?"... here's how you can get those:

JSON:

kubectl logs -n istio-system deployments/kiali | jq -R 'fromjson? | select(.route) | .route' | sort -u

text:

kubectl logs -n istio-system deployments/kiali | grep -o 'route=[^ ]*' | cut -d= -f2 | sort -u

That will return a list like this (quoted in the jq output, unquoted in the text output):

"ClustersApps"
"Config"
"GraphNamespaces"
"MeshGraph"
"Status"

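Building on the route listing, it may also help to see which routes are hit most often before picking a request to inspect. The sketch below counts distinct request-ids per route in text-format logs; the function name and the sample lines are hypothetical, and it assumes every line of interest carries both `route=` and `request-id=`.

```shell
# Count distinct requests per route in text-format logs read from stdin.
# Output: "<route> <number of distinct request-ids>", busiest route first.
requests_per_route() {
  awk '{
      route = ""; id = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^route=/)           route = substr($i, 7)
        else if ($i ~ /^request-id=/) id = substr($i, 12)
      }
      if (route != "" && id != "") print route, id
    }' |
    sort -u |
    cut -d' ' -f1 |
    uniq -c |
    awk '{ print $2, $1 }' |
    sort -k2,2nr
}

# Made-up sample: two graph requests (one of them logging twice) and one
# status request.
sample_routes() {
  cat <<'EOF'
2025-05-22T18:44:35Z TRC first timer request-id=aaa route=GraphNamespaces timer=GraphAppenderTime
2025-05-22T18:44:35Z TRC second timer request-id=aaa route=GraphNamespaces timer=GraphGenerationTime
2025-05-22T18:44:40Z TRC first timer request-id=bbb route=GraphNamespaces timer=GraphAppenderTime
2025-05-22T18:44:41Z DBG status check request-id=ccc route=Status
EOF
}

sample_routes | requests_per_route
# prints:
#   GraphNamespaces 2
#   Status 1
```

Deduplicating on the (route, request-id) pair matters because a single request can emit many lines at trace level, so counting raw lines would overstate the busy routes.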
@jshaughn
Collaborator

Just a note that you can also look at the logs IN Kiali. I think we should encourage users to inspect Kiali from the Kiali workload itself. The logs tab has nice filtering and highlighting. Although, has anyone checked to see if the new structured logging looks decent?

@jmazzitelli
Collaborator Author

I forgot all about this docs page - we can just add to this rather than create a new one:

http://kiali.io/docs/configuration/debugging-kiali/

@nrfox
Contributor
nrfox commented May 22, 2025

"ClustersApps"
"Config"
"GraphNamespaces"
"MeshGraph"
"Status"

Can we log the actual route like /api/namespaces/graph? That way users can open their browser's dev console and cross reference the network calls being made to what is being logged. Otherwise these names seem a little arbitrary.

@nrfox
Contributor
nrfox commented May 22, 2025

Can we log the actual route like /api/namespaces/graph? That way users can open their browser's dev console and cross reference the network calls being made to what is being logged. Otherwise these names seem a little arbitrary.

Maybe that's what the URL handler is for and we can add that if the log level == trace. We'd probably want to exclude URLs for certain routes like the auth callback handlers if possible.

@jmazzitelli
Collaborator Author
jmazzitelli commented May 22, 2025

Can we log the actual route like /api/namespaces/graph? That way users can open their browser's dev console and cross reference the network calls being made to what is being logged. Otherwise these names seem a little arbitrary.

Those names are the actual Route names, and they are how we (devs) can correlate a log message back to its handler, e.g. https://github.com/kiali/kiali/blob/v2.10.0/routing/routes.go#L680

They get set here: https://github.com/kiali/kiali/blob/v2.10.0/routing/router.go#L370

We could add "route-pattern" that logs the Route.Pattern as defined here: https://github.com/kiali/kiali/blob/v2.10.0/routing/routes.go#L24 , e.g.

c = c.append(hlog.NewHandler(zerolog.With().Str("route", route.Name).Str("route-pattern", route.Pattern).Logger()))

These patterns (some of them anyway) have placeholders, so they will look something like this: "/api/namespaces/{namespace}/applications/{app}/versions/{version}/graph"
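To cross-reference a URL seen in the browser's network tab with a logged pattern, the placeholders have to be treated as wildcards. A rough sketch of that matching follows; it assumes each `{name}` stands for exactly one path segment and that patterns contain no custom per-placeholder regexes. The function names and the example query string are made up for illustration.

```shell
# Turn a route pattern into an anchored extended regex: each {placeholder}
# becomes "one path segment". Literal dots are escaped; other regex
# metacharacters are not expected in these patterns.
pattern_to_regex() {
  printf '%s\n' "$1" |
    sed -e 's|\.|\\.|g' -e 's|{[^}]*}|[^/]+|g' -e 's|^|^|' -e 's|$|$|'
}

# Succeed when the path part of a URL (query string ignored) fits the pattern.
url_matches_pattern() {
  path=${1%%\?*}
  printf '%s\n' "$path" | grep -Eq -- "$(pattern_to_regex "$2")"
}

pattern='/api/namespaces/{namespace}/applications/{app}/versions/{version}/graph'
url='/api/namespaces/bookinfo/applications/reviews/versions/v1/graph?duration=60s'

if url_matches_pattern "$url" "$pattern"; then
  echo "matches $pattern"
fi
# prints: matches /api/namespaces/{namespace}/applications/{app}/versions/{version}/graph
```

Anchoring the regex at both ends keeps a short pattern such as `/api/namespaces/graph` from matching longer URLs that merely contain it.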

I put that in my last PR that is in flight: #8425

I just tested it - things would look like this:

2025-05-22T18:43:17Z TRC Node graph generation time duration=14.553485ms graph-kind=node graph-type=workload group=graph inject-service-nodes=true request-id=d0nn0hdr7vqs73clfj40 route=GraphService route-pattern=/api/namespaces/{namespace}/services/{service}/graph timer=GraphGenerationTime
2025-05-22T18:44:35Z TRC Namespace graph appender time appender=workloadEntry duration="300.624µs" group=graph namespace=bookinfo request-id=d0nn14tr7vqs73clfja0 route=GraphNamespaces route-pattern=/api/namespaces/graph timer=GraphAppenderTime
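With lines in this shape, the slowest timers can be ranked directly from the text logs. The sketch below pulls `timer`, `duration`, and `route-pattern` out of each line and normalizes the duration to milliseconds. It assumes durations are single-unit values such as `14.553485ms`, `300.624µs`, or `2.5s`; compound values like `1m2.3s` are reported as -1 rather than guessed at. The function names are hypothetical.

```shell
# Rank timer lines from text-format logs (stdin) by duration, slowest first.
# Output columns (tab-separated): milliseconds, timer, route-pattern.
rank_timers() {
  awk '
    # Convert a single-unit duration to milliseconds; -1 means "not handled".
    function to_ms(d) {
      if (d ~ /^[0-9.]+ms$/)     { sub(/ms$/, "", d); return d + 0 }
      if (d ~ /^[0-9.]+ns$/)     { sub(/ns$/, "", d); return d / 1000000 }
      if (d ~ /^[0-9.]+(µ|u)s$/) { gsub(/[^0-9.]/, "", d); return d / 1000 }
      if (d ~ /^[0-9.]+s$/)      { sub(/s$/, "", d); return d * 1000 }
      return -1
    }
    {
      timer = ""; dur = ""; pat = ""
      for (i = 1; i <= NF; i++) {
        eq = index($i, "=")
        if (eq == 0) continue
        key = substr($i, 1, eq - 1)
        val = substr($i, eq + 1)
        gsub(/"/, "", val)
        if (key == "timer")              timer = val
        else if (key == "duration")      dur = val
        else if (key == "route-pattern") pat = val
      }
      if (timer != "" && dur != "")
        printf "%.3f\t%s\t%s\n", to_ms(dur), timer, pat
    }' | sort -rn
}

# The two trace lines shown above, used as sample input.
sample_trace() {
  cat <<'EOF'
2025-05-22T18:43:17Z TRC Node graph generation time duration=14.553485ms graph-kind=node graph-type=workload group=graph inject-service-nodes=true request-id=d0nn0hdr7vqs73clfj40 route=GraphService route-pattern=/api/namespaces/{namespace}/services/{service}/graph timer=GraphGenerationTime
2025-05-22T18:44:35Z TRC Namespace graph appender time appender=workloadEntry duration="300.624µs" group=graph namespace=bookinfo request-id=d0nn14tr7vqs73clfja0 route=GraphNamespaces route-pattern=/api/namespaces/graph timer=GraphAppenderTime
EOF
}

sample_trace | rank_timers
```

Splitting each field at its first `=` only works while values contain no spaces, which holds for the keys used here but not for free-form messages.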

Note that we "could" log the actual URL, but the point was made earlier that it might contain sensitive information, so we don't really want to log it. I think Route.Pattern gets users what they need for the most part. Maybe we could consider logging the actual URL only when the logger has trace level enabled, but that doesn't get around the "sensitive information" problem, so we probably don't want to do that either.

jmazzitelli added a commit to jmazzitelli/kiali.io that referenced this issue May 22, 2025
jmazzitelli added a commit to jmazzitelli/kiali.io that referenced this issue May 24, 2025
jmazzitelli added a commit to jmazzitelli/kiali.io that referenced this issue May 25, 2025
jmazzitelli added a commit to jmazzitelli/kiali.io that referenced this issue May 25, 2025
@jmazzitelli
Collaborator Author
  1. Instructions on how to enable Kiali trace logging.

We already have it - in the Debugging Kiali page (which is where this PR is adding its stuff).

  5. The pprof stuff (we have some documentation somewhere, I just don't remember where).

It's in this Debugging Kiali page.

Not sure what the title of this doc page should be or where it should be under kiali.io. Suggestions welcome.

I'm just adding to this Debugging Kiali page - so these decisions were already made for me :)

jmazzitelli added a commit to jmazzitelli/kiali.io that referenced this issue May 25, 2025
jmazzitelli added a commit to jmazzitelli/kiali.io that referenced this issue May 25, 2025
@jmazzitelli jmazzitelli moved this from 📋 Backlog to 👀 In review in Kiali Sprint 25-08 | Kiali v2.11 May 25, 2025
@jshaughn jshaughn moved this from 👀 In review to 🏗 In progress in Kiali Sprint 25-08 | Kiali v2.11 May 27, 2025
Labels
None yet
Projects
Status: 🏗 In progress
3 participants