feat(server): add Prometheus metrics endpoint #134

Merged
merged 1 commit into from
Mar 10, 2025


rickstaa
Collaborator

@rickstaa rickstaa commented Mar 6, 2025

This pull request introduces a /metrics endpoint for Prometheus to scrape stream metrics, including FPS and average FPS per stream. Additionally, it adds the --stream-id-label argument, allowing users to optionally include the stream-id label in Prometheus metrics.
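As a rough illustration of what such an endpoint could return, here is a minimal sketch that renders one gauge sample in the Prometheus text exposition format using only the standard library. The metric name `stream_fps`, the helper names, and the assumption that `--stream-id-label` is a plain on/off switch are hypothetical; the PR text only names the flag and the label. Prometheus label names cannot contain hyphens, so the sketch uses `stream_id`.

```python
import argparse


def format_gauge(name, help_text, value, labels=None):
    """Render one gauge sample in the Prometheus text exposition format."""
    label_str = ""
    if labels:
        # Label values are not escaped here; real values may need escaping.
        pairs = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    return "\n".join(
        [
            f"# HELP {name} {help_text}",
            f"# TYPE {name} gauge",
            f"{name}{label_str} {float(value)}",
        ]
    )


def build_parser():
    """Hypothetical CLI: assumes the flag is a simple boolean switch."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--stream-id-label", action="store_true")
    return parser


# Simulate starting the server with the flag enabled.
args = build_parser().parse_args(["--stream-id-label"])
labels = {"stream_id": "abc123"} if args.stream_id_label else None
print(format_gauge("stream_fps", "Current frames per second.", 6, labels))
```

Without the flag, the same call would emit the sample with no label set, which keeps the number of time series small when stream IDs are short-lived.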

How to Test

  1. Create a Prometheus configuration file (prometheus.yml) with the following content:

    global:
      scrape_interval: 5s  # Scrape metrics every 5 seconds
    
    scrape_configs:
      - job_name: "comfystream_metrics"
        metrics_path: "/metrics"
        static_configs:
          - targets: ["localhost:8889"]  # Use "host.docker.internal:8889" on Mac/Windows Docker
  2. Start a Prometheus container with host network access:

    docker run --rm -p 9090:9090 --network host -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus
  3. Verify the metrics endpoint is working by visiting:

  4. Verify the stream ID label: you can also start ComfyStream with the --stream-id-label argument to add the stream ID as a Prometheus label.
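The verification steps above can also be scripted. The sketch below parses scraped text into a dictionary and checks whether any sample carries a stream ID label. The sample text, the metric names, and the `stream_id` label spelling are assumptions for illustration, and the parser assumes sample lines carry no trailing timestamp.

```python
def parse_samples(text):
    """Map each sample line ('name{labels} value') to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        key, _, value = line.rpartition(" ")
        samples[key] = float(value)
    return samples


def has_stream_id_label(samples):
    """Return True if any sample key includes a stream_id label."""
    return any('stream_id="' in key for key in samples)


# Hypothetical scrape output; against a running server the text could be
# fetched with urllib.request.urlopen("http://localhost:8889/metrics").
example = """\
# HELP stream_fps Current frames per second.
# TYPE stream_fps gauge
stream_fps{stream_id="abc123"} 6.0
"""
samples = parse_samples(example)
print(samples, has_stream_id_label(samples))
```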

Performance Evaluation

To ensure minimal impact on system performance, I ran the profiler from #80 before and after this change.

Test setup:

Performance Report

Baseline Performance (Before This PR)

  • Average FPS: 6 FPS
  • System Utilization:
    • Without py-spy:
      AVERAGE - CPU: 331.68%, RAM: 19147.63MB, GPU: 44.55%, VRAM: 4320.00MB
    • With py-spy:
      AVERAGE - CPU: 349.37%, RAM: 19181.66MB, GPU: 45.38%, VRAM: 4320.00MB
    • Profiling Visualization:
      Old pyspy profile

Performance After Adding Prometheus Metrics

  • Average FPS: 6 FPS
  • System Utilization:
    • Without py-spy:
      AVERAGE - CPU: 342.25%, RAM: 18897.88MB, GPU: 44.93%, VRAM: 4320.00MB
    • With py-spy:
      AVERAGE - CPU: 347.81%, RAM: 18899.74MB, GPU: 44.30%, VRAM: 4320.00MB
    • Profiling Visualization:
      New pyspy profile

Key Takeaways

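  • No measurable impact on FPS: the average stays at 6 FPS before and after this change.
  • Minimal CPU and RAM overhead from the Prometheus integration: without py-spy, average CPU went from 331.68% to 342.25%, while RAM, GPU, and VRAM usage stayed roughly level.
  • Enables real-time observability without measurable performance degradation.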
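To put the reported averages side by side, the relative change can be computed directly from the "without py-spy" figures in the report. This is plain arithmetic on the numbers quoted above; the dictionary keys are just labels chosen for this sketch.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Averages without py-spy, copied from the performance report.
baseline = {"CPU %": 331.68, "RAM MB": 19147.63, "GPU %": 44.55, "VRAM MB": 4320.00}
with_metrics = {"CPU %": 342.25, "RAM MB": 18897.88, "GPU %": 44.93, "VRAM MB": 4320.00}

changes = {key: percent_change(baseline[key], with_metrics[key]) for key in baseline}
for key, change in changes.items():
    print(f"{key}: {change:+.2f}%")
```

On these figures CPU rises by roughly 3% and RAM falls by roughly 1%. Since the report does not state run-to-run variance, small differences like these may simply be noise.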

@@ -0,0 +1,65 @@
"""General utility functions."""

import asyncio
Collaborator Author

@eliteprox I cleaned up the folder structure but this can also be done in a separate pull request.

@rickstaa rickstaa force-pushed the add_prometheus_metrics branch from fd12fbb to bce4e75 Compare March 6, 2025 11:29
server/app.py Outdated
@@ -88,18 +92,36 @@ async def _calculate_fps_loop(self):
current_time = time.monotonic()
if self._last_fps_calculation_time is not None:
time_diff = current_time - self._last_fps_calculation_time
self._fps = self._fps_interval_frame_count / time_diff
self._fps = (
Collaborator Author

Small edge case.
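The diff above is truncated, so the actual fix is not visible here. One plausible reading is a guard against a zero elapsed interval, which would otherwise raise `ZeroDivisionError`. The sketch below shows that kind of guard as a standalone function; the function name and the choice to report 0.0 are assumptions, not the PR's code.

```python
def compute_fps(frame_count, time_diff):
    """Frames per second over an interval, tolerating a zero-length interval."""
    if time_diff <= 0:
        # Two clock reads can return the same value; avoid dividing by zero.
        return 0.0
    return frame_count / time_diff


print(compute_fps(30, 5.0))
print(compute_fps(30, 0.0))
```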

server/app.py Outdated
self._fps_measurements.append(
{
"timestamp": current_time - self._fps_loop_start_time,
"fps": self._fps,
}
) # Store the FPS measurement with timestamp

# Reset start_time and frame_count for the next interval.
# Store the average FPS over the last minute.
Collaborator Author

Moved here to prevent redundant calculations and to make it available for Prometheus serving.
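For readers following the diff, an average over the last minute can be derived from the stored measurements roughly as follows. This is a minimal sketch that reuses the `timestamp` and `fps` keys visible in the diff; the function name, the 60-second default, and the 0.0 fallback for an empty window are assumptions.

```python
def average_fps(measurements, now, window=60.0):
    """Mean FPS of the measurements taken within `window` seconds of `now`."""
    recent = [m["fps"] for m in measurements if now - m["timestamp"] <= window]
    return sum(recent) / len(recent) if recent else 0.0


# Timestamps are seconds since the FPS loop started, as in the diff.
history = [
    {"timestamp": 0.0, "fps": 10.0},
    {"timestamp": 50.0, "fps": 6.0},
    {"timestamp": 100.0, "fps": 8.0},
]
print(average_fps(history, now=100.0))
```

Computing this once per interval and caching the result means both the JSON stats route and the Prometheus endpoint can read the same value without recomputing it per request.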

@@ -408,11 +432,23 @@ async def on_shutdown(app: web.Application):
app.router.add_post("/prompt", set_prompt)

# Add routes for getting stream statistics.
stream_stats = StreamStats(app)
app.router.add_get("/streams/stats", stream_stats.collect_all_stream_metrics)
stream_stats_manager = StreamStatsManager(app)
Collaborator Author

Renamed for consistency.

@rickstaa rickstaa force-pushed the add_prometheus_metrics branch from bce4e75 to c0496e1 Compare March 6, 2025 12:06
@rickstaa rickstaa requested review from hjpotter92 and eliteprox March 6, 2025 12:07
@rickstaa rickstaa marked this pull request as ready for review March 6, 2025 12:07
@rickstaa rickstaa force-pushed the add_prometheus_metrics branch 2 times, most recently from 597b10d to 287967c Compare March 6, 2025 12:34
@eliteprox
Collaborator

This change will require a rebase and refactor based on #141

@rickstaa rickstaa marked this pull request as draft March 7, 2025 20:17
@rickstaa rickstaa force-pushed the add_prometheus_metrics branch from 287967c to c48ac38 Compare March 10, 2025 07:50
This commit introduces a `/metrics` endpoint for Prometheus to scrape stream
metrics, including FPS and average FPS per stream. Additionally, it adds the
`--stream-id-label` argument, allowing users to optionally include the `stream-id`
label in Prometheus metrics.

Co-authored-by: jpotter92 <git@hjpotter92.email>
@rickstaa rickstaa force-pushed the add_prometheus_metrics branch from d10a46f to 5a7fb2e Compare March 10, 2025 08:54
@rickstaa rickstaa requested a review from hjpotter92 March 10, 2025 08:56
@rickstaa rickstaa marked this pull request as ready for review March 10, 2025 09:10
Collaborator

@eliteprox eliteprox left a comment


LGTM

@eliteprox eliteprox merged commit 76d820c into main Mar 10, 2025
1 check passed
@rickstaa rickstaa deleted the add_prometheus_metrics branch March 10, 2025 17:00
ryanontheinside pushed a commit to ryanontheinside/comfystream_inside that referenced this pull request Mar 12, 2025
commit 37a24c1
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Tue Mar 11 17:29:14 2025 -0400

    Update file path in README.md (yondonfu#156)

commit 8f7ce22
Author: Rick Staa <rick.staa@outlook.com>
Date:   Tue Mar 11 22:28:57 2025 +0100

    fix(dev): handle NoneType error in monitor script (yondonfu#155)

    This commit fixes a NoneType error in the resource monitoring script that occurs
    when certain processes lack a name.

commit fc0f6ef
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Tue Mar 11 13:02:41 2025 -0400

    fix: prevent `AssertionError` by setting `max_workers` to 1 (yondonfu#154)

    * change max_workers to 1, add error handling to VideoStreamTrack and AudioStreamTrack

commit f0200a2
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Tue Mar 11 10:09:30 2025 -0400

    chore: update ComfyUI-TensorRT node to origin branch (yondonfu#153)

commit 7513393
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Mar 10 17:04:22 2025 -0400

    chore(deps): bump docker/login-action from 2 to 3 (yondonfu#149)

    Bumps [docker/login-action](https://github.com/docker/login-action) from 2 to 3.
    - [Release notes](https://github.com/docker/login-action/releases)
    - [Commits](docker/login-action@v2...v3)

    ---
    updated-dependencies:
    - dependency-name: docker/login-action
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit 098fae6
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Mon Mar 10 15:52:10 2025 -0400

    fix: remove node package install from launch (yondonfu#151)

    * remove node package install from launch (yondonfu#3)
    * fix npm package install for devcontainer

commit 8d21037
Author: Rick Staa <rick.staa@outlook.com>
Date:   Mon Mar 10 20:30:02 2025 +0100

    fix: include Prometheus client dependency in project.toml (yondonfu#152)

    This commit ensures that the `prometheus-client` dependency is explicitly
    specified in both `requirements.txt` and `pyproject.toml` for consistency
    and proper dependency management.

commit 76d820c
Author: Rick Staa <rick.staa@outlook.com>
Date:   Mon Mar 10 17:56:32 2025 +0100

    feat(server): add Prometheus metrics endpoint (yondonfu#134)

    This commit introduces a `/metrics` endpoint for Prometheus to scrape stream
    metrics, including FPS and average FPS per stream. Additionally, it adds the
    `--stream-id-label` argument, allowing users to optionally include the `stream-id`
    label in Prometheus metrics.

    Co-authored-by: jpotter92 <git@hjpotter92.email>

commit 8aa9baa
Author: Rick Staa <rick.staa@outlook.com>
Date:   Mon Mar 10 15:02:29 2025 +0100

    feat: add resource profiler script (yondonfu#80)

    * feat(dev): add profiler script for resource usage tracking
    This commit adds a lightweight profiler script for developers working on
    ComfyStream to monitor resource usage and compare it against previous runs.

commit c343599
Author: Varshith Bathini <varshith15@gmail.com>
Date:   Sat Mar 8 03:00:37 2025 +0530

    feat: multi prompt dynamic node update (yondonfu#93)

    * feat: multi prompt dynamic node update

    * fix formatting

    ---------

    Co-authored-by: Elite <john@eliteencoder.net>

commit 98d4309
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Fri Mar 7 16:20:41 2025 -0500

    chore: reorganize workflows, add cliptext model to workflows (yondonfu#125)

    * reorganize workflows for easier indexing, add cliptext model to workflows for reliability, add 512x512 image for depthmap accelerated workflow, add florence-sam2 workflow + workflows using ConditioningConcat
    ---------
    Co-authored-by: ryanontheinstide <ryanfosdick87@gmail.com>

commit a2913c6
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Fri Mar 7 16:19:07 2025 -0500

    Revert "chore(deps): bump tailwind-merge from 2.6.0 to 3.0.2 in /ui (yondonfu#105)" (yondonfu#143)

    This reverts commit fac67e5.

commit 167efd3
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Fri Mar 7 16:18:39 2025 -0500

    Revert "chore(deps-dev): bump tailwindcss from 3.4.17 to 4.0.12 in /ui (yondonfu#142)" (yondonfu#144)

    This reverts commit a67a741.

commit a67a741
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Fri Mar 7 15:41:25 2025 -0500

    chore(deps-dev): bump tailwindcss from 3.4.17 to 4.0.12 in /ui (yondonfu#142)

    Bumps [tailwindcss](https://github.com/tailwindlabs/tailwindcss/tree/HEAD/packages/tailwindcss) from 3.4.17 to 4.0.12.
    - [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
    - [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/main/CHANGELOG.md)
    - [Commits](https://github.com/tailwindlabs/tailwindcss/commits/v4.0.12/packages/tailwindcss)

    ---
    updated-dependencies:
    - dependency-name: tailwindcss
      dependency-type: direct:development
      update-type: version-update:semver-major
    ...

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit fac67e5
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Fri Mar 7 15:37:03 2025 -0500

    chore(deps): bump tailwind-merge from 2.6.0 to 3.0.2 in /ui (yondonfu#105)

    Bumps [tailwind-merge](https://github.com/dcastil/tailwind-merge) from 2.6.0 to 3.0.2.
    - [Release notes](https://github.com/dcastil/tailwind-merge/releases)
    - [Commits](dcastil/tailwind-merge@v2.6.0...v3.0.2)

    ---
    updated-dependencies:
    - dependency-name: tailwind-merge
      dependency-type: direct:production
      update-type: version-update:semver-major
    ...

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit f8e684c
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Fri Mar 7 15:25:33 2025 -0500

    fix default camera bug (yondonfu#139)

commit fadf199
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Fri Mar 7 15:25:12 2025 -0500

    fix: remove missing startup script (yondonfu#138)

    * fix missing startup script

commit a31204e
Author: Rick Staa <rick.staa@outlook.com>
Date:   Fri Mar 7 19:12:16 2025 +0100

    feat(ui): add settings search queries (yondonfu#91)

    * feat(ui): add settings search queries
    This commit gives users the ability to set the stream settings using query parameters.
    ---------
    Co-authored-by: ryanontheinstide <ryanfosdick87@gmail.com>
    Co-authored-by: Elite <john@eliteencoder.net>

commit 5725fb4
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Fri Mar 7 12:12:58 2025 -0500

    chore(deps-dev): bump husky from 8.0.3 to 9.1.7 in /ui (yondonfu#104)

    Bumps [husky](https://github.com/typicode/husky) from 8.0.3 to 9.1.7.
    - [Release notes](https://github.com/typicode/husky/releases)
    - [Commits](typicode/husky@v8.0.3...v9.1.7)

    ---
    updated-dependencies:
    - dependency-name: husky
      dependency-type: direct:development
      update-type: version-update:semver-major
    ...

    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit a38cb3e
Author: Rick Staa <rick.staa@outlook.com>
Date:   Fri Mar 7 18:08:37 2025 +0100

    refactor(server): improve FPS Stats collection logic (yondonfu#141)

    This commit extracts the FPS statistics collection into its own class to keep
    the `VideoStreamTrack` implementation cleaner and more maintainable. This also
    makes the logic reusable across different components.

commit 8622a31
Author: hjpotter92 <hjpotter92@users.noreply.github.com>
Date:   Fri Mar 7 18:14:23 2025 +0530

    workflows: Disable github action unless running in livepeer fork (yondonfu#140)

    Rename ui-kit release workflow

commit a55e3f3
Merge: a65a0f3 c990723
Author: hjpotter92 <hjpotter92@users.noreply.github.com>
Date:   Fri Mar 7 13:07:10 2025 +0530

    Merge pull request yondonfu#132 from livepeer/main

    backporting from `livepeer/` fork

commit a65a0f3
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Thu Mar 6 22:14:39 2025 -0500

    remove multi-controlnet patch (yondonfu#136)

commit c990723
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Fri Mar 7 00:10:58 2025 +0000

    fix whitespace

commit 5bcc444
Merge: b298abe f1b0fb1
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Thu Mar 6 10:07:48 2025 -0500

    Merge branch 'main' into main

commit b298abe
Author: hjpotter92 <hjpotter92@users.noreply.github.com>
Date:   Thu Mar 6 11:53:52 2025 +0530

    Add workflow for building comfyui-base images (yondonfu#2)

    * docker: Add workflow for building comfyui-base images

    * Add attestations to built docker images

commit f1b0fb1
Author: Rick Staa <rick.staa@outlook.com>
Date:   Wed Mar 5 23:43:35 2025 +0100

    fix: ensure patched torch graph is always synced on inference errors (yondonfu#129)

    * fix: ensure patched torch graph is always synced on inference errors

    This commit ensures that the patched torch graph remains synchronized
    even when an error occurs during inference, preventing potential
    inconsistencies and adds logging for controlnet tensor cloning errors
    ---------

    Co-authored-by: Elite <john@eliteencoder.net>

commit 93c3d35
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Wed Mar 5 16:49:57 2025 -0500

    fix mediapipe 0.10.20 conflict by unpinning protobuf in ComfyUI TensorRT (yondonfu#126)

commit 5ed95c2
Author: Rick Staa <rick.staa@outlook.com>
Date:   Wed Mar 5 22:03:19 2025 +0100

    refactor(dev): improve Ansible ComfyUI password behavior (yondonfu#127)

    This commit ensures that a unique password is generated on each run unless
    the user specifies a password themselves, making the deployment more secure.
    It allows users to provide their own password via extra-vars or environment
    variables while automatically generating a random password if none is provided.
    It also improves the ComfyUI caddy file template name.

commit d98e1b5
Author: Rick Staa <rick.staa@outlook.com>
Date:   Wed Mar 5 19:35:09 2025 +0100

    feat: add stream stats endpoint (yondonfu#48)

    Adds a new stream stats endpoint which can be used to retrieve the fps metrics in a way that doesn't affect performance.
    ---------

    Co-authored-by: Evan Mullins <evancmullins@gmail.com>

commit c5d5e48
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Wed Mar 5 13:17:09 2025 -0500

    resolve missing folder error (yondonfu#128)

commit 0cee22c
Author: Rick Staa <rick.staa@outlook.com>
Date:   Tue Mar 4 22:27:54 2025 +0100

    feat(dev): add Ansible playbook for ComfyStream setup (yondonfu#114)

    This commit adds an Ansible playbook for automated ComfyStream setup, switches Caddy release to `stable`, and prevents cloud-init from duplicating import lines. Also introduces a `--bare-vm` option for deploying a clean TensorDock instance without ComfyStream and refines cloud-init template behavior for better reliability.

commit d6b84ed
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Tue Mar 4 12:38:53 2025 -0500

    download latest ui files when missing (yondonfu#116)

commit 258afb8
Merge: 342e302 4dbbd13
Author: hjpotter92 <hjpotter92@users.noreply.github.com>
Date:   Tue Mar 4 16:50:50 2025 +0530

    Merge pull request yondonfu#1 from livepeer/feature/docker-builds

    workflows: Automating docker image build pipeline

commit 4dbbd13
Author: hjpotter92 <git@hjpotter92.email>
Date:   Tue Mar 4 15:26:04 2025 +0530

    dockerfile: Reduce layers by combining `RUN` stages

commit e7900e5
Author: hjpotter92 <git@hjpotter92.email>
Date:   Tue Mar 4 11:22:36 2025 +0530

    docker: Use self-hosted runner for building docker image

commit 9d6c79f
Merge: 6fdcac7 342e302
Author: hjpotter92 <hjpotter92@users.noreply.github.com>
Date:   Tue Mar 4 11:03:22 2025 +0530

    Merge branch 'main' into feature/docker-builds

commit 342e302
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Mon Mar 3 09:45:18 2025 -0500

    add publisher-id for registry publication (yondonfu#107)

commit 52f1327
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Mon Mar 3 09:44:52 2025 -0500

    release: update version to 0.0.3 (yondonfu#109)

    * Update pyproject.toml and package.json version to 0.0.3

    * workflows: Update release workflow to be more precise for ui files

    ---------

    Co-authored-by: hjpotter92 <git@hjpotter92.email>

commit d7bf4b5
Author: John | Elite Encoder <john@eliteencoder.net>
Date:   Mon Mar 3 09:34:33 2025 -0500

    keep empty static folder (yondonfu#113)

commit 6fdcac7
Author: hjpotter92 <git@hjpotter92.email>
Date:   Mon Mar 3 12:26:20 2025 +0530

    workflow: Use external github action for cleaning up disk space

commit 3c83e6a
Author: hjpotter92 <git@hjpotter92.email>
Date:   Mon Mar 3 11:28:34 2025 +0530

    workflows: Testing out github builds for docker images
3 participants