A collection of tools for benchmarking, evaluating, and analyzing agent performance metrics.
- `/benchmarks`: Tools for running benchmarks on AI models and agents
- `/canonical_metrics`: Standard formats and tools for metrics collection, including:
  - Metrics CLI for processing and transforming metrics
  - Metrics Service API for querying and visualizing metrics data
- `/historic_performance`: Dashboard for tracking performance over time
- `/evaluation`: Dashboard and tools for evaluating agent or model performance
Collect logs from NEAR AI Hub, then transform, tune, and aggregate them, and create CSV tables.
# Installation

```shell
cd canonical_metrics
python3.11 -m venv .venv
source .venv/bin/activate
pip install poetry
poetry install
```
# Transform and aggregate metrics

```shell
metrics-cli tune /path/to/logs /path/to/tuned_logs --rename --ms-to-s
metrics-cli aggregate /path/to/tuned_logs /path/to/aggr_logs --filters "runner:not_in:local" --slices "agent_name"
metrics-cli aggregation-table /path/to/tuned_logs /path/to/table --filters "runner:not_in:local" --absent-metrics-strategy=nullify
```
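The `--filters` arguments above use a `field:operator:value` syntax (e.g. `runner:not_in:local`). A minimal Python sketch of how such an expression decomposes — this is an illustrative reading of the syntax shown in the examples, not the metrics-cli implementation:

```python
# Illustrative sketch of the "field:operator:value" filter syntax seen in
# --filters arguments (e.g. "runner:not_in:local"). Not the actual
# metrics-cli parser, just a reading of the format shown above.

def parse_filter(expr: str) -> tuple[str, str, str]:
    """Split a filter expression into (field, operator, value)."""
    field, operator, value = expr.split(":", 2)
    return field, operator, value

print(parse_filter("runner:not_in:local"))  # ('runner', 'not_in', 'local')
```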
Run the metrics service to query and analyze metrics data:

```shell
# Start the metrics service
metrics-service --metrics-path /path/to/tuned_logs

# Query metrics via API
curl -X POST "http://localhost:8000/api/v1/table/aggregation" \
  -H "Content-Type: application/json" \
  -d '{
    "filters": ["runner:not_in:local"],
    "column_selections": ["/metrics/performance/"]
  }'
```
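The same aggregation query can be issued from Python. This sketch mirrors the curl call above (same endpoint and payload); the shape of the response is not documented here, so it is simply printed as raw JSON:

```python
# Sketch: the same aggregation query as the curl example, from Python.
# Endpoint and payload mirror the curl call above; the response schema is
# not documented here, so we just dump the JSON we get back.
import json
import urllib.request

payload = {
    "filters": ["runner:not_in:local"],
    "column_selections": ["/metrics/performance/"],
}

req = urllib.request.Request(
    "http://localhost:8000/api/v1/table/aggregation",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Requires a running metrics-service on localhost:8000:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```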
Run a web application for querying and visualizing analytics metrics data, with tools to browse logs and to track and visualize agent performance over time.

```shell
cd historic_performance
npm install
npm start
```

This opens a dashboard at http://localhost:3000.
The dashboard can also be used as a web component in other applications:
```shell
npm install @nearai/analytics-dashboard
```

```tsx
import { Dashboard } from '@nearai/analytics-dashboard';

// Use with configuration
<Dashboard config={{
  views: ['table'],                        // Show only table view
  globalFilters: ['runner:not_in:local'],  // Applied to all requests
  metricSelection: 'PERFORMANCE',          // Metric selection
  viewConfigs: {
    table: {
      showParameters: ['prune_mode'],      // Show only specific parameters
      refreshRate: 30                      // Refresh every 30 seconds
    }
  }
}} />
```
Execute popular and user-owned benchmarks to generate performance metrics. Run audit evaluations on agents.
Visualize, analyze, and compare agent and model performance using the collected metrics.
- Canonical Metrics Format: Standardized format for consistent metrics across all agents
- Flexible Aggregation: Group and aggregate metrics by various dimensions
- Powerful Filtering: Filter metrics by runner, model, time ranges, and custom criteria
- RESTful API: Easy integration with dashboards and other tools
- Performance Tracking: Monitor latency, API usage, error rates, and custom metrics
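A conceptual sketch of how filtering, slicing, and aggregation fit together: filter out records, group the rest by a slice dimension (here `agent_name`, as in the `--slices` example), and reduce a metric per group. The records and field names below are hypothetical, not the canonical metrics schema:

```python
# Conceptual sketch of filter -> slice -> aggregate. Records and field
# names are hypothetical illustrations, not the canonical metrics format.
from collections import defaultdict

records = [
    {"agent_name": "alpha", "runner": "cloud", "latency_s": 1.2},
    {"agent_name": "alpha", "runner": "cloud", "latency_s": 0.8},
    {"agent_name": "beta",  "runner": "local", "latency_s": 2.0},
]

# Filter (runner:not_in:local), then slice by agent_name and average latency.
kept = [r for r in records if r["runner"] not in {"local"}]
by_agent: dict[str, list[float]] = defaultdict(list)
for r in kept:
    by_agent[r["agent_name"]].append(r["latency_s"])

averages = {agent: sum(vals) / len(vals) for agent, vals in by_agent.items()}
print(averages)  # {'alpha': 1.0}
```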
We welcome contributions! See individual component READMEs for specific development guidelines.