feat: doc improvements by grutt · Pull Request #222 · hatchet-dev/hatchet

Merged · 27 commits · Mar 6, 2024

Changes from all commits
5 changes: 5 additions & 0 deletions frontend/docs/next.config.js
@@ -14,6 +14,11 @@ const nextConfig = {
        destination: '/home/:path*',
        permanent: true,
      },
      {
        source: "/ingest/:path*",
        destination: "https://app.posthog.com/:path*",
        permanent: true,
      },
    ];
  },
}
1 change: 1 addition & 0 deletions frontend/docs/package.json
@@ -24,6 +24,7 @@
"nextra": "^2.13.2",
"nextra-theme-docs": "^2.13.2",
"postcss": "^8.4.33",
"posthog-js": "^1.111.2",
"react": "^18.2.0",
"react-dom": "^18.2.0",
"react-tweet": "^3.2.0",
23 changes: 21 additions & 2 deletions frontend/docs/pages/_app.tsx
@@ -1,17 +1,36 @@
import type { AppProps } from "next/app";
// import { Inter } from "@next/font/google";
import posthog from "posthog-js";
import { PostHogProvider } from "posthog-js/react";

import "../styles/global.css";
import { useRouter } from "next/router";

// const inter = Inter({ subsets: ["latin"] });

// // Check that PostHog is client-side (used to handle Next.js SSR)
// if (typeof window !== "undefined" && process.env.NEXT_PUBLIC_POSTHOG_KEY) {
//   posthog.init(process.env.NEXT_PUBLIC_POSTHOG_KEY, {
//     // Enable debug mode in development
//     api_host: "https://docs.hatchet.run/ingest",
//     ui_host: "https://app.posthog.com",
//     loaded: (posthog) => {
//       if (process.env.NODE_ENV === "development") posthog.debug();
//     },
//   });
// }

function MyApp({ Component, pageProps }: AppProps) {
  return (
    <PostHogProvider client={posthog}>
      <main className="bg-[#020817]">
        <Component {...pageProps} />
      </main>
    </PostHogProvider>
  );
}

export default MyApp;
function useEffect(arg0: () => () => void, arg1: undefined[]) {
  throw new Error("Function not implemented.");
}
6 changes: 5 additions & 1 deletion frontend/docs/pages/_meta.json
@@ -1,6 +1,10 @@
{
  "home": {
    "title": "User Guide",
    "type": "page"
  },
  "sdks": {
    "title": "SDK Reference",
    "type": "page"
  },
  "self-hosting": {
26 changes: 15 additions & 11 deletions frontend/docs/pages/home/_meta.json
@@ -1,15 +1,19 @@
{
  "index": "Introduction",
  "quickstart": "Quick Start",
  "--guide": {
    "title": "Guide",
    "type": "separator"
  },
  "basics": "Working With Hatchet",
  "tutorials": "Tutorials",
  "features": "Features",
  "--more": {
    "title": "More",
    "type": "separator"
  },
  "about": {
    "title": "About Hatchet ↗",
    "href": "https://hatchet.run"
  }
}
6 changes: 6 additions & 0 deletions frontend/docs/pages/home/basics/_meta.json
@@ -0,0 +1,6 @@
{
  "overview": "Overview",
  "steps": "Understanding Steps",
  "workflows": "Understanding Workflows",
  "workers": "Understanding Workers"
}
59 changes: 59 additions & 0 deletions frontend/docs/pages/home/basics/dashboard.mdx
@@ -0,0 +1,59 @@
{/* TODO needs work */}

# Using the Hatchet Dashboard

The Hatchet Dashboard is a powerful web-based interface that allows you to monitor, manage, and troubleshoot your workflows. It provides a centralized view of your workflows, their execution history, and performance metrics, enabling you to gain insights into the behavior and health of your system.

## Accessing the Dashboard

To access the Hatchet Dashboard, sign in to your Hatchet account. Once signed in, open the dashboard by clicking the "Dashboard" link in the top navigation menu or by going directly to the dashboard URL provided by your Hatchet instance.

## Dashboard Overview

The Hatchet Dashboard consists of several key sections and features:

1. **Workflow List**: The main page of the dashboard displays a list of all your workflows. Each workflow is represented by a card that shows its name, description, and key metrics such as the number of total runs, successful runs, and failed runs.

2. **Workflow Details**: Clicking on a workflow card takes you to the workflow details page. This page provides a comprehensive view of the selected workflow, including its definition, execution history, and performance metrics.

3. **Execution History**: The execution history section shows a list of all the runs of the selected workflow. Each run is listed with its status (success, failure, or in-progress), start time, duration, and input data. You can click on a specific run to view its details, including the input data, output data, and any logs or error messages.

4. **Logs and Debugging**: The dashboard provides access to the logs generated by your workflows and their individual steps. You can view the logs for a specific run or step to help diagnose issues and debug problems. The logs are searchable and filterable, making it easier to find relevant information.

5. **Worker Management**: The dashboard allows you to manage your workers directly from the interface. You can inspect workers and delete those that are no longer needed.

## Monitoring Workflows

One of the primary uses of the Hatchet Dashboard is to monitor the health and performance of your workflows. By regularly checking the dashboard, you can:

- Track the overall execution status of your workflows, identifying any failures or errors that need attention.
- Monitor the throughput and latency of your workflows, ensuring they are meeting performance expectations.
- Identify any bottlenecks or slowdowns in your workflows by analyzing the duration and resource usage of individual steps.
- Detect anomalies or unusual behavior by comparing the metrics and charts across different time periods or workflow versions.

## Troubleshooting and Debugging

When issues arise in your workflows, the Hatchet Dashboard is an invaluable tool for troubleshooting and debugging. You can:

- Inspect the execution history of a failed run to understand the sequence of events leading to the failure.
- View the input and output data of a specific run to identify any anomalies or unexpected values.
- Analyze the logs and error messages associated with a failed step to pinpoint the root cause of the issue.
- Use the search and filtering capabilities to find relevant log entries or error patterns across multiple runs.

## Best Practices for Using the Dashboard

To make the most of the Hatchet Dashboard, consider the following best practices:

1. **Regularly monitor your workflows**: Make it a habit to check the dashboard regularly to stay informed about the health and performance of your workflows. Set up alerts or notifications to get proactively notified of any issues or anomalies.

2. **Investigate failures promptly**: When you notice a failed run or an error in the dashboard, investigate it promptly to minimize the impact on your system. Use the debugging and troubleshooting features to identify and resolve the issue quickly.

3. **Analyze metrics and trends**: Use the metrics and charts provided by the dashboard to gain insights into the behavior and performance of your workflows over time. Look for trends, patterns, or anomalies that may indicate potential problems or opportunities for optimization.

4. **Collaborate with your team**: Share access to the dashboard with your team members and collaborate on monitoring, troubleshooting, and optimizing your workflows. Use the dashboard as a centralized platform for communication and knowledge sharing.

5. **Customize and extend**: Take advantage of any customization or extension capabilities provided by the Hatchet Dashboard. This may include creating custom metrics, dashboards, or integrations with other tools in your ecosystem.

By leveraging the features and best practices of the Hatchet Dashboard, you can effectively monitor, manage, and troubleshoot your workflows, ensuring their reliability, performance, and smooth operation.

Remember, the dashboard is a powerful tool, but it's not a substitute for good design and implementation practices. Use the insights gained from the dashboard to continuously improve your workflows, optimize their performance, and enhance their resilience.
14 changes: 14 additions & 0 deletions frontend/docs/pages/home/basics/overview.mdx
@@ -0,0 +1,14 @@
# Overview

This chapter provides an overview of the core components of Hatchet:

- **[Steps](./steps):** Individual, self-contained functions that execute specific tasks and return JSON-serializable results, forming the basic units of execution in Hatchet.
- **[Workflows](./workflows):** Declarative DAG definitions that organize steps into a coherent sequence or structure, managing the execution order and dependencies to achieve a specific outcome.
- **[Workers](./workers):** Long-lived runtimes that listen for instructions from the Hatchet engine to execute steps, running in your infrastructure to provide compute for task execution.
- **[Dashboard](./dashboard):** A web-based interface for managing and monitoring workflows, steps, and workers, providing visibility into the state of your distributed task execution.

## What's Next: Understanding Steps in Depth

In the Steps section, we will explore how these fundamental units operate within Hatchet, including their characteristics, best practices for development, and their role in the broader context of distributed task execution.

[Continue to Understanding Steps →](./steps)
125 changes: 125 additions & 0 deletions frontend/docs/pages/home/basics/steps.mdx
@@ -0,0 +1,125 @@
import { Callout, Card, Cards, Steps, Tabs } from 'nextra/components'

# Understanding Steps in Hatchet

In Hatchet, a step is simply a function that satisfies the following signature:


<Tabs items={['Python', 'Typescript']}>
<Tabs.Tab>
```python
def my_step(context: Context) -> dict:
    # Perform some operation
    output = {"result": "success"}
    return output
```
</Tabs.Tab>
<Tabs.Tab>
```typescript
export const myStep = async (ctx: Context): Promise<object> => {
  // Perform some operation
  const output = { result: "success" };
  return output;
};
```
</Tabs.Tab>
</Tabs>


This function takes a single argument, `context`, an object that provides access to the workflow's input as well as methods to interact with Hatchet (e.g., logging). The function returns a JSON-serializable object, which represents the output of the step.

Often, Hatchet users start by wrapping one large, existing function in a step, and then break it down into smaller, more focused steps. This approach allows for better reusability and easier testing.


## Best Practices for Defining Independent Steps

A step in Hatchet is a self-sufficient function that encapsulates a specific operation or task. This independence means that while steps can be orchestrated into larger [workflows](./workflows), each step is designed to function effectively in isolation.

Here are some key best practices for defining independent steps in Hatchet:

- **Consistent Input and Output Shapes:** Each step can accept an input and produce an output. The input to a step is typically a JSON object, allowing for flexibility and ease of integration. The output is also a JSON-serializable object, ensuring compatibility and ease of further processing or aggregation. Wherever possible, use consistent input and output shapes to ensure that steps produce predictable and uniform results.
- **Reusability:** Due to their self-contained nature, steps can be reused across different workflows or even within the same workflow, maximizing code reuse and reducing redundancy.
- **Testing**: Steps can be easily tested in isolation, ensuring that they function as expected and produce the desired output. This testing approach simplifies the debugging process and ensures that steps are reliable and robust.
- **Logging**: Each step can log its operation, input, and output, providing valuable insight into the workflow's execution and enabling effective monitoring and troubleshooting. Logs can be streamed to the Hatchet dashboard through the [`context.log` method](../features/errors-and-logging.mdx), as in the sketch below.
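
For illustration, here is a minimal sketch of a step that streams logs to the dashboard, assuming the Python SDK's `Context` type and `context.log` method; `do_work` is a hypothetical helper standing in for real business logic:

```python
from hatchet_sdk import Context


def do_work(data: dict) -> str:
    # Hypothetical stand-in for the step's actual logic
    return "done"


def my_step(context: Context) -> dict:
    context.log("starting my_step")  # streamed to the Hatchet dashboard
    result = do_work(context.workflow_input())
    context.log("finished my_step")
    return {"result": result}
```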



### The Workflow Input Object

A step in Hatchet accepts a single workflow input, which is typically a JSON object and can be accessed through the `context` argument:

<Tabs items={['Python', 'Typescript']}>
<Tabs.Tab>
```python
def my_step(context: Context) -> dict:
    data = context.workflow_input()
    # Perform some operation on data
    output = {"result": "success"}
    return output
```
</Tabs.Tab>
<Tabs.Tab>
```typescript
export const myStep = async (ctx: Context): Promise<object> => {
  const data = ctx.workflowInput();
  // Perform some operation on data
  const output = { result: "success" };
  return output;
};
```
</Tabs.Tab>
</Tabs>


This input object can contain any data or parameters the step needs to perform its operation. Because the input is a JSON object, it is easy to serialize and deserialize, which makes steps straightforward to integrate and combine.
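
As a sketch of where this input typically originates, an event payload pushed to Hatchet becomes the workflow input for any workflow subscribed to that event. The snippet below assumes the Python SDK's event client (`hatchet.client.event.push`); the `user:created` event key and payload are hypothetical:

```python
from hatchet_sdk import Hatchet

hatchet = Hatchet()

# The payload of this event becomes context.workflow_input()
# for workflows triggered by "user:created"
hatchet.client.event.push(
    "user:created",
    {"user_id": "1234", "email": "user@example.com"},
)
```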

### The Return Object

A step in Hatchet returns any JSON-serializable object, which can be a simple value, an array, or a complex object. This flexibility allows steps to encapsulate a wide range of operations, from simple transformations to complex computations.
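
As a sketch, the first step below returns a nested object, and a later step in the same workflow reads it; this assumes the Python SDK's `context.step_output` helper for accessing a prior step's result by step name:

```python
from hatchet_sdk import Context


def fetch_user(context: Context) -> dict:
    # Any JSON-serializable structure is a valid return value
    return {
        "user": {"id": "1234", "email": "user@example.com"},
        "tags": ["new", "trial"],
    }


def send_welcome(context: Context) -> dict:
    # A downstream step can read the prior step's output by name
    user = context.step_output("fetch_user")["user"]
    return {"sent_to": user["email"]}
```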

### "Thin" vs "Full" Payloads in Hatchet

Hatchet can handle inputs and result data in two primary formats: "thin" and "full" data payloads. Full payloads include the full model data and all (or most) properties. Thin payloads, by contrast, carry only essential identifiers (e.g., GUIDs) and possibly minimal change details.

When a new task executes, here are examples of the data payloads Hatchet might dispatch:

Full data payload:

```json
{
  "type": "task.executed",
  "timestamp": "2022-11-03T20:26:10.344522Z",
  "data": {
    "id": "1f81eb52-5198-4599-803e-771906343485",
    "type": "task",
    "taskName": "Database Backup",
    "taskStatus": "Completed",
    "executionDetails": "Backup completed successfully at 2022-11-03T20:25:10.344522Z",
    "assignedTo": "John Smith",
    "priority": "High"
  }
}
```

Thin data payload:

```json
{
  "type": "task.executed",
  "timestamp": "2022-11-03T20:26:10.344522Z",
  "data": {
    "id": "1f81eb52-5198-4599-803e-771906343485"
  }
}
```

It is feasible to adopt a thin-payload strategy while still including frequently used or critical fields, such as `taskName`, balancing necessity and efficiency.

The choice between thin and full data payloads hinges on your specific requirements: full payloads offer immediate, comprehensive context, reducing the need for subsequent data retrieval, while thin payloads improve performance and adaptability, especially in scalable, distributed environments like Hatchet.

**Payload Size Considerations**

While Hatchet allows payloads of up to 4 MB, it's advisable to keep payloads small to minimize the processing burden on data consumers. For extensive data requirements, consider referencing data through links or URLs within the payload, allowing consumers to access detailed information only as needed. This approach aligns with efficient data handling and consumer-centric design principles in distributed systems.
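
As a sketch of that referencing pattern, a step can receive a thin payload carrying only an identifier and fetch the full record on demand; the API URL below is hypothetical:

```python
import json
import urllib.request

from hatchet_sdk import Context


def handle_task_executed(context: Context) -> dict:
    event = context.workflow_input()
    task_id = event["data"]["id"]  # the thin payload carries only the GUID

    # Fetch the full record only when it is actually needed
    url = f"https://api.example.com/tasks/{task_id}"
    with urllib.request.urlopen(url) as resp:
        task = json.load(resp)

    return {"task_name": task["taskName"], "status": task["taskStatus"]}
```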


## What's Next: Composing Steps into Workflows

While each step in Hatchet stands on its own, steps become even more powerful when composed into workflows. In the next section, we'll explore how to combine independent steps into declarative workflow definitions, where Hatchet manages their ordering and execution, enabling you to orchestrate complex processes with ease.

[Continue to Understanding Workflows →](./workflows)
37 changes: 37 additions & 0 deletions frontend/docs/pages/home/basics/workers.mdx
@@ -0,0 +1,37 @@
{/* TODO revise this page */}
# Workers in Hatchet

While Hatchet manages the scheduling and orchestration, the workers are the entities that actually execute the individual steps defined within your workflows. Understanding how to deploy and manage these workers efficiently is key to leveraging Hatchet for distributed task execution.

## Overview of Workers

Workers in Hatchet are long-lived processes that await instructions from the Hatchet engine to execute specific steps. If Hatchet is the brain, deciding what needs to be done, the workers are the muscle, carrying out those tasks. Here's what you need to understand about workers:

- **Autonomy:** Workers operate independently across different nodes in your infrastructure, which can be spread across multiple systems or even different cloud environments.
- **Technology Agnostic:** Workers can be written in different programming languages or technologies, provided they can communicate with the Hatchet engine and execute the required steps.
- **Scalability:** You can scale your system horizontally by adding more workers, enabling Hatchet to distribute tasks across a wider set of resources and handle increased loads efficiently.

When you define a workflow in Hatchet, you register the steps or workflows that the node is capable of executing. The Hatchet engine then schedules these steps and assigns them to available workers for execution. The workers receive instructions from the Hatchet engine, execute the steps, and report the results back to the engine on completion.
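
As a minimal sketch of that registration flow, assuming the Python SDK's decorator-based API (`hatchet.workflow`, `hatchet.step`, `worker.register_workflow`):

```python
from hatchet_sdk import Context, Hatchet

hatchet = Hatchet()


@hatchet.workflow(on_events=["task:requested"])
class MyWorkflow:
    @hatchet.step()
    def step1(self, context: Context) -> dict:
        return {"status": "done"}


# The worker declares which workflows it can execute, then listens
# for step runs assigned to it by the Hatchet engine
worker = hatchet.worker("my-worker")
worker.register_workflow(MyWorkflow())
worker.start()
```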

## Best Practices for Workers

To ensure that your Hatchet implementation is robust, scalable, and efficient, adhere to these best practices for setting up and managing your workers:

1. **Reliable Execution Environment:** Deploy your workers in a stable and reliable environment. Ensure that they have sufficient resources to execute the tasks without running into resource contention or other environmental issues.

2. **Monitoring and Logging:** Implement robust monitoring and logging for your workers. Keeping track of worker health, performance, and task execution status is crucial for identifying issues and optimizing performance.

3. **Graceful Error Handling:** Design your workers to handle errors gracefully. They should be able to report execution failures back to Hatchet and, when possible, retry execution based on the configured policies.

4. **Secure Communication:** Ensure that the communication between your workers and the Hatchet engine is secure, particularly if they are distributed across different networks or environments.

5. **Lifecycle Management:** Implement proper lifecycle management for your workers. They should be able to restart automatically in case of critical failures and should support graceful shutdown procedures for maintenance or scaling operations.

6. **Scalability Practices:** Plan for scalability by designing your system to easily add or remove workers based on demand. This might involve using containerization, orchestration tools, or cloud auto-scaling features.

7. **Consistent Updates:** Keep your worker implementations up to date with the latest Hatchet SDKs and ensure that they are compatible with the version of the Hatchet engine you are using.

## Conclusion

While Hatchet is responsible for the high-level orchestration and scheduling of workflows and steps, workers are the essential components that execute the tasks on the ground. By deploying well-managed, efficient workers, you can ensure that your Hatchet-powered system is reliable, scalable, and capable of meeting your distributed task execution needs. Remember, a strong foundation of robust workers is key to harnessing the full capabilities of Hatchet.
