Welcome back! In the previous chapter, we successfully set up our Trigger.dev project, getting ready to build powerful automated systems. Now, it’s time to dive into the fundamental building blocks that make Trigger.dev workflows so resilient and effective: Events, Tasks, and Retries. These three concepts are the bedrock for creating robust, automated workflows and AI agents that gracefully handle the complexities and inevitable failures of real-world production environments.
This chapter will guide you through understanding what events are, how tasks execute reliably, and how Trigger.dev automatically handles failures through intelligent retries. By the end, you’ll be able to create your first resilient workflow, capable of reacting to external signals and executing durable, fault-tolerant operations, boosting your confidence in building production-ready systems.
Understanding Trigger.dev’s Core: Events, Tasks, and Retries
Imagine you’re building a system where different parts need to communicate and perform actions. A user signs up, a file is uploaded, or an external API sends a notification. How do you reliably kick off a series of operations in response to these occurrences, especially when those operations might take a long time or encounter temporary network issues? This is precisely where Trigger.dev’s event-driven, durable execution model shines. It provides the primitives to build systems that are not just automated, but also fault-tolerant and scalable.
Events: The Spark of Your Workflow
At the heart of every Trigger.dev workflow is an Event. Think of an event as a notification or a signal that something significant has just happened in your system or an external service. It’s the “spark” that kicks off your automated process.
- What is an Event? An event is a simple, structured JSON payload describing an occurrence. It’s a lightweight, immutable record of something that happened at a specific point in time, like user.signed.up or order.shipped.
- Why do Events matter? Events enable highly decoupled systems. Instead of tightly coupling components by directly calling functions, parts of your system can simply emit an event. Any workflow interested in that event can then react to it, without needing to know who or what generated it. This decoupling is crucial for building scalable, flexible, and easily maintainable architectures, especially as your system grows and integrates with more services.
⚡ Real-world insight: In production, events are often used to bridge microservices or integrate with third-party webhooks (e.g., Stripe payments, GitHub pushes).
- How do Events work in Trigger.dev? You define a specific event name (e.g., user.signed.up, invoice.processed). When this event is sent to Trigger.dev (either via its SDK, API, or dashboard), any workflow configured to listen for that event will be triggered. Trigger.dev acts as the central nervous system, routing these events to the correct workflows.
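To make the routing idea concrete, here is a minimal in-memory sketch of events being delivered to subscribed workflows. It is not the Trigger.dev SDK: the WorkflowEvent shape, the EventRouter class, and the timestamp field are hypothetical names chosen for illustration only.

```typescript
// Hypothetical in-memory model of event routing; not the Trigger.dev SDK.
interface WorkflowEvent<T = unknown> {
  readonly name: string;      // e.g. "user.signed.up"
  readonly payload: T;        // structured JSON data describing the occurrence
  readonly timestamp: string; // ISO 8601 time at which it happened (assumed field)
}

type Handler = (event: WorkflowEvent) => void;

class EventRouter {
  private readonly handlers = new Map<string, Handler[]>();

  // Subscribe a workflow to an event name.
  on(name: string, handler: Handler): void {
    const existing = this.handlers.get(name) ?? [];
    this.handlers.set(name, [...existing, handler]);
  }

  // Deliver an event to every subscribed workflow; returns how many ran.
  send(event: WorkflowEvent): number {
    const subscribers = this.handlers.get(event.name) ?? [];
    for (const handler of subscribers) handler(event);
    return subscribers.length;
  }
}

const router = new EventRouter();
const received: string[] = [];

// Two independent workflows react to the same event without knowing its sender.
router.on("user.signed.up", (e) => received.push(`welcome:${e.name}`));
router.on("user.signed.up", (e) => received.push(`analytics:${e.name}`));

const delivered = router.send({
  name: "user.signed.up",
  payload: { userId: "u_123" },
  timestamp: new Date().toISOString(),
});
```

The sender only knows the event name. Adding a third subscriber requires no change to the code that emits the event, which is the decoupling described above.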
Tasks: The Durable Workhorses
Once an event triggers a workflow, the actual work is performed by Tasks. A task is a single, self-contained unit of work that Trigger.dev executes reliably.
- What is a Task? In Trigger.dev, a task is essentially an asynchronous function that you define within your workflow using io.runTask(). It represents a specific operation, like sending an email, calling an external API, performing a complex calculation, or interacting with an AI model.
- Why do Tasks matter? Tasks are the core of Trigger.dev’s durable execution model. This means that once a task starts, Trigger.dev ensures it either completes successfully or, in case of failure, retries it until it succeeds (or exhausts its retry attempts). This is crucial for long-running operations or those that interact with external, potentially unreliable services.
🧠 Important: Your code doesn’t need to worry about saving its state or re-running from scratch if the server crashes or the network blips. Trigger.dev handles that for you by checkpointing the workflow’s progress. This means your workflow can pause, be moved to another server, and resume exactly where it left off.
- How do Tasks work? You define a task using await io.runTask('your-task-id', async (payload, step) => { ... });. The io object provides the context for running tasks and interacting with Trigger.dev’s features, while the step context provides powerful capabilities for advanced workflows, which we’ll explore later. For now, think of io.runTask as the reliable wrapper for your business logic.
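The checkpointing idea can be illustrated with a small, synchronous model. This is a simplified sketch and not how Trigger.dev is implemented: CheckpointStore, its runTask method, and the example task IDs are hypothetical. The point is only that results are saved per task ID, so replaying a workflow skips work that already finished.

```typescript
// Hypothetical model of checkpointed tasks; not Trigger.dev's implementation.
class CheckpointStore {
  private readonly results = new Map<string, unknown>();
  executions = 0; // counts how many task bodies actually ran

  // Run `fn` once per task id; on replay, return the saved result instead.
  runTask<T>(id: string, fn: () => T): T {
    if (this.results.has(id)) {
      return this.results.get(id) as T;
    }
    this.executions += 1;
    const result = fn();
    this.results.set(id, result);
    return result;
  }
}

function workflow(store: CheckpointStore, crashAfterFirstTask: boolean): string {
  const user = store.runTask("load-user", () => ({ email: "ada@example.com" }));
  if (crashAfterFirstTask) {
    throw new Error("simulated server crash");
  }
  return store.runTask("send-email", () => `sent to ${user.email}`);
}

const store = new CheckpointStore();
let crashed = false;
try {
  workflow(store, true); // first run dies after "load-user" completes
} catch {
  crashed = true;
}

// Replay: "load-user" is served from the checkpoint, only "send-email" runs.
const outcome = workflow(store, false);
```

This is also why each task needs a stable, unique ID: the ID is the key under which the saved result is found again.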
Retries: Embracing Failure Gracefully
In distributed systems, failures are not exceptions; they are an expected part of daily operations. Network glitches, API rate limits, or temporary service outages can all cause an operation to fail. Instead of crashing, your workflows need to be resilient and self-healing. This is where Retries come in.
- What are Retries? Retries are the automatic re-execution of a failed task. Trigger.dev automatically retries tasks that encounter transient errors, giving them another chance to succeed without any manual intervention.
- Why do Retries matter? Retries are fundamental to building robust systems. They allow your workflows to recover from temporary issues, significantly increasing reliability and reducing operational burden. Without retries, a single momentary network blip could halt an entire critical workflow, requiring costly manual restarts.
⚠️ What can go wrong: While powerful, retries must be used thoughtfully. A task is idempotent when running it multiple times with the same input has the same effect as running it once. If a task is not idempotent, you need to design it carefully to avoid unintended side effects during retries (e.g., sending the same email multiple times).
- How do Retries work? Trigger.dev has built-in retry mechanisms. By default, it retries tasks with an exponential backoff strategy, meaning it waits longer between each subsequent attempt, which reduces load on the failing service. You can also configure the maximum number of attempts (maxAttempts) and the retry delay (retryDelayInMs) for specific tasks to fine-tune this behavior.
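The duplicate-email risk mentioned above is usually handled with an idempotency key. The sketch below is one possible approach under stated assumptions: the sendWelcomeEmail function, the key format, and the in-memory Set are all hypothetical, and a real system would need a durable store with an atomic check-and-write.

```typescript
// Hypothetical idempotency guard; names are illustrative, not from the SDK.
const sentKeys = new Set<string>();
const outbox: string[] = [];

// Sends at most once per idempotency key, however many times it is retried.
function sendWelcomeEmail(idempotencyKey: string, to: string): "sent" | "skipped" {
  if (sentKeys.has(idempotencyKey)) {
    return "skipped"; // a previous attempt already did the work
  }
  outbox.push(to); // stands in for the real email call
  sentKeys.add(idempotencyKey);
  return "sent";
}

const first = sendWelcomeEmail("welcome:u_123", "ada@example.com");
const retry = sendWelcomeEmail("welcome:u_123", "ada@example.com");
```

Deriving the key from stable data (here, the purpose plus a user ID) rather than from the attempt number is what makes a retried call recognizable as a repeat.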
Let’s visualize how these core components interact to form a resilient workflow:
This diagram illustrates the flow: an External System sends an event, which Trigger.dev Cloud receives. This triggers a workflow execution, which executes a task. If the task fails because of a transient error, the retry logic kicks in and re-runs the task with backoff. If the maximum number of retries is reached, the workflow fails, which gives you a point to raise alerts. If the task succeeds, the workflow continues with the next task.
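The same flow can be traced in code. The following is a simulation only: simulateRun, RunTrace, and the factor option are hypothetical, and the doubling factor is an assumption rather than Trigger.dev’s documented default. Delays are recorded instead of slept so the example finishes instantly.

```typescript
// Hypothetical simulation of the retry flow; delays are recorded, not slept.
interface RetryOptions {
  maxAttempts: number;    // total tries, including the first
  retryDelayInMs: number; // base delay before the first retry
  factor?: number;        // backoff multiplier (assumed default: 2)
}

interface RunTrace<T> {
  status: "succeeded" | "failed";
  attempts: number;
  delaysMs: number[]; // wait that would precede each retry
  result?: T;
  lastError?: string;
}

function simulateRun<T>(task: (attempt: number) => T, options: RetryOptions): RunTrace<T> {
  const factor = options.factor ?? 2;
  const delaysMs: number[] = [];
  let lastError = "";
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      const result = task(attempt);
      return { status: "succeeded", attempts: attempt, delaysMs, result };
    } catch (error) {
      lastError = error instanceof Error ? error.message : String(error);
      if (attempt < options.maxAttempts) {
        // Exponential backoff: base, base * factor, base * factor^2, ...
        delaysMs.push(options.retryDelayInMs * Math.pow(factor, attempt - 1));
      }
    }
  }
  return { status: "failed", attempts: options.maxAttempts, delaysMs, lastError };
}

// Transient error: fails on the first two attempts, then succeeds.
const flaky = simulateRun(
  (attempt) => {
    if (attempt < 3) throw new Error("transient");
    return "ok";
  },
  { maxAttempts: 3, retryDelayInMs: 1000 }
);

// Persistent error: never succeeds, so the run fails once attempts run out.
const broken = simulateRun<string>(
  () => {
    throw new Error("invalid credentials");
  },
  { maxAttempts: 3, retryDelayInMs: 1000 }
);
```

In this sketch, flaky ends as succeeded after three attempts with waits of 1000 ms and 2000 ms, while broken ends as failed with the last error preserved for alerting.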
Your First Resilient Workflow: A Step-by-Step Build
Let’s put these concepts into practice. We’ll create a simple workflow that listens for a custom event, performs a “processing” task, and intentionally demonstrates how retries work to recover from a temporary failure.
First, ensure your Trigger.dev development server is running from Chapter 2. If not, navigate to your project directory and run:
npm run dev
This command starts your local Trigger.dev development server and the background process that watches for changes in your src/jobs directory.
Step 1: Create Your Workflow File
Inside your src/jobs directory, create a new file named my-first-workflow.ts. This is where we’ll define our event listener and tasks.
Now, let’s add the basic structure of a Trigger.dev job to src/jobs/my-first-workflow.ts:
// src/jobs/my-first-workflow.ts
import { client } from "../trigger";

client.defineJob({
  id: "my-first-workflow",
  name: "My First Workflow",
  version: "1.0.0",
  // We'll define the trigger and run function here shortly
});
- import { client } from "../trigger";: This line imports the pre-configured Trigger.dev client instance that we set up in Chapter 2. This client is your gateway to defining jobs, sending events, and interacting with the Trigger.dev platform.
- client.defineJob({...});: This is the main function you use to define a workflow (or “job” in Trigger.dev terminology).
  - id: A unique, machine-readable identifier for your workflow. Good practice is to use kebab-case.
  - name: A human-readable name that will appear in the Trigger.dev dashboard.
  - version: A semantic version for your workflow, useful for tracking changes over time.
Next, let’s add the trigger and run properties to our job definition. The trigger specifies when the workflow should run, and run specifies what the workflow should do.
// src/jobs/my-first-workflow.ts
import { client } from "../trigger";

client.defineJob({
  id: "my-first-workflow",
  name: "My First Workflow",
  version: "1.0.0",
  // This is where we define what event triggers this workflow
  trigger: client.on("my.event", {
    schema: {
      type: "object",
      properties: {
        message: { type: "string" },
        shouldFail: { type: "boolean" },
      },
      required: ["message", "shouldFail"],
      additionalProperties: false,
    },
  }),
  // This is the core function that runs when the event is received
  run: async (payload, io, ctx) => {
    // We'll add our tasks here in the next step
  },
});
Let’s break down these additions:
- trigger: client.on("my.event", { schema: ... }): This tells Trigger.dev to listen for an event named my.event.
  - client.on("my.event", ...): This is how you subscribe your workflow to a specific event.
  - schema: This object defines the expected structure of the incoming event payload using JSON Schema. This is excellent for early validation and type safety. Here, we’re expecting a message (string) and shouldFail (boolean). If an incoming event doesn’t match this schema, Trigger.dev will reject it, preventing unexpected errors in your workflow.
- run: async (payload, io, ctx) => { ... }: This is the heart of your workflow, an asynchronous function that executes when my.event is received.
  - payload: This argument contains the data from the my.event that triggered this workflow. Its type is automatically inferred from your schema.
  - io: This is the I/O client, providing methods for performing operations like logging (io.logger), running durable tasks (io.runTask), and interacting with other Trigger.dev features.
  - ctx: This is the context object, providing information about the current execution, including details about retries (ctx.attempt), environment, and more.
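To see what the schema rules above mean in practice, here is a hand-written check that mirrors them. It is a sketch for illustration, not the validator Trigger.dev uses: MyEventPayload, validateMyEvent, and isMyEventPayload are hypothetical names, and the error messages are invented.

```typescript
// Hypothetical hand-written check mirroring the JSON Schema for my.event.
interface MyEventPayload {
  message: string;
  shouldFail: boolean;
}

// Returns a list of problems; an empty list means the payload is valid.
function validateMyEvent(input: unknown): string[] {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return ["payload must be an object"];
  }
  const record = input as Record<string, unknown>;
  const errors: string[] = [];
  if (typeof record["message"] !== "string") {
    errors.push("message must be a string");
  }
  if (typeof record["shouldFail"] !== "boolean") {
    errors.push("shouldFail must be a boolean");
  }
  // additionalProperties: false means unknown keys are rejected too.
  for (const key of Object.keys(record)) {
    if (key !== "message" && key !== "shouldFail") {
      errors.push(`unexpected property: ${key}`);
    }
  }
  return errors;
}

// Type guard so that valid payloads are narrowed to MyEventPayload.
function isMyEventPayload(input: unknown): input is MyEventPayload {
  return validateMyEvent(input).length === 0;
}

const ok = validateMyEvent({ message: "hi", shouldFail: false });
const bad = validateMyEvent({ message: 42, extra: true });
```

The second payload collects three problems at once: a wrong type, a missing required field, and an extra property. That is the kind of detail worth looking for when an event is rejected.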
Finally, let’s add the actual tasks inside the run function, demonstrating the core concepts of tasks and retries.
// src/jobs/my-first-workflow.ts
import { client } from "../trigger";

client.defineJob({
  id: "my-first-workflow",
  name: "My First Workflow",
  version: "1.0.0",
  trigger: client.on("my.event", {
    schema: {
      type: "object",
      properties: {
        message: { type: "string" },
        shouldFail: { type: "boolean" },
      },
      required: ["message", "shouldFail"],
      additionalProperties: false,
    },
  }),
  run: async (payload, io, ctx) => {
    // Log the incoming event payload for observability
    io.logger.info("Received 'my.event' payload", payload);

    // Step 1: Simulate a data processing task that might fail temporarily
    const processed = await io.runTask(
      "process-incoming-data", // Unique ID for this specific task within the workflow
      async (taskPayload) => {
        io.logger.info(`Processing data: ${taskPayload.message}`);
        // Introduce an intentional, temporary failure for demonstration
        // This task will fail if `shouldFail` is true AND it's the first attempt (attempt.number is 1).
        // It will succeed on the second attempt, demonstrating retry recovery.
        if (taskPayload.shouldFail && ctx.attempt.number === 1) {
          io.logger.warn(`Simulating a temporary failure for attempt ${ctx.attempt.number}...`);
          throw new Error("Simulated transient error during processing!");
        }
        io.logger.info("Data processed successfully!");
        return { status: "processed", originalMessage: taskPayload.message };
      },
      // The payload for this specific task (can be different from the event payload)
      payload,
      // Configure retries for this task
      {
        maxAttempts: 3, // Try up to 3 times
        retryDelayInMs: 1000, // Wait 1 second between retries
      }
    );

    // Step 2: Simulate another task that depends on the first one
    // This task will only run if the previous 'process-incoming-data' task succeeds (potentially after retries).
    await io.runTask(
      "send-confirmation",
      async (taskPayload) => {
        io.logger.info(`Sending confirmation for: ${taskPayload.originalMessage}`);
        // In a real application, you might send an email, update a database, notify another service, etc.
        return { confirmationSent: true };
      },
      // Pass the first task's result as this task's payload so originalMessage is defined
      processed
    );

    io.logger.info("Workflow completed successfully!");
  },
});
Here’s a detailed explanation of the new code within the run function:
- io.logger.info(...): This is Trigger.dev’s structured logger. Using io.logger ensures your logs are captured by Trigger.dev and visible in the dashboard, making debugging much easier than relying solely on console.log.
- await io.runTask(...): This is how you define and execute a durable task.
  - "process-incoming-data": This is a unique identifier for this specific task within your workflow. This ID is crucial for Trigger.dev to track the task’s state, enable retries, and ensure durable execution.
  - async (taskPayload) => { ... }: This is the function containing the actual logic for your task. It receives a taskPayload (to which we pass the main event payload) and can perform any asynchronous operations.
  - payload: This is the data we’re passing to our process-incoming-data task. In this case, we’re simply forwarding the entire payload that triggered the workflow.
  - { maxAttempts: 3, retryDelayInMs: 1000 }: This is the retry configuration for this specific task.
    - maxAttempts: 3: Tells Trigger.dev to try this task up to 3 times if it fails.
    - retryDelayInMs: 1000: Specifies a 1-second delay between retry attempts. Trigger.dev often applies exponential backoff by default, even if you specify a fixed delay, to prevent overwhelming a failing service.
- if (taskPayload.shouldFail && ctx.attempt.number === 1): This line is a clever way to simulate a transient failure.
  - taskPayload.shouldFail: This condition comes directly from the event payload.
  - ctx.attempt.number === 1: The ctx.attempt.number tells you which attempt the current task execution is. It starts at 1 for the first attempt. By checking === 1, we ensure the task only fails on its very first execution. On subsequent retries (attempt 2, 3, etc.), this condition will be false, and the task will succeed, demonstrating how retries gracefully handle temporary issues.
- await io.runTask("send-confirmation", ...): This defines a second task. Notice that this task is placed after the first io.runTask. Due to Trigger.dev’s durable execution, this send-confirmation task will only start once process-incoming-data has successfully completed (which might involve one or more retries). This sequential execution is guaranteed.
Step 2: Trigger Your Workflow
With your Trigger.dev development server running (via npm run dev), it automatically detects changes to src/jobs and registers your new workflow. You can now trigger it!
There are a few ways to send an event to Trigger.dev, but for local development, the dashboard is often the easiest.
Using the Trigger.dev Dashboard (Recommended for local dev):
- Open your browser to the Trigger.dev dashboard, usually at http://localhost:8080.
- Navigate to the “Events” section on the left sidebar.
- Click the “Send Event” button (usually in the top right).
- For the “Event Name”, type my.event (this must exactly match the client.on("my.event", ...) you defined).
- For the “Payload (JSON)”, enter the following JSON. This payload includes shouldFail: true to demonstrate the retry mechanism.

    { "message": "Hello from Trigger.dev!", "shouldFail": true }

- Click “Send Event”.
Using curl (for quick API testing):
Open a new terminal window and run the following command. Remember to replace <YOUR_DEV_API_KEY> with the TRIGGER_API_KEY found in your .env file from Chapter 2.

    curl -X POST http://localhost:8080/api/v1/events \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer <YOUR_DEV_API_KEY>" \
      -d '{ "name": "my.event", "payload": { "message": "Hello via curl!", "shouldFail": true } }'
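If you would rather send the event from a script, the curl command can be expressed as a small TypeScript helper. This sketch only builds the request: the URL, headers, and body shape are copied from the curl example above and are not independently verified, and buildEventRequest and EventRequest are hypothetical names.

```typescript
// Sketch only: URL and body shape are copied from the curl example in this chapter.
interface EventRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildEventRequest(
  apiKey: string,
  name: string,
  payload: Record<string, unknown>,
  baseUrl = "http://localhost:8080" // assumed local dev address from this chapter
): EventRequest {
  return {
    url: `${baseUrl}/api/v1/events`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ name, payload }),
  };
}

// Replace the placeholder with the TRIGGER_API_KEY from your .env file.
const request = buildEventRequest("<YOUR_DEV_API_KEY>", "my.event", {
  message: "Hello from TypeScript!",
  shouldFail: true,
});

// To send it:
// await fetch(request.url, { method: request.method, headers: request.headers, body: request.body });
```

Keeping request construction separate from sending makes it easy to log or inspect exactly what will go over the wire before you trigger a real run.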
Step 3: Observe Execution and Retries
After sending the event, switch back to your Trigger.dev dashboard (http://localhost:8080).
- Go to the “Runs” section on the left sidebar. You should see a new run entry for “My First Workflow”.
- Click on the run ID to see its detailed execution timeline.
- You’ll observe the process-incoming-data task initially failing (its status will briefly show “Retrying”), then pausing for 1 second, and finally succeeding on the second attempt.
- After process-incoming-data successfully completes, the send-confirmation task will then execute successfully.
- Check your development server’s console output (where npm run dev is running). You’ll see the io.logger.info messages and, critically, the io.logger.warn message during the simulated failure, followed by the successful log on the retry.
This hands-on experience clearly demonstrates:
- How an event (my.event) kicks off a workflow.
- How tasks (process-incoming-data, send-confirmation) encapsulate durable units of work.
- How Trigger.dev automatically retries failed tasks, making your workflow resilient to transient errors.
Mini-Challenge: Enhancing Your Workflow
Now it’s your turn to make a small modification! This challenge will reinforce your understanding of event payloads and conditional logic within tasks.
Challenge: Modify the my-first-workflow.ts file.
- Add a new property to the my.event schema called priority (type string, can be “high” or “low”). Make it a required property.
- Update the run function to check the priority from the incoming payload.
- If priority is “high”, make the process-incoming-data task simulate a failure twice (meaning it should succeed on the third attempt).
- If priority is “low”, the process-incoming-data task should never fail, regardless of the shouldFail flag.
Hint:
- Remember to update the schema in client.on() to include priority.
- Inside io.runTask, you can use taskPayload.priority and ctx.attempt.number to control the simulated failure logic.
- Test with new payloads from the dashboard:
  - High priority, fails twice: {"message": "High priority data", "shouldFail": true, "priority": "high"}
  - Low priority, never fails: {"message": "Low priority data", "shouldFail": true, "priority": "low"}
  - Low priority, never fails (even if shouldFail is true): {"message": "Another low priority", "shouldFail": true, "priority": "low"}
What to observe/learn: This challenge reinforces how to use event payload data to dynamically alter workflow behavior and further demonstrates the power of ctx.attempt.number in managing retry logic for different scenarios. You’ll also practice modifying your event schema and observing its impact on workflow execution.
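If you get stuck, it can help to isolate the decision as a pure function before wiring it into the task. The sketch below is one possible shape, not the only answer: shouldSimulateFailure and the Priority type are hypothetical, and it assumes that high-priority events fail on attempts 1 and 2 regardless of shouldFail.

```typescript
// One possible shape for the challenge logic; an assumption, not the only answer.
type Priority = "high" | "low";

// Decides whether the current attempt should throw the simulated error.
function shouldSimulateFailure(priority: Priority, attemptNumber: number): boolean {
  if (priority === "low") {
    return false; // low priority never fails, whatever shouldFail says
  }
  return attemptNumber <= 2; // high priority fails on attempts 1 and 2
}

// Walk through the three attempts allowed by maxAttempts: 3.
const highPriorityAttempts = [1, 2, 3].map((n) => shouldSimulateFailure("high", n));
const lowPriorityAttempts = [1, 2, 3].map((n) => shouldSimulateFailure("low", n));
```

Note that failing twice only works because the task allows three attempts. With a lower limit, the high-priority run would exhaust its retries before it could succeed.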
Common Pitfalls & Troubleshooting Basic Workflows
Even with simple workflows, a few common issues can arise. Knowing how to spot and fix them will save you significant time and frustration.
Mismatched Event Payloads:
- Pitfall: You send an event payload that doesn’t match the schema defined in client.on(). This can lead to validation errors, prevent your workflow from triggering, or result in undefined values in your payload within the run function.
- Troubleshooting:
  - Always check the “Events” tab in the Trigger.dev dashboard for validation errors. The error message will often tell you precisely which property is missing, has the wrong type, or has an unexpected format.
  - Adjust your event schema definition in client.on() or modify the payload you are sending to match.
🧠 Important: Leverage TypeScript’s type inference. When you define your schema, Trigger.dev’s SDK will automatically provide accurate types for your payload in the run function. This catches many potential issues at compile time before your code even runs.
Too Many Retries (or Too Few):
- Pitfall: A task can waste time and resources on retries that will never succeed if maxAttempts is set too high for a persistent error (e.g., an incorrect API key that will never work). Conversely, a task can fail permanently too quickly if maxAttempts is too low for a truly transient error.
- Troubleshooting:
  - Identify Error Type: Determine if the error is truly transient (e.g., network timeout, rate limit) or persistent (e.g., invalid credentials, logic bug). For persistent errors, retries won’t help; they require a code change or configuration fix.
  - Review Configuration: Check the maxAttempts and retryDelayInMs configurations for your io.runTask calls. Tune these based on the expected nature of the external service.
  - Dashboard Insights: Use the run details in the Trigger.dev dashboard to see the retry count and the specific error messages from each attempt. This often reveals if the error is changing (transient) or staying the same (persistent).
🔥 Optimization / Pro tip: For critical tasks interacting with external APIs, consider implementing a circuit breaker pattern in addition to retries. This can prevent your system from repeatedly hammering a failing external service, giving it time to recover.
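For readers unfamiliar with the circuit breaker pattern mentioned in the tip, here is a minimal sketch. It is a generic illustration and not part of Trigger.dev: the CircuitBreaker class, its threshold and cooldown parameters, and the injected clock are all hypothetical choices made so the example can run without waiting.

```typescript
// Minimal circuit breaker sketch; thresholds and names are illustrative.
type BreakerState = "closed" | "open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;
  state: BreakerState = "closed";

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  call<T>(operation: () => T): T {
    if (this.state === "open" && this.now() - this.openedAt < this.cooldownMs) {
      // Still cooling down: fail fast without touching the remote service.
      throw new Error("circuit open: call skipped");
    }
    try {
      const result = operation(); // closed, or a trial call after the cooldown
      this.failures = 0;
      this.state = "closed";
      return result;
    } catch (error) {
      this.failures += 1;
      if (this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}

let clock = 0; // fake clock in milliseconds, so the example needs no real waiting
const breaker = new CircuitBreaker(2, 5000, () => clock);
const log: string[] = [];
const failing = (): string => {
  throw new Error("503 from API");
};

for (let i = 0; i < 3; i++) {
  try {
    breaker.call(failing);
  } catch (error) {
    log.push(error instanceof Error ? error.message : String(error));
  }
}

clock = 6000; // cooldown has passed, so one trial call is allowed through
const recovered = breaker.call(() => "ok");
```

After two real failures the third call is rejected locally, which is the behavior that spares a struggling service from extra traffic while your retries back off.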
Local Development Quirks:
- Pitfall: Your workflow isn’t being triggered, or recent code changes aren’t reflected in your running application.
- Troubleshooting:
  - npm run dev Status: Ensure your npm run dev process is still running in your terminal. If it crashed, restart it.
  - Terminal Logs: Check the terminal where npm run dev is running for any errors or messages about job registration. Trigger.dev should log when it successfully registers your my-first-workflow.ts file.
  - Environment Variables: Verify that your TRIGGER_API_KEY and TRIGGER_PUBLIC_KEY in your .env file are correct and haven’t been accidentally changed. These are essential for your local client to connect to the Trigger.dev dev server.
  - Restart: Sometimes, simply stopping (Ctrl+C) and restarting npm run dev can resolve issues with file change detection or stale configurations.
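The environment variable check in particular is easy to automate. The helper below is a hypothetical sketch: findMissingEnv is not part of any SDK, the variable names are the two mentioned in this chapter, and the key value shown is a made-up placeholder.

```typescript
// Hypothetical helper; the variable names come from this chapter's .env file.
const REQUIRED_KEYS = ["TRIGGER_API_KEY", "TRIGGER_PUBLIC_KEY"] as const;

// Returns the names of required variables that are missing or blank.
function findMissingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_KEYS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}

// "tr_dev_example" is a placeholder value, not a real key format guarantee.
const missing = findMissingEnv({
  TRIGGER_API_KEY: "tr_dev_example",
  TRIGGER_PUBLIC_KEY: " ",
});
```

In a Node.js project you could call findMissingEnv(process.env) at startup and stop with a clear message if the returned list is not empty, instead of debugging a silent connection failure later.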
Summary: Your Workflow Foundation
Congratulations! You’ve just built and observed your first resilient workflow with Trigger.dev. We covered three crucial concepts that form the very foundation of event-driven, durable systems:
- Events act as the triggers, initiating your workflows based on external or internal signals, enabling loose coupling.
- Tasks are the durable units of work, executed reliably by Trigger.dev, ensuring your operations complete even through transient failures by preserving state and resuming execution.
- Retries provide built-in fault tolerance, automatically re-attempting failed tasks with intelligent backoff to recover from temporary issues, significantly improving system reliability.
Understanding these fundamentals is key to building any automation or AI agent with Trigger.dev. You now have a solid foundation for creating event-driven, robust applications that can withstand the challenges of production environments.
In the next chapter, we’ll explore more advanced workflow patterns, including how to schedule tasks for future execution and manage dependencies between different steps. Get ready to add another layer of sophistication to your Trigger.dev skills!
This page is AI-assisted and reviewed. It references official documentation and recognized resources where relevant.