Welcome back! In the previous chapter, we set up our Trigger.dev project. Now it’s time to look at the three building blocks that make Trigger.dev workflows resilient: Events, Tasks, and Retries. Together, they let automated workflows and AI agents handle the failures that are inevitable in real-world production environments.

This chapter explains what events are, how tasks execute reliably, and how Trigger.dev handles failures through automatic retries. By the end, you’ll have built your first resilient workflow: one that reacts to an external signal and runs durable, fault-tolerant operations.

Understanding Trigger.dev’s Core: Events, Tasks, and Retries

Imagine you’re building a system where different parts need to communicate and perform actions. A user signs up, a file is uploaded, or an external API sends a notification. How do you reliably kick off a series of operations in response to these occurrences, especially when those operations might take a long time or encounter temporary network issues? This is precisely where Trigger.dev’s event-driven, durable execution model shines. It provides the primitives to build systems that are not just automated, but also fault-tolerant and scalable.

Events: The Spark of Your Workflow

At the heart of every Trigger.dev workflow is an Event. Think of an event as a notification or a signal that something significant has just happened in your system or an external service. It’s the “spark” that kicks off your automated process.

  • What is an Event? An event is a simple, structured JSON payload describing an occurrence. It’s a lightweight, immutable record of something that happened at a specific point in time, like user.signed.up or order.shipped.
  • Why do Events matter? Events enable highly decoupled systems. Instead of tightly coupling components by directly calling functions, parts of your system can simply emit an event. Any workflow interested in that event can then react to it, without needing to know who or what generated it. This decoupling is crucial for building scalable, flexible, and easily maintainable architectures, especially as your system grows and integrates with more services.
    • ⚡ Real-world insight: In production, events are often used to bridge microservices or integrate with third-party webhooks (e.g., Stripe payments, GitHub pushes).
  • How do Events work in Trigger.dev? You define a specific event name (e.g., user.signed.up, invoice.processed). When this event is sent to Trigger.dev (either via its SDK, API, or dashboard), any workflow configured to listen for that event will be triggered. Trigger.dev acts as the central nervous system, routing these events to the correct workflows.
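
To make the routing idea concrete, here is a minimal in-memory sketch of name-based event dispatch. It is not Trigger.dev’s implementation: `EmittedEvent`, `WorkflowHandler`, `EventRouter`, `subscribe`, and `emit` are hypothetical names chosen for illustration, and a real platform would persist events rather than hold them in memory.

```typescript
// Hypothetical in-memory event router; not part of the Trigger.dev SDK.
type EmittedEvent = {
  readonly name: string; // e.g. "user.signed.up"
  readonly payload: Readonly<Record<string, unknown>>;
  readonly timestamp: string; // ISO 8601 time the event was recorded
};

type WorkflowHandler = (emitted: EmittedEvent) => void;

class EventRouter {
  private readonly handlers = new Map<string, WorkflowHandler[]>();

  // Register a workflow-like handler for one event name.
  subscribe(eventName: string, handler: WorkflowHandler): void {
    const existing = this.handlers.get(eventName) ?? [];
    this.handlers.set(eventName, [...existing, handler]);
  }

  // Deliver the event to every handler subscribed to its name.
  // Returns how many handlers were invoked.
  emit(eventName: string, payload: Record<string, unknown>): number {
    const record: EmittedEvent = {
      name: eventName,
      payload: Object.freeze({ ...payload }),
      timestamp: new Date().toISOString(),
    };
    const subscribers = this.handlers.get(eventName) ?? [];
    for (const handler of subscribers) handler(record);
    return subscribers.length;
  }
}

const router = new EventRouter();
const received: string[] = [];
router.subscribe("user.signed.up", (e) => received.push(`welcome:${String(e.payload["userId"])}`));
router.subscribe("user.signed.up", (e) => received.push(`audit:${e.name}`));

const delivered = router.emit("user.signed.up", { userId: "u_123" });
const ignored = router.emit("order.shipped", { orderId: "o_1" });
console.log(delivered, ignored, received);
```

The emitter never references the handlers directly, which is the decoupling described above: a second subscriber was added without touching the code that emits `user.signed.up`.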

Tasks: The Durable Workhorses

Once an event triggers a workflow, the actual work is performed by Tasks. A task is a single, self-contained unit of work that Trigger.dev executes reliably.

  • What is a Task? In Trigger.dev, a task is essentially an asynchronous function that you define within your workflow using io.runTask(). It represents a specific operation, like sending an email, calling an external API, performing a complex calculation, or interacting with an AI model.
  • Why do Tasks matter? Tasks are the core of Trigger.dev’s durable execution model. This means that once a task starts, Trigger.dev ensures it either completes successfully or, in case of failure, retries it until it succeeds (or exhausts its retry attempts). This is crucial for long-running operations or those that interact with external, potentially unreliable services.
    • 🧠 Important: Your code doesn’t need to worry about saving its state or re-running from scratch if the server crashes or the network blips. Trigger.dev handles that for you by checkpointing the workflow’s progress. This means your workflow can literally pause, be moved to another server, and resume exactly where it left off, making it incredibly robust.
  • How do Tasks work? You define a task using await io.runTask('your-task-id', async (payload, step) => { ... });. The io object provides the context for running tasks and interacting with Trigger.dev’s features, while the step context provides powerful capabilities for advanced workflows, which we’ll explore later. For now, think of io.runTask as the reliable wrapper for your business logic.
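
The checkpointing idea can be illustrated with a small sketch: each task’s result is stored under its task ID, so replaying the workflow skips tasks that already completed. This is a simplified, synchronous model under stated assumptions, not Trigger.dev’s implementation; `CheckpointStore`, `runCheckpointed`, and `sampleWorkflow` are hypothetical names.

```typescript
// Simplified, synchronous model of checkpointing. Real tasks are asynchronous
// and real checkpoints live in durable storage, not in a Map.
type CheckpointStore = Map<string, unknown>;

function runCheckpointed<T>(store: CheckpointStore, taskId: string, work: () => T): T {
  if (store.has(taskId)) {
    // Completed in an earlier execution: reuse the saved result, skip the work.
    return store.get(taskId) as T;
  }
  const result = work();
  store.set(taskId, result); // checkpoint only after the work succeeds
  return result;
}

const executions: string[] = [];
let crashOnce = true;

function sampleWorkflow(store: CheckpointStore): string {
  const user = runCheckpointed(store, "load-user", () => {
    executions.push("load-user");
    return { id: "u_123" };
  });
  return runCheckpointed(store, "send-email", () => {
    executions.push("send-email");
    if (crashOnce) {
      crashOnce = false;
      throw new Error("simulated crash");
    }
    return `email sent to ${user.id}`;
  });
}

const checkpoints: CheckpointStore = new Map();
let firstRunError = "";
try {
  sampleWorkflow(checkpoints);
} catch (err) {
  firstRunError = err instanceof Error ? err.message : String(err);
}
// Replay the workflow: "load-user" is served from the checkpoint.
const resumed = sampleWorkflow(checkpoints);
console.log(firstRunError, resumed, executions);
```

In this model the second run executes only `send-email`, which is why each task needs a stable, unique ID: the ID is the key under which its result is remembered.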

Retries: Embracing Failure Gracefully

In distributed systems, failures are not exceptions; they are an expected part of daily operations. Network glitches, API rate limits, or temporary service outages can all cause an operation to fail. Instead of crashing, your workflows need to be resilient and self-healing. This is where Retries come in.

  • What are Retries? Retries are the automatic re-execution of a failed task. Trigger.dev automatically retries tasks that encounter transient errors, giving them another chance to succeed without any manual intervention.
  • Why do Retries matter? Retries are fundamental to building robust systems. They allow your workflows to recover from temporary issues, significantly increasing reliability and reducing operational burden. Without retries, a single momentary network blip could halt an entire critical workflow, requiring costly manual restarts.
    • ⚠️ What can go wrong: While powerful, retries must be used thoughtfully. A task is idempotent when running it multiple times with the same input has the same effect as running it once. If a task is not idempotent, design it carefully to avoid unintended side effects during retries (e.g., sending the same email multiple times).
  • How do Retries work? Trigger.dev has built-in, intelligent retry mechanisms. By default, it will retry tasks with an exponential backoff strategy, meaning it waits longer between each subsequent attempt, reducing load on the failing service. You can also configure the maximum number of attempts (maxAttempts) and the retry delay (retryDelayInMs) for specific tasks to fine-tune this behavior.
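
As a rough illustration of exponential backoff, the sketch below computes the wait between attempts. The multiplier and the cap are assumptions picked for the example, not Trigger.dev’s documented defaults, and `BackoffOptions`, `backoffDelayMs`, and `backoffSchedule` are hypothetical names.

```typescript
// Illustrative backoff calculation; the factor and cap are assumptions,
// not Trigger.dev's documented defaults.
type BackoffOptions = {
  baseDelayMs: number; // delay before the first retry
  factor: number; // multiplier applied after each failed attempt
  maxDelayMs: number; // upper bound on any single delay
};

// Delay to wait after the given failed attempt (1-based) before retrying.
function backoffDelayMs(failedAttempt: number, options: BackoffOptions): number {
  const raw = options.baseDelayMs * Math.pow(options.factor, failedAttempt - 1);
  return Math.min(raw, options.maxDelayMs);
}

// Delays between attempts for a task allowed `maxAttempts` tries in total.
function backoffSchedule(maxAttempts: number, options: BackoffOptions): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    delays.push(backoffDelayMs(attempt, options));
  }
  return delays;
}

const schedule = backoffSchedule(5, { baseDelayMs: 1000, factor: 2, maxDelayMs: 5000 });
console.log(schedule); // [1000, 2000, 4000, 5000]
```

Five attempts produce four waits, and the last one is capped at 5 seconds instead of growing to 8. Production systems often add random jitter to each delay so that many failing tasks do not retry at the same instant.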

Let’s visualize how these core components interact to form a resilient workflow:

flowchart TD
    A[External System] -->|Sends Event| C[Triggerdev Cloud]
    C -->|Triggers Workflow| E[Workflow Execution]
    E --> F[Execute Task]
    F --> G{Task Succeeded}
    G -->|Yes| H[Workflow Continues]
    G -->|No Transient Error| I[Retry Logic]
    I -->|Retry Task| F
    I -->|Max Retries Reached| J[Workflow Failed]

This diagram illustrates the flow: an external system sends an event to Trigger.dev Cloud, which triggers a workflow execution that runs a task. If the task succeeds, the workflow continues. If it fails with a transient error, the retry logic re-runs the task after a backoff delay. Once the maximum number of retries is reached, the workflow is marked as failed, at which point you can raise an alert.
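
The loop in the diagram can be sketched in a few lines. This is a simplified model rather than Trigger.dev’s scheduler: it is synchronous for brevity, the `wait` callback stands in for a real timer, and `RetryOutcome` and `runWithRetries` are hypothetical names.

```typescript
// Sketch of the retry loop from the diagram. Synchronous for brevity; the
// `wait` callback stands in for a real timer so the example runs instantly.
type RetryOutcome<T> =
  | { status: "succeeded"; value: T; attempts: number }
  | { status: "failed"; error: string; attempts: number };

function runWithRetries<T>(
  task: (attempt: number) => T,
  maxAttempts: number,
  wait: (failedAttempt: number) => void,
): RetryOutcome<T> {
  let lastError = "task was never attempted";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // "Execute Task" followed by the "Task Succeeded: Yes" branch.
      return { status: "succeeded", value: task(attempt), attempts: attempt };
    } catch (err) {
      // "Retry Logic": remember the error and back off before the next try.
      lastError = err instanceof Error ? err.message : String(err);
      if (attempt < maxAttempts) wait(attempt);
    }
  }
  // "Max Retries Reached" leads to "Workflow Failed".
  return { status: "failed", error: lastError, attempts: maxAttempts };
}

const waits: number[] = [];
const recovered = runWithRetries(
  (attempt) => {
    if (attempt === 1) throw new Error("transient failure");
    return "processed";
  },
  3,
  (failedAttempt) => waits.push(failedAttempt),
);
const exhausted = runWithRetries<string>(
  () => {
    throw new Error("still failing");
  },
  3,
  () => undefined,
);
console.log(recovered, exhausted, waits);
```

The first call recovers on its second attempt after a single wait, while the second call uses all three attempts and reports the last error, matching the two exits of the diagram.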

Your First Resilient Workflow: A Step-by-Step Build

Let’s put these concepts into practice. We’ll create a simple workflow that listens for a custom event, performs a “processing” task, and intentionally demonstrates how retries work to recover from a temporary failure.

First, ensure your Trigger.dev development server is running from Chapter 2. If not, navigate to your project directory and run:

npm run dev

This command starts your local Trigger.dev development server and the background process that watches for changes in your src/jobs directory.

Step 1: Create Your Workflow File

Inside your src/jobs directory, create a new file named my-first-workflow.ts. This is where we’ll define our event listener and tasks.

Now, let’s add the basic structure of a Trigger.dev job to src/jobs/my-first-workflow.ts:

// src/jobs/my-first-workflow.ts
import { client } from "../trigger";

client.defineJob({
  id: "my-first-workflow",
  name: "My First Workflow",
  version: "1.0.0",
  // We'll define the trigger and run function here shortly
});

Let’s break this down:

  • import { client } from "../trigger";: This line imports the pre-configured Trigger.dev client instance that we set up in Chapter 2. This client is your gateway to defining jobs, sending events, and interacting with the Trigger.dev platform.
  • client.defineJob({...});: This is the main function you use to define a workflow (or “job” in Trigger.dev terminology).
    • id: A unique, machine-readable identifier for your workflow. Good practice is to use kebab-case.
    • name: A human-readable name that will appear in the Trigger.dev dashboard.
    • version: A semantic version for your workflow, useful for tracking changes over time.

Next, let’s add the trigger and run properties to our job definition. The trigger specifies when the workflow should run, and run specifies what the workflow should do.

// src/jobs/my-first-workflow.ts
import { client } from "../trigger";

client.defineJob({
  id: "my-first-workflow",
  name: "My First Workflow",
  version: "1.0.0",
  // This is where we define what event triggers this workflow
  trigger: client.on("my.event", {
    schema: {
      type: "object",
      properties: {
        message: { type: "string" },
        shouldFail: { type: "boolean" },
      },
      required: ["message", "shouldFail"],
      additionalProperties: false,
    },
  }),
  // This is the core function that runs when the event is received
  run: async (payload, io, ctx) => {
    // We'll add our tasks here in the next step
  },
});

Let’s break down these additions:

  • trigger: client.on("my.event", { schema: ... }): This tells Trigger.dev to listen for an event named my.event.
    • client.on("my.event", ...): This is how you subscribe your workflow to a specific event.
    • schema: This object defines the expected structure of the incoming event payload using JSON Schema. This is excellent for early validation and type safety. Here, we’re expecting a message (string) and shouldFail (boolean). If an incoming event doesn’t match this schema, Trigger.dev will reject it, preventing unexpected errors in your workflow.
  • run: async (payload, io, ctx) => { ... };: This is the heart of your workflow, an asynchronous function that executes when my.event is received.
    • payload: This argument contains the data from the my.event that triggered this workflow. Its type is automatically inferred from your schema.
    • io: This is the I/O client, providing methods for performing operations like logging (io.logger), running durable tasks (io.runTask), and interacting with other Trigger.dev features.
    • ctx: This is the context object, providing information about the current execution, including details about retries (ctx.attempt), environment, and more.
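
To show what the schema’s rules mean in practice, here is a hand-written validator that mirrors them: two required properties with fixed types and no extra keys. It is illustrative only, since Trigger.dev performs its own validation; `MyEventPayload`, `ValidationResult`, and `validateMyEvent` are hypothetical names.

```typescript
// Hand-written validator mirroring the schema's rules. Illustrative only:
// `MyEventPayload` and `validateMyEvent` are hypothetical names.
type MyEventPayload = { message: string; shouldFail: boolean };

type ValidationResult =
  | { ok: true; payload: MyEventPayload }
  | { ok: false; errors: string[] };

function validateMyEvent(input: unknown): ValidationResult {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return { ok: false, errors: ["payload must be an object"] };
  }
  const candidate = input as Record<string, unknown>;
  const errors: string[] = [];

  // required: ["message", "shouldFail"], each with its declared type
  if (typeof candidate["message"] !== "string") errors.push("message must be a string");
  if (typeof candidate["shouldFail"] !== "boolean") errors.push("shouldFail must be a boolean");

  // additionalProperties: false
  for (const key of Object.keys(candidate)) {
    if (key !== "message" && key !== "shouldFail") errors.push(`unexpected property: ${key}`);
  }

  if (errors.length > 0) return { ok: false, errors };
  return {
    ok: true,
    payload: {
      message: candidate["message"] as string,
      shouldFail: candidate["shouldFail"] as boolean,
    },
  };
}

const accepted = validateMyEvent({ message: "Hello", shouldFail: true });
const rejected = validateMyEvent({ message: 42, extra: 1 });
console.log(accepted, rejected);
```

The second payload fails three checks at once: `message` has the wrong type, `shouldFail` is missing, and `extra` is not allowed. Collecting every error rather than stopping at the first makes a rejected event much easier to diagnose.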

Finally, let’s add the actual tasks inside the run function, demonstrating the core concepts of tasks and retries.

// src/jobs/my-first-workflow.ts
import { client } from "../trigger";

client.defineJob({
  id: "my-first-workflow",
  name: "My First Workflow",
  version: "1.0.0",
  trigger: client.on("my.event", {
    schema: {
      type: "object",
      properties: {
        message: { type: "string" },
        shouldFail: { type: "boolean" },
      },
      required: ["message", "shouldFail"],
      additionalProperties: false,
    },
  }),
  run: async (payload, io, ctx) => {
    // Log the incoming event payload for observability
    io.logger.info("Received 'my.event' payload", payload);

    // Step 1: Simulate a data processing task that might fail temporarily.
    // Capture the task's return value so the next task can use it.
    const processed = await io.runTask(
      "process-incoming-data", // Unique ID for this specific task within the workflow
      async (taskPayload) => {
        io.logger.info(`Processing data: ${taskPayload.message}`);

        // Introduce an intentional, temporary failure for demonstration
        // This task will fail if `shouldFail` is true AND it's the first attempt (attempt.number is 1).
        // It will succeed on the second attempt, demonstrating retry recovery.
        if (taskPayload.shouldFail && ctx.attempt.number === 1) {
          io.logger.warn(`Simulating a temporary failure for attempt ${ctx.attempt.number}...`);
          throw new Error("Simulated transient error during processing!");
        }

        io.logger.info("Data processed successfully!");
        return { status: "processed", originalMessage: taskPayload.message };
      },
      // The payload for this specific task (can be different from the event payload)
      payload,
      // Configure retries for this task
      {
        maxAttempts: 3, // Try up to 3 times
        retryDelayInMs: 1000, // Wait 1 second between retries
      }
    );

    // Step 2: Simulate another task that depends on the first one
    // This task will only run if the previous 'process-incoming-data' task succeeds (potentially after retries).
    // It reads the first task's result through the `processed` variable.
    await io.runTask("send-confirmation", async () => {
      io.logger.info(`Sending confirmation for: ${processed.originalMessage}`);
      // In a real application, you might send an email, update a database, notify another service, etc.
      return { confirmationSent: true };
    });

    io.logger.info("Workflow completed successfully!");
  },
});

Here’s a detailed explanation of the new code within the run function:

  • io.logger.info(...): This is Trigger.dev’s structured logger. Using io.logger ensures your logs are captured by Trigger.dev and visible in the dashboard, making debugging much easier than relying solely on console.log.
  • await io.runTask(...): This is how you define and execute a durable task.
    • "process-incoming-data": This is a unique identifier for this specific task within your workflow. This ID is crucial for Trigger.dev to track the task’s state, enable retries, and ensure durable execution.
    • async (taskPayload) => { ... }: This is the function containing the actual logic for your task. It receives a taskPayload (which we’re passing the main event payload to) and can perform any asynchronous operations.
    • payload: This is the data we’re passing to our process-incoming-data task. In this case, we’re simply forwarding the entire payload that triggered the workflow.
    • { maxAttempts: 3, retryDelayInMs: 1000 }: This is the retry configuration for this specific task.
      • maxAttempts: 3: Tells Trigger.dev to try this task up to 3 times if it fails.
      • retryDelayInMs: 1000: Specifies a 1-second delay before a retry. With the exponential backoff described earlier, later retries may wait longer than this base value, which avoids overwhelming a failing service.
    • if (taskPayload.shouldFail && ctx.attempt.number === 1): This line is a clever way to simulate a transient failure.
      • taskPayload.shouldFail: This condition comes directly from the event payload.
      • ctx.attempt.number === 1: The ctx.attempt.number tells you which attempt the current task execution is. It starts at 1 for the first attempt. By checking === 1, we ensure the task only fails on its very first execution. On subsequent retries (attempt 2, 3, etc.), this condition will be false, and the task will succeed, demonstrating how retries gracefully handle temporary issues.
  • await io.runTask("send-confirmation", ...): This defines a second task. Notice that it is placed after the first io.runTask call. Because that first call is awaited, send-confirmation starts only once process-incoming-data has completed successfully (which might involve one or more retries). If the first task exhausts its attempts, its error propagates out of the run function and send-confirmation never runs.
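
Because a retried task re-runs its whole callback, any side effect inside it can happen more than once. One common defense is an idempotency key, sketched below. The `ConfirmationSender` class, the in-memory key set, and the `run_abc` identifier are hypothetical; a real system would persist the keys in a database or rely on the idempotency-key support of the service being called.

```typescript
// Sketch of an idempotency guard for a side effect that may be retried.
// `ConfirmationSender` and its in-memory key set are hypothetical.
class ConfirmationSender {
  private readonly sentKeys = new Set<string>();
  readonly outbox: string[] = [];

  // Returns true if the message was sent now, false if it was a duplicate.
  send(idempotencyKey: string, message: string): boolean {
    if (this.sentKeys.has(idempotencyKey)) return false;
    this.outbox.push(message);
    this.sentKeys.add(idempotencyKey);
    return true;
  }
}

const sender = new ConfirmationSender();
const runId = "run_abc"; // hypothetical run identifier
const confirmationKey = `${runId}:send-confirmation`;

const firstSend = sender.send(confirmationKey, "Your data was processed");
const retrySend = sender.send(confirmationKey, "Your data was processed"); // simulated retry
console.log(firstSend, retrySend, sender.outbox.length); // true false 1
```

Deriving the key from the run and the task ID means every retry of the same task in the same run maps to the same key, while a new run gets a fresh one.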

Step 2: Trigger Your Workflow

With your Trigger.dev development server running (via npm run dev), it automatically detects changes to src/jobs and registers your new workflow. You can now trigger it!

There are a few ways to send an event to Trigger.dev, but for local development, the dashboard is often the easiest.

  1. Using the Trigger.dev Dashboard (Recommended for local dev):

    • Open your browser to the Trigger.dev dashboard, usually at http://localhost:8080.
    • Navigate to the “Events” section on the left sidebar.
    • Click the “Send Event” button (usually in the top right).
    • For the “Event Name”, type my.event (this must exactly match the client.on("my.event", ...) you defined).
    • For the “Payload (JSON)”, enter the following JSON. This payload includes shouldFail: true to demonstrate the retry mechanism.
      {
        "message": "Hello from Trigger.dev!",
        "shouldFail": true
      }
      
    • Click “Send Event”.
  2. Using curl (for quick API testing): Open a new terminal window and run the following command. Remember to replace <YOUR_DEV_API_KEY> with the TRIGGER_API_KEY found in your .env file from Chapter 2.

    curl -X POST http://localhost:8080/api/v1/events \
         -H "Content-Type: application/json" \
         -H "Authorization: Bearer <YOUR_DEV_API_KEY>" \
         -d '{ "name": "my.event", "payload": { "message": "Hello via curl!", "shouldFail": true } }'
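
The same request can be assembled in TypeScript. The sketch below only builds the request, without sending it, so it runs with no server. The URL, headers, and body shape are copied from the curl command above and should be checked against the API reference for your installed version; `SendEventRequest` and `buildSendEventRequest` are hypothetical helpers, not SDK functions.

```typescript
// Builds (but does not send) the same request as the curl command.
// `buildSendEventRequest` is a hypothetical helper, not an SDK function.
type SendEventRequest = {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
};

function buildSendEventRequest(
  apiKey: string,
  eventName: string,
  payload: Record<string, unknown>,
  baseUrl: string = "http://localhost:8080",
): SendEventRequest {
  return {
    url: `${baseUrl}/api/v1/events`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ name: eventName, payload }),
  };
}

const eventRequest = buildSendEventRequest("<YOUR_DEV_API_KEY>", "my.event", {
  message: "Hello via TypeScript!",
  shouldFail: true,
});
console.log(eventRequest.url, eventRequest.body);
```

To actually send it, pass the pieces to `fetch(eventRequest.url, { method: eventRequest.method, headers: eventRequest.headers, body: eventRequest.body })` while the development server is running.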
    

Step 3: Observe Execution and Retries

After sending the event, switch back to your Trigger.dev dashboard (http://localhost:8080).

  • Go to the “Runs” section on the left sidebar. You should see a new run entry for “My First Workflow”.
  • Click on the run ID to see its detailed execution timeline.
  • You’ll observe the process-incoming-data task initially failing (its status will briefly show “Retrying”), then pausing for 1 second, and finally succeeding on the second attempt.
  • After process-incoming-data successfully completes, the send-confirmation task will then execute successfully.
  • Check your development server’s console output (where npm run dev is running). You’ll see the io.logger.info messages and, critically, the io.logger.warn message during the simulated failure, followed by the successful log on the retry.

This hands-on experience clearly demonstrates:

  • How an event (my.event) kicks off a workflow.
  • How tasks (process-incoming-data, send-confirmation) encapsulate durable units of work.
  • How Trigger.dev automatically retries failed tasks, making your workflow resilient to transient errors.

Mini-Challenge: Enhancing Your Workflow

Now it’s your turn to make a small modification! This challenge will reinforce your understanding of event payloads and conditional logic within tasks.

Challenge: Modify the my-first-workflow.ts file.

  1. Add a new property to the my.event schema called priority (type string, can be “high” or “low”). Make it a required property.
  2. Update the run function to check the priority from the incoming payload.
  3. If priority is “high”, make the process-incoming-data task simulate a failure twice (meaning it should succeed on the third attempt).
  4. If priority is “low”, the process-incoming-data task should never fail, regardless of the shouldFail flag.

Hint:

  • Remember to update the schema in client.on() to include priority.
  • Inside io.runTask, you can use taskPayload.priority and ctx.attempt.number to control the simulated failure logic.
  • Test with new payloads from the dashboard:
    • High priority, fails twice: {"message": "High priority data", "shouldFail": true, "priority": "high"}
    • Low priority, never fails: {"message": "Low priority data", "shouldFail": false, "priority": "low"}
    • Low priority, never fails (even if shouldFail is true): {"message": "Another low priority", "shouldFail": true, "priority": "low"}

What to observe/learn: This challenge reinforces how to use event payload data to dynamically alter workflow behavior and further demonstrates the power of ctx.attempt.number in managing retry logic for different scenarios. You’ll also practice modifying your event schema and observing its impact on workflow execution.
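
If you get stuck, one possible shape for the failure rule is shown below as a pure function, which lets you check the logic without running a workflow. It assumes that shouldFail still gates the high-priority failure, which the challenge leaves open; `ChallengePayload` and `shouldSimulateFailure` are hypothetical names.

```typescript
// One possible failure rule for the challenge. Assumption: high-priority
// events fail only when shouldFail is also true.
type ChallengePayload = {
  message: string;
  shouldFail: boolean;
  priority: "high" | "low";
};

function shouldSimulateFailure(payload: ChallengePayload, attemptNumber: number): boolean {
  if (payload.priority === "low") return false; // low priority never fails
  return payload.shouldFail && attemptNumber <= 2; // high priority fails on attempts 1 and 2
}

const highPayload: ChallengePayload = { message: "High priority data", shouldFail: true, priority: "high" };
const lowPayload: ChallengePayload = { message: "Low priority data", shouldFail: true, priority: "low" };

const highTrace = [1, 2, 3].map((attempt) => shouldSimulateFailure(highPayload, attempt));
const lowTrace = [1, 2, 3].map((attempt) => shouldSimulateFailure(lowPayload, attempt));
console.log(highTrace, lowTrace); // [true, true, false] [false, false, false]
```

Inside the task, the existing condition would then become a call such as `shouldSimulateFailure(taskPayload, ctx.attempt.number)`. With three attempts allowed, the high-priority case succeeds on its final attempt.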

Common Pitfalls & Troubleshooting Basic Workflows

Even with simple workflows, a few common issues can arise. Knowing how to spot and fix them will save you significant time and frustration.

  • Mismatched Event Payloads:

    • Pitfall: You send an event payload that doesn’t match the schema defined in client.on(). This can lead to validation errors, prevent your workflow from triggering, or result in undefined values in your payload within the run function.
    • Troubleshooting:
      • Always check the “Events” tab in the Trigger.dev dashboard for validation errors. The error message will often tell you precisely which property is missing, has the wrong type, or has an unexpected format.
      • Adjust your event schema definition in client.on() or modify the payload you are sending to match.
      • 🧠 Important: Leverage TypeScript’s type inference. When you define your schema, Trigger.dev’s SDK will automatically provide accurate types for your payload in the run function. This catches many potential issues at compile time before your code even runs.
  • Too Many Retries (or Too Few):

    • Pitfall: If maxAttempts is set too high, a task facing a persistent error (e.g., an incorrect API key that will never work) burns through a long series of pointless retries before it finally fails. Conversely, if maxAttempts is too low, a task can fail permanently on an error that was truly transient.
    • Troubleshooting:
      • Identify Error Type: Determine if the error is truly transient (e.g., network timeout, rate limit) or persistent (e.g., invalid credentials, logic bug). For persistent errors, retries won’t help; they require a code change or configuration fix.
      • Review Configuration: Check the maxAttempts and retryDelayInMs configurations for your io.runTask calls. Tune these based on the expected nature of the external service.
      • Dashboard Insights: Use the run details in the Trigger.dev dashboard to see the retry count and the specific error messages from each attempt. This often reveals if the error is changing (transient) or staying the same (persistent).
    • 🔥 Optimization / Pro tip: For critical tasks interacting with external APIs, consider implementing a circuit breaker pattern in addition to retries. This can prevent your system from repeatedly hammering a failing external service, giving it time to recover.
  • Local Development Quirks:

    • Pitfall: Your workflow isn’t being triggered, or recent code changes aren’t reflected in your running application.
    • Troubleshooting:
      • npm run dev Status: Ensure your npm run dev process is still running in your terminal. If it crashed, restart it.
      • Terminal Logs: Check the terminal where npm run dev is running for any errors or messages about job registration. Trigger.dev should log when it successfully registers your my-first-workflow.ts file.
      • Environment Variables: Verify that your TRIGGER_API_KEY and TRIGGER_PUBLIC_KEY in your .env file are correct and haven’t been accidentally changed. These are essential for your local client to connect to the Trigger.dev dev server.
      • Restart: Sometimes, simply stopping (Ctrl+C) and restarting npm run dev can resolve issues with file change detection or stale configurations.
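
The circuit breaker mentioned in the pro tip above can be sketched as a small state machine. This is a generic illustration of the pattern, not a Trigger.dev feature: the thresholds are arbitrary, the clock is injected so the example is deterministic, and `CircuitBreaker` is a hypothetical class.

```typescript
// Minimal circuit breaker sketch with an injected clock.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, cooldownMs: number, now: () => number) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  currentState(): BreakerState {
    // After the cooldown, allow a single trial call.
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open";
    }
    return this.state;
  }

  call<T>(operation: () => T): T {
    if (this.currentState() === "open") {
      throw new Error("circuit open: call skipped");
    }
    try {
      const result = operation();
      this.failures = 0;
      this.state = "closed";
      return result;
    } catch (err) {
      this.failures += 1;
      // A failed trial call, or too many failures in a row, opens the circuit.
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}

let clock = 0;
const breaker = new CircuitBreaker(2, 1000, () => clock);
const failing = (): string => {
  throw new Error("service down");
};
const messages: string[] = [];

for (let i = 0; i < 3; i++) {
  try {
    breaker.call(failing);
  } catch (err) {
    messages.push(err instanceof Error ? err.message : String(err));
  }
}
clock = 1500; // cooldown elapsed: one trial call is allowed
const trialResult = breaker.call(() => "ok");
console.log(messages, trialResult, breaker.currentState());
```

The third call never reaches the failing service because the circuit opened after two failures. Once the cooldown passes, a single successful trial call closes the circuit again.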

Summary: Your Workflow Foundation

Congratulations! You’ve just built and observed your first resilient workflow with Trigger.dev. We covered three crucial concepts that form the very foundation of event-driven, durable systems:

  • Events act as the triggers, initiating your workflows based on external or internal signals, enabling loose coupling.
  • Tasks are the durable units of work, executed reliably by Trigger.dev, ensuring your operations complete even through transient failures by preserving state and resuming execution.
  • Retries provide built-in fault tolerance, automatically re-attempting failed tasks with intelligent backoff to recover from temporary issues, significantly improving system reliability.

Understanding these fundamentals is key to building any automation or AI agent with Trigger.dev. You now have a solid foundation for creating event-driven, robust applications that can withstand the challenges of production environments.

In the next chapter, we’ll explore more advanced workflow patterns, including how to schedule tasks for future execution and manage dependencies between different steps. Get ready to add another layer of sophistication to your Trigger.dev skills!
