Welcome back to our "Build & deploy" series! In a previous post, we created and deployed two pipelines and integrated them into a workflow in Apache Hop using the Hop GUI. Now, it’s time to take things up a notch by running those files from the command line using Hop Run.
In this post, we'll walk through how to run your pipelines and workflows for the sample project "my-hop-project" using Hop Run, instead of relying on the Hop GUI. This method is particularly useful for automating your processes, integrating with CI/CD pipelines, or running jobs in headless environments.
Why use Hop Run?
Hop Run is a command-line tool that allows you to execute workflows and pipelines without launching the Hop GUI. This can offer several advantages:
- Automation: Easily integrate into scripts or CI/CD pipelines.
- Performance: The GUI is never started, which saves memory and startup time.
- Flexibility: You can run workflows on remote servers or headless machines.
Pre-requisites
Before we begin, ensure you have the following:
- Apache Hop installed: Whether you’re on Windows, macOS, or Linux, you should have a functioning installation of Apache Hop.
- Pipelines and workflows ready: In our earlier post, we created two pipelines and a workflow in the project my-hop-project. Make sure they are saved in your project directory.
Now, let’s go through the steps of running our pipelines and workflows using Hop Run.
Step 1: Locating the Hop Run script
Depending on your operating system, the Hop Run script is available in two different formats:
- Windows: hop-run.bat
- macOS/Linux: hop-run.sh
These scripts are in the root directory of your Apache Hop installation. Navigate to that directory before running the commands in this post:
Windows
cd C:\path\to\your\apache-hop-directory
macOS/Linux
cd /path/to/your/apache-hop-directory
Replace C:\path\to\your\apache-hop-directory or /path/to/your/apache-hop-directory with the actual path where Apache Hop is installed on your system.
Step 2: Running a pipeline with Hop Run
Let’s start by running a pipeline we created in our my-hop-project, which is responsible for extracting and transforming the data.
Here's how you can do that from the command line:
Windows
hop-run.bat -j my-hop-project -f C:/path/to/my-hop-project/code/clean_transform.hpl -r local
macOS/Linux
./hop-run.sh -j my-hop-project \
-f /path/to/my-hop-project/code/clean_transform.hpl \
-r local
Explanation:
- -j my-hop-project: Specifies the project within which the pipeline exists.
- -f <path>: Points to the .hpl file for the pipeline.
- -r local: Tells Hop Run to use the local run configuration.
After executing the command, Hop Run will start the pipeline, extract and transform your data, and output the results just like in the Hop GUI.
Step 3: Running a workflow with Hop Run
We can also run the pipelines together by executing a workflow. Here we run the workflow we created in "Build & deploy 2: Develop your first pipelines in a workflow".
Windows
hop-run.bat -j my-hop-project -f C:/path/to/my-hop-project/code/flights-processing.hwf -r local
macOS/Linux
./hop-run.sh -j my-hop-project \
-f /path/to/my-hop-project/code/flights-processing.hwf \
-r local
Explanation:
- -j my-hop-project: Specifies the project within which the workflow exists.
- -f <path>: Specifies the path to the workflow file .hwf.
- -r local: Tells Hop Run to use the local run configuration.
The workflow will execute the two pipelines in sequence, ensuring that the data is first extracted and transformed before being loaded.
Just like with the pipelines, you’ll see logs for each step, including pipeline executions, and you’ll be informed if any errors occur.
Remarks
Handling errors and exit codes
When running pipelines and workflows via Hop Run, it’s important to handle errors appropriately. Hop Run provides exit codes to indicate success or failure:
- Exit Code 0: Everything worked flawlessly.
- Exit Code 1: An error occurred during execution.
- Exit Code 2: A general error was encountered.
- Exit Code 9: There was an error parsing the provided parameters.
You can use these exit codes to trigger notifications or automated responses in scripts or deployment pipelines.
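As a minimal sketch of how a script might react to these codes, the helper below turns an exit code into a short message. The function name describe_exit_code is hypothetical, and the meanings are copied from the list above rather than queried from Hop Run itself.

```shell
#!/bin/bash
# Hypothetical helper: translate a Hop Run exit code into a short message.
# The code-to-meaning mapping mirrors the list in this post.
describe_exit_code() {
  case "$1" in
    0) echo "success" ;;
    1) echo "execution error" ;;
    2) echo "general error" ;;
    9) echo "parameter parsing error" ;;
    *) echo "unexpected exit code $1" ;;
  esac
}

# Example usage (commented out so the snippet has no side effects when sourced):
# ./hop-run.sh -j my-hop-project -f /path/to/my-hop-project/code/flights-processing.hwf -r local
# describe_exit_code "$?"
```

Keeping the mapping in one function means a notification step or a CI job can reuse the same wording everywhere.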
Log level
When running pipelines or workflows with hop-run, you can set the log level with the -l or --level option. The available levels are NOTHING, ERROR, MINIMAL, BASIC, DETAILED, DEBUG, and ROWLEVEL, in increasing order of detail.
-l, --level=<level> The debug level, one of NOTHING, ERROR, MINIMAL,
BASIC, DETAILED, DEBUG, ROWLEVEL
For instance, setting it to ERROR will log only errors, while DEBUG or ROWLEVEL provides more granular information about the execution, which can be useful for troubleshooting or in-depth analysis.
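The following sketch shows one way to pass the log level from a script, assuming the sample workflow path used earlier in this post. The function names, the fallback to BASIC when no level is given, and the return code 64 for an unknown level are choices made for this example, not Hop Run behavior.

```shell
#!/bin/bash
# Return 0 if the argument is one of the log levels listed in the help text.
is_valid_log_level() {
  case "$1" in
    NOTHING|ERROR|MINIMAL|BASIC|DETAILED|DEBUG|ROWLEVEL) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: run the sample workflow at the requested log level.
run_workflow_with_level() {
  local level="${1:-BASIC}"   # fallback chosen for this sketch
  if ! is_valid_log_level "$level"; then
    echo "Unknown log level: $level" >&2
    return 64                 # script-level usage error, not a Hop Run exit code
  fi
  ./hop-run.sh -j my-hop-project \
    -f /path/to/my-hop-project/code/flights-processing.hwf \
    -r local \
    -l "$level"
}

# Example usage:
# run_workflow_with_level DEBUG
```

Validating the level before calling hop-run.sh gives a clear message for a typo such as "debug" in lowercase, instead of relying on the parameter parser to reject it.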
Example use case: automating with a shell script
To fully automate the process, you can create a shell script that runs the workflow, monitors the exit code, and sends a notification or logs the result. Here's an example for Linux/macOS:
#!/bin/bash
# Run the workflow (the backslash continues the command on the next line)
./hop-run.sh -j my-hop-project \
  -f /path/to/my-hop-project/code/flights-processing.hwf -r local
# Check the exit code of hop-run.sh
if [ $? -eq 0 ]; then
  echo "Workflow executed successfully!"
else
  echo "Workflow execution failed. Check the logs for details."
fi
This script runs the workflow and checks the exit code to determine if it completed successfully. You could expand this with logging, email notifications, or integrations with CI/CD pipelines.
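One possible extension, sketched below, keeps the Hop Run output in a timestamped log file and hands the exit code back to the caller so a CI job can fail when the workflow fails. The HOP_RUN and LOG_DIR variables and the run_and_log function are hypothetical names introduced for this example, not Hop settings.

```shell
#!/bin/bash
# Illustrative settings (not Hop options): the script to call and where to keep logs.
HOP_RUN="${HOP_RUN:-./hop-run.sh}"
LOG_DIR="${LOG_DIR:-./logs}"

# Hypothetical wrapper: run a pipeline or workflow file, save all output to a
# timestamped log file, and return Hop Run's exit code unchanged.
run_and_log() {
  local hop_file="$1"
  local log_file
  mkdir -p "$LOG_DIR"
  log_file="$LOG_DIR/hop-run-$(date +%Y%m%d-%H%M%S).log"

  "$HOP_RUN" -j my-hop-project -f "$hop_file" -r local >"$log_file" 2>&1
  local rc=$?

  if [ "$rc" -eq 0 ]; then
    echo "OK: $hop_file (log: $log_file)"
  else
    echo "FAILED with exit code $rc: $hop_file (log: $log_file)" >&2
  fi
  return "$rc"
}

# Example usage:
# run_and_log /path/to/my-hop-project/code/flights-processing.hwf
```

Because the function returns the original exit code, placing the call as the last command of a CI step is enough for the step to be marked as failed when Hop Run reports an error.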
Conclusion
Running your Apache Hop pipelines and workflows via Hop Run opens up a world of automation and flexibility. Whether you're integrating with a CI/CD pipeline or running data processes on a remote server, Hop Run simplifies the execution of complex workflows without relying on the Hop GUI.
Don't miss the video below for a step-by-step walkthrough of the entire process!
Stay connected
If you have any questions or run into issues, contact us and we’ll be happy to help.
Build & deploy 6: Running Apache Hop pipelines and workflows using Hop Run