About PX

What if the gap between "it works on my laptop" and "it's running on a freshly-built cloud cluster" was just a single command?

What if you could turn your working single-threaded and single-process code into a staging or production cluster -- with thousands of resilient processes -- within seconds?

What if you could focus on working code and let something else worry about cluster management and cluster runtime visibility?

What if you could do this all regardless of which cloud you run in (e.g. AWS, GCP, DigitalOcean, Azure) or which programming language you use (e.g. Python, Ruby, JavaScript, Go, Rust, Zig, Clojure, etc.)?

Wouldn't that make you feel empowered and in control as a backend developer?

Enter px.

Run px cluster up to get your cluster

At the center of the PX philosophy is the px command-line interface (CLI).

With a simple configuration file, px.yaml, describing your basic server requirements (e.g. number of nodes, number of CPU cores, amount of disk, amount of memory), you just run:

```bash
px cluster up CLUSTER_NAME
```

And you'll get a cloud cluster. So if we want a cluster for bulk-processing JPG images, we can just name it "jpg" like so:

```bash
px cluster up jpg
```

This will automatically find you the cheapest available compute from your cloud provider that meets your server requirements. Within seconds, your cluster will be up and ready to run your code.
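
To make the px.yaml mentioned above concrete, here is a minimal sketch of what such a file might look like. The key names below are illustrative assumptions, not the actual px.yaml schema.

```yaml
# Hypothetical px.yaml -- key names are illustrative assumptions,
# not the documented schema.
cluster:
  nodes: 8           # number of VMs in the cluster
  cpu_cores: 16      # cores per node
  memory_gb: 64      # RAM per node
  disk_gb: 500       # local disk per node
storage:
  mount: /images     # assumed: cloud storage mounted on every node
```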

Now, just use px run to execute your code

The px job runner takes inspiration from GNU parallel, a command-line tool that lets you take a working single-core program and run it across the multiple cores of your local machine with no code modifications. It does this by spawning multiple copies of your program, automatically partitioning the inputs, and automatically multiplexing the outputs.
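
As a point of reference, here is how that looks with GNU parallel on a single machine, using the jpg.py script introduced below (the file layout is just an example):

```bash
# Run one copy of jpg.py per input file, spread across all local CPU cores.
# GNU parallel partitions the argument list and multiplexes the output.
parallel python jpg.py ::: data/*.jpg
```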

The px CLI extends this simple parallelism approach to the cloud! It allows you to achieve multi-node parallelism and quick deployments of your working code: all in your cloud environment, on plain Linux VMs you fully control, with no code changes and no support tickets to an infrastructure team.

Let's say you have a program called jpg.py that converts JPG files to another file format. You would run it locally as follows:

```bash
python jpg.py data/input.jpg
```
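
The contents of jpg.py don't matter to px; as a stand-in, a minimal version that converts each JPG to PNG might look like the sketch below (Pillow and the PNG target are assumptions made purely for illustration):

```python
# jpg.py -- minimal illustrative sketch: convert one JPG to PNG.
# Pillow (pip install Pillow) is an assumed dependency for this example.
import sys
from pathlib import Path

from PIL import Image


def main() -> None:
    src = Path(sys.argv[1])          # input path, e.g. data/input.jpg
    dst = src.with_suffix(".png")    # write the output next to the input
    Image.open(src).save(dst, format="PNG")
    print(f"converted {src} -> {dst}")


if __name__ == "__main__":
    main()
```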

Now let's say you want to run that same program across hundreds of thousands of JPGs stored in the cloud, and you want to use the cluster you built above with px cluster up jpg to do it all in parallel. Simply run your Python program on that cluster like so:

```bash
px run --cluster jpg -a images.txt 'python jpg.py'
```

And you're done.

px automatically partitions the images listed in images.txt and feeds them to the processes spawned on each server across your cluster. The paths in images.txt are normal Linux filesystem paths, accessing a /images mountpoint that px has wired into your nodes from your cloud provider's cloud storage. That mount is also configured in px.yaml. Your program's file output goes to the same place.
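
For example, images.txt might simply list one file per line under that mountpoint (the paths here are made up):

```
/images/batch-001/photo-0001.jpg
/images/batch-001/photo-0002.jpg
/images/batch-207/photo-9731.jpg
```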

Suddenly, your simple single-threaded and single-process JPG conversion Python program can be scaled to handle thousands or millions of JPG images across any number of cloud nodes that px provisions and manages for you.

And you can monitor the job output just as you would monitor a local process: via streaming output at your terminal.

But there's more magic to it than that. The goal of px isn't just to make parallelism in the cloud easy. The goal of px is to make the entire cloud development experience feel as simple, straightforward, and joyful as local development.

See the quick start guide for more details.

Visit https://px.app to see your code running

At the center of the PX debugging experience is the https://px.app domain, which gives you an instant web-based debugging dashboard for your cloud parallel jobs.

When you run the above:

```bash
px run --cluster jpg -a images.txt 'python jpg.py'
```

px will also print a URL, provided you're logged into the PX Dashboard:

```
<px> Monitor your PX job at:
<px> https://px.app/dash/job/wX47eT
```

In other words, every running px job gets a corresponding securely provisioned, private job dashboard at https://px.app.

Within that dashboard, you can see key metrics about your job:

  • stdin, stdout, stderr
  • lines, bytes, and files used in I/O
  • CPU, disk, memory, and GPU performance
  • logs and stacktraces
  • cloud costs

What's more, you'll be able to jump right into a shell and see how your job is doing with familiar tools like htop.

And you'll be able to share live execution details about your job with your team.

Just as you use git and https://github.com to manage your code at development time, you use px and https://px.app to manage your code at cloud runtime.

No longer is debugging a distributed cluster job harder than debugging the local version.

The audacious goal of px is to make it even easier than debugging the local version!

Just px it!

Are you tired of spending weeks getting your working code into production in the cloud? Are you sick of a million different debugging dashboards and log file locations?

Do you want to reclaim your open source expertise and bring back joy, transparency, and ease to your day-to-day development? Check out our quick start guide to give px a try.