Cloud Integration

PX relies on commodity cloud primitives that are available across all major cloud providers. The core building blocks are simple: cloud VMs and object storage.

Supported Cloud Providers

PX uses standard cloud infrastructure:

| Provider | Compute | Storage |
|---|---|---|
| Google Cloud | GCP Compute Engine | GCS (Cloud Storage) |
| Amazon | AWS EC2 | S3 |
| Microsoft | Azure VMs | Azure Blob Storage |
| DigitalOcean | Droplets | Spaces |

During the private beta, only GCP is supported. Support for AWS, Azure, and DigitalOcean is coming soon.

How Cloud Storage Works

PX mounts your object storage as a local filesystem on every VM, so your code can access cloud storage with standard file operations. No SDK is required.

Configuration Example

In your px.yaml file, you specify filesystem mounts:

yaml
/px-gcs-bucket:
  source: gs://px-gcs-bucket
  store: gcs

This tells PX that every VM should have a filesystem mount at /px-gcs-bucket sourced from the GCS bucket gs://px-gcs-bucket.
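To make the shape of a mount entry concrete, here is a minimal sketch that validates one entry after the YAML has been parsed into a dictionary. The `parse_mount` helper and the `STORE_SCHEMES` table are hypothetical names used for illustration only. They are not part of PX, and the only store/scheme pair taken from this page is `gcs` / `gs://`.

```python
from urllib.parse import urlparse

# Hypothetical mapping from a px.yaml `store` value to its URL scheme.
# Only the gcs/gs pair appears in the example above.
STORE_SCHEMES = {"gcs": "gs"}


def parse_mount(mount_point: str, spec: dict) -> dict:
    """Validate one mount entry and return its normalized parts."""
    if not mount_point.startswith("/"):
        raise ValueError(f"mount point must be an absolute path: {mount_point!r}")

    expected_scheme = STORE_SCHEMES.get(spec["store"])
    if expected_scheme is None:
        raise ValueError(f"unknown store: {spec['store']!r}")

    url = urlparse(spec["source"])
    if url.scheme != expected_scheme:
        raise ValueError(
            f"store {spec['store']!r} expects {expected_scheme}:// sources, "
            f"got {spec['source']!r}"
        )
    if not url.netloc:
        raise ValueError(f"source must name a bucket: {spec['source']!r}")

    return {
        "mount_point": mount_point,
        "store": spec["store"],
        "bucket": url.netloc,
        "prefix": url.path.lstrip("/"),
    }


# The entry from the example above, as it would look after YAML parsing.
mounts = {"/px-gcs-bucket": {"source": "gs://px-gcs-bucket", "store": "gcs"}}
parsed = [parse_mount(path, spec) for path, spec in mounts.items()]
print(parsed[0]["bucket"])  # px-gcs-bucket
```

Checking that `store` and the `source` scheme agree is a design choice of this sketch. Whether PX performs the same check is not stated here.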

FUSE Mounts

PX uses FUSE (Filesystem in Userspace) to mount object storage:

  • GCP: gcsfuse
  • AWS: goofys (planned)
  • Azure: blobfuse2 (planned)
  • DigitalOcean: goofys via S3 API (planned)
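PX performs these mounts for you. Purely as an illustration of what a gcsfuse mount amounts to, the sketch below builds the basic `gcsfuse BUCKET MOUNT_POINT` command one would run by hand. The `gcsfuse_command` helper is a hypothetical name, and any additional flags PX may pass are not documented here.

```python
import shlex


def gcsfuse_command(bucket: str, mount_point: str) -> list:
    """Build the basic `gcsfuse BUCKET MOUNT_POINT` invocation."""
    return ["gcsfuse", bucket, mount_point]


cmd = gcsfuse_command("px-gcs-bucket", "/px-gcs-bucket")
print(shlex.join(cmd))  # gcsfuse px-gcs-bucket /px-gcs-bucket
```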

Code running on the VMs needs no knowledge of the underlying blob storage: all files are accessed through standard Linux filesystem operations.
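As a sketch of what this looks like in practice, the snippet below reads and writes under a mount using only the Python standard library. It assumes the bucket is mounted at `/px-gcs-bucket` and contains an `images/` directory. The `reports/` directory and the `summarize_images` function are hypothetical.

```python
from pathlib import Path


def summarize_images(mount_root: Path) -> Path:
    """Count the files under images/ and write a small report file."""
    images = sorted(p for p in (mount_root / "images").iterdir() if p.is_file())

    # Writing is an ordinary file write; the FUSE layer turns it into an object.
    report = mount_root / "reports" / "image_count.txt"
    report.parent.mkdir(parents=True, exist_ok=True)
    report.write_text(f"{len(images)} files\n")
    return report


# Assumed mount point, taken from the configuration example on this page.
root = Path("/px-gcs-bucket")
if (root / "images").is_dir():
    print(summarize_images(root).read_text())
```

The same function runs unchanged against a local directory, which makes it easy to test off-cluster.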

Setting Up Your Data

To get your data into cloud storage, use standard command-line tools like rclone:

bash
# Assumes an rclone remote named "gcs" (type: Google Cloud Storage) created with `rclone config`
rclone sync images/ gcs:px-gcs-bucket/images/

rclone has native backends for AWS S3, Google Cloud Storage, and Azure Blob Storage, and it reaches DigitalOcean Spaces through its S3 backend, because Spaces is S3 API-compatible.
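If uploads are scripted, the command can be assembled programmatically. rclone addresses destinations as `remote:path`, where `remote` is a name you define with `rclone config`. The remote name `gcs` below is an assumption, as is the `rclone_sync_command` helper. The sketch defaults to rclone's `--dry-run` flag so nothing is transferred until you opt in.

```python
def rclone_sync_command(local_dir, remote, bucket_path, dry_run=True):
    """Build an `rclone sync` invocation targeting remote:bucket_path."""
    cmd = ["rclone", "sync", local_dir, f"{remote}:{bucket_path}"]
    if dry_run:
        # --dry-run makes rclone report what it would copy or delete.
        cmd.append("--dry-run")
    return cmd


cmd = rclone_sync_command("images/", "gcs", "px-gcs-bucket/images/")
print(" ".join(cmd))  # rclone sync images/ gcs:px-gcs-bucket/images/ --dry-run

# To execute for real (requires rclone on PATH and a configured remote):
#   import subprocess
#   subprocess.run(rclone_sync_command("images/", "gcs", "px-gcs-bucket/images/", dry_run=False), check=True)
```

Note that `rclone sync` makes the destination match the source, which includes deleting destination files that are absent locally. A dry run first is a sensible habit.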

Cluster Configuration

Your px.yaml file specifies the minimum compute requirements:

yaml
cluster:
  nodes: 2
  cpus_per_node: 4

PX will provision the necessary cloud resources in your cloud account based on this specification.
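The specification above implies a minimum of 2 × 4 = 8 CPUs across the cluster. A minimal sketch of that arithmetic follows, with basic validation. The `ClusterSpec` class is a hypothetical stand-in for the parsed `cluster` block, not a PX API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Parsed form of the `cluster` block shown above."""

    nodes: int
    cpus_per_node: int

    def __post_init__(self):
        if self.nodes < 1 or self.cpus_per_node < 1:
            raise ValueError("nodes and cpus_per_node must both be at least 1")

    @property
    def total_cpus(self) -> int:
        """Minimum CPU count across the whole cluster."""
        return self.nodes * self.cpus_per_node


spec = ClusterSpec(nodes=2, cpus_per_node=4)
print(spec.total_cpus)  # 8
```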

Why This Approach Works

Modern object storage offers:

  • Read and write throughput comparable to high-performance NAS
  • Strong consistency and durability guarantees
  • Commodity pricing
  • Cross-region availability

Combined with FUSE mounts, this creates a simple programming model: write code that reads and writes files, and PX handles the distributed infrastructure.