Cloud Integration
PX relies on commodity cloud primitives that are available across all major cloud providers. The core building blocks are simple: cloud VMs and object storage.
Supported Cloud Providers
PX maps onto each provider's standard compute and object storage services:
| Provider | Compute | Storage |
|---|---|---|
| Google Cloud | GCP Compute Engine | GCS (Cloud Storage) |
| Amazon | AWS EC2 | S3 |
| Microsoft | Azure VMs | Azure Blob Storage |
| DigitalOcean | Droplets | Spaces |
During the private beta, only GCP is supported. Support for AWS, Azure, and DigitalOcean is coming soon.
How Cloud Storage Works
PX mounts your object storage as a local filesystem on every VM. This means your code can access cloud storage using standard file operations, with no SDK required.
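As a minimal sketch of what this looks like from user code, the Python snippet below writes and reads a file with ordinary file calls. The mount path /px-gcs-bucket comes from the configuration example in this document; the results subdirectory, the file name, and the helper function names are hypothetical. The mount root is a parameter so the same code can be tried against any local directory.

```python
from pathlib import Path


def write_result(mount_root: str, name: str, text: str) -> Path:
    """Write a text file under the mounted bucket using plain file I/O."""
    out_dir = Path(mount_root) / "results"  # hypothetical subdirectory
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / name
    out_path.write_text(text, encoding="utf-8")
    return out_path


def read_result(mount_root: str, name: str) -> str:
    """Read the file back; no cloud SDK is imported anywhere."""
    return (Path(mount_root) / "results" / name).read_text(encoding="utf-8")
```

On a PX VM the call would presumably look like write_result("/px-gcs-bucket", "run1.txt", "done"), with the object landing in the bucket behind the mount.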
Configuration Example
In your px.yaml file, you specify filesystem mounts:
```yaml
/px-gcs-bucket:
  source: gs://px-gcs-bucket
  store: gcs
```

This tells PX that every VM should have a filesystem mount at /px-gcs-bucket sourced from the GCS bucket gs://px-gcs-bucket.
FUSE Mounts
PX uses FUSE (Filesystem in Userspace) to mount object storage:
- GCP: gcsfuse
- AWS: goofys (planned)
- Azure: blobfuse2 (planned)
- DigitalOcean: goofys via S3 API (planned)
Your code running on VMs is unaware of the blob storage details: all files are accessed via standard Linux filesystem operations.
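Because the bucket appears as an ordinary mount point, code can check for it with the standard library before touching any files. The sketch below is a defensive check only; whether PX already guarantees the mount is ready before user code starts is not stated here, and the function name is hypothetical.

```python
import os


def require_mount(path: str) -> None:
    """Raise if `path` is not a mount point (for example, the FUSE mount is missing)."""
    if not os.path.ismount(path):
        raise RuntimeError(f"{path} is not mounted")
```

Calling require_mount("/px-gcs-bucket") at the start of a job turns a missing mount into an immediate error instead of silent writes to the VM's local disk.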
Setting Up Your Data
To get your data into cloud storage, use standard command-line tools like rclone:
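```bash
rclone sync images/ gcs:px-gcs-bucket/images/
```

Here gcs is the name of an rclone remote that you have configured for Google Cloud Storage (for example with rclone config); rclone addresses destinations as remote:path rather than as gs:// URLs. rclone supports AWS, GCP, and Azure natively, and DigitalOcean via its S3 backend, since DO Spaces is S3 API-compatible.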
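If uploads are scripted, the rclone command line can be assembled in code. The helper below is a sketch under assumptions: rclone addresses destinations as remote:path, the remote name gcs is a hypothetical name you would have created with rclone config, and the function name is made up for illustration. It only builds the argument list and defaults to rclone's --dry-run flag so nothing is copied until you opt in.

```python
from typing import List


def rclone_sync_cmd(local_dir: str, remote: str, bucket: str, prefix: str,
                    dry_run: bool = True) -> List[str]:
    """Build an `rclone sync` command as an argument list (does not run it)."""
    dest = f"{remote}:{bucket}/{prefix}"  # rclone's remote:path form
    cmd = ["rclone", "sync", local_dir, dest]
    if dry_run:
        cmd.append("--dry-run")  # preview the changes without copying
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where rclone is installed and the remote is configured.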
Cluster Configuration
Your px.yaml file specifies the minimum compute requirements:
```yaml
cluster:
  nodes: 2
  cpus_per_node: 4
```

PX will provision the necessary cloud resources in your cloud account based on this specification.
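To make the arithmetic of this specification concrete, the sketch below models the two fields shown above and computes the minimum total CPU count they imply. The class name and the validation rule are hypothetical and are not part of PX; only nodes and cpus_per_node come from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Illustrative model of the `cluster` section of px.yaml."""
    nodes: int
    cpus_per_node: int

    def __post_init__(self) -> None:
        # Assumed rule: both values must be positive integers.
        if self.nodes < 1 or self.cpus_per_node < 1:
            raise ValueError("nodes and cpus_per_node must be positive")

    @property
    def min_total_cpus(self) -> int:
        """Minimum CPUs across the whole cluster."""
        return self.nodes * self.cpus_per_node
```

For the configuration above, ClusterSpec(nodes=2, cpus_per_node=4).min_total_cpus evaluates to 8.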
Why This Approach Works
Modern object storage offers:
- Access speeds comparable to high-performance NAS
- Strong consistency and durability guarantees
- Commodity pricing
- Cross-region availability
Combined with FUSE mounts, this creates a simple programming model: write code that reads and writes files, and PX handles the distributed infrastructure.