D4L  ·  By Deasil Works, Inc.  ·  6 U.S. facilities  ·  Bare metal · K8s

Open-source-based private cloud. Operated by us. Owned by you.

D4L is a private cloud built entirely on permissive and weak-copyleft open source: Apache NiFi, Trino, Postgres, Cassandra, OpenSearch, Kafka, Iceberg, Ceph, Superset, DataHub, plus the broader catalog further down. We operate it on bare metal Deasil owns across 6 U.S. facilities. You get the same operational ease you bought Snowflake or Tableau for, at a fraction of the price, with no per-seat tax and no SaaS vendor between you and your data. The pipelines, dashboards, and SQL belong to you. Moving D4L off our iron is a configuration change, not a migration project.

~⅓ the cost of equivalent SaaS
$0 per-seat, per-query, per-GB fees
Operating OSS data systems since 1996
D4L  ·  Platform components
18 services · Sample Deployment
Apache NiFi · ETL · REST · 8443
Trino · FED. QUERY · JDBC · 8080
Apache Superset · BI · HTTPS · 443
JupyterHub · NOTEBOOKS · HTTPS · 443
PostgreSQL · RDBMS · SQL · 5432
Apache Cassandra · KV · CQL · 9042
OpenSearch · OLAP · REST · 9200
Apache Hive · WAREHOUSE · JDBC · 10000
Ceph Object Gateway · S3 · S3 · 7480
DataHub · CATALOG · GMS · 8080
Keycloak · IAM / SSO · OIDC · 8443
Apache Kafka · EVENTS · KAFKA · 9092
Apache Airflow · WORKFLOWS · HTTPS · 8080
OpenSearch Dashboards · DASHBOARDS · HTTPS · 5601
Prometheus · METRICS · HTTPS · 9090
Apache Iceberg · TABLES · CATALOG · REST
pgvector · VECTORS · SQL · 5432
Kubernetes · ORCH · API · 6443
02 · The Illusion

Black-box vendors sell stability. They deliver opacity.

Closed-source data products market on the promise of stability. What you actually buy is a vendor that hides its failures, deprecations, support backlog, and acquisition risk behind a status page. Deasil has been on the other side of those tickets for more than 25 years. We know what the box hides because we run the equivalent OSS stack ourselves, in the open, on-call.

The OSS projects in the canonical D4L stack run in production at companies whose engineering scale dwarfs any SaaS vendor's. Postgres runs Bloomberg, Reddit, Apple, and the largest Atlassian Jira tenant. Apache Cassandra runs Apple iCloud, Netflix, and Discord. Trino is what Netflix and Bloomberg query Iceberg with. OpenSearch is what AWS itself runs. Apache Kafka is what LinkedIn, Walmart, and Uber move events on. The notion that these projects need a SaaS vendor wrapped around them to be "production ready" is upside down: they are more battle-tested than the proprietary layers built on top of them.

What the SaaS box actually delivers is opacity. When Snowflake has an incident you read a status page. When Tableau deprecates a feature you read a release note. When Salesforce raises prices 9% you read a press release. When Looker is acquired by Google you read a blog post. None of these are inherent to the data system. They are inherent to the relationship.

And the features SaaS markets are usually the surface. Apache Superset ships more chart types and more native database connectors than Tableau. Trino federates more sources than Snowflake natively. OpenSearch's k-NN matches Elastic Cloud at typical RAG scale. NiFi's visual provenance is richer than Fivetran's lineage. Once the operations are taken care of, the OSS stack is usually the more capable one, not the compromise.

  • "Postgres can't run my workload."
    It runs Bloomberg's, Reddit's, Apple's, and the largest Atlassian Jira tenant. Sub-millisecond p99 reads are a tuning problem, not a technology problem. We tune it.
  • "If something breaks, who do I call?"
    Us. Same on-call rotation a SaaS vendor would have, with the engineers who actually fixed it last time, not a tier-one ticket queue.
  • "The vendor handles upgrades for me."
    They handle upgrades on their schedule, at their breakage tolerance. We handle yours on yours, with the rollback the vendor wouldn't give you.
  • "What if Deasil goes out of business?"
    The platform is OSS on hardware you can walk in and visit. Worst case you take it with you. That isn't true of your Snowflake account.
03 · Translations

What you pay for, and what we'd run instead.

The cost of "convenience" is line-by-line visible. For each commercial product on the left, the OSS project D4L would run on the right, with the portability and pricing trade you get when you switch.

If you pay for · D4L runs · What changes
Tableau
Salesforce
Creator $75 · Explorer $42 · Viewer $15 per user / month + 9% list-price hike Aug 2023, +6% Aug 2025
Apache Superset
Apache 2.0

No per-seat tax, no Salesforce shareholder. Dashboards are JSON in your git, not artifacts in a vendor cloud.

Snowflake
Snowflake Inc.
Storage $23–$40 / TB / month + per-credit compute. Mid-size enterprise bills routinely $10k–$50k+ / month.
Trino + Apache Iceberg + Hive on Ceph
Apache 2.0 / LGPL

Same SQL surface, federated across Postgres / Cassandra / OpenSearch. Iceberg is the format Snowflake itself now reads.

Databricks
Databricks Inc.
DBU credits + underlying cloud-instance bill, double-billed. Typically 20–40% above Snowflake on pure SQL.
Apache Spark + JupyterHub + MLflow on K8s
Apache 2.0 / BSD

The components Databricks repackages, run directly. Notebooks live in JupyterHub. Models live in MLflow. Pipelines live in Airflow.

Fivetran
Fivetran Inc.
$500 / M-MAR (Standard) up to $1,067 / M-MAR (Business Critical), and as of 2025 billed per connector, not per account.
Apache NiFi
Apache 2.0

Visual pipelines with provenance, role-based access, and lineage. The flow definitions are files you keep (XML in older NiFi releases, JSON in current ones). Moving NiFi off D4L is moving a config.

Datadog / New Relic
Datadog Inc. / New Relic Inc.
Per-host + per-metric + per-log-GB. Bills can exceed engineering salaries once you get past one team.
Prometheus + OpenSearch + OpenSearch Dashboards
Apache 2.0

A fully Apache-2.0 observability stack: metrics, logs, dashboards. No per-host meter and no AGPL network-copyleft surprise (which is why Grafana and Loki are not in the canonical D4L pick).

AWS S3
Amazon Web Services
$0.023 / GB / month storage + $0.09 / GB egress. 37signals projects more than $10M in savings over 5 years from leaving AWS.
Ceph or SeaweedFS on owned disk
LGPL 2.1 / Apache 2.0

The S3 API, not the S3 bill. We mount disks once. Egress is a network wire, not a billable event.

Confluent Cloud
Confluent Inc.
Per-throughput, per-partition, per-connector. The Confluent platform components are no longer Apache 2.0.
Apache Kafka or Redpanda
Apache 2.0 / BSL (Redpanda)

Kafka itself is still Apache 2.0. On request we can swap to Redpanda for the same wire protocol with less ops surface, though its BSL license puts it outside the canonical permissive stack.

Elastic Cloud
Elastic N.V.
Per-node monthly plus data-transfer. The Elasticsearch license changed in 2021. AWS announced its fork within days.
OpenSearch
Apache 2.0

Same Lucene under the hood. Since 2024 OpenSearch has had its own governance, foundation, and release cadence.

Auth0 / Okta
Okta Inc.
Per-MAU pricing tiers that step-function as you grow. Enterprise SSO is gated behind a separate contract.
Keycloak
Apache 2.0

OIDC, OAuth 2.0, SAML 2.0, 2FA. Federate with whatever IdP your org already runs. No MAU meter.

Pinecone / Weaviate Cloud
Pinecone Systems / Weaviate B.V.
Per-vector + per-pod + per-throughput. The category is six vendors deep and getting more crowded, not less.
pgvector / Qdrant / Milvus
PostgreSQL / Apache 2.0

pgvector is a column type on the database you already run. No second cluster, no separate billing rail.

Astronomer / AWS MWAA
Astronomer Inc. / Amazon Web Services
Per-environment + per-worker for what is otherwise a free Apache project.
Apache Airflow on K8s
Apache 2.0

The same DAGs, the same operators, the same UI. The bill is the cluster, not the project.

HashiCorp Cloud (Terraform, Vault)
HashiCorp / IBM
Per-resource-month for state, per-secret-month for Vault. The OSS license moved to BSL in 2023.
OpenTofu + OpenBao
MPL 2.0

The Linux Foundation forks of Terraform and Vault. Same HCL, same APIs, no BSL. The drop-in is real.

If you're paying for one of these, send us your last invoice. We'll come back within a week with a hardware spec, an OSS replacement plan, and a fixed monthly number.
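
The metered line items in the table above reduce to simple arithmetic. A minimal sketch in Python, using only the Tableau and S3 list prices quoted above; the seat mix and the storage and egress volumes are hypothetical inputs, and the S3 figure ignores tiering and request charges.

```python
# List prices quoted in the table above (USD).
TABLEAU_PER_SEAT_MONTH = {"creator": 75, "explorer": 42, "viewer": 15}
S3_STORAGE_PER_GB_MONTH = 0.023
S3_EGRESS_PER_GB = 0.09


def tableau_annual_cost(seats: dict) -> int:
    """Annual per-seat spend for a given mix of seat types."""
    monthly = sum(TABLEAU_PER_SEAT_MONTH[kind] * count for kind, count in seats.items())
    return monthly * 12


def s3_monthly_cost(stored_gb: float, egress_gb: float) -> float:
    """Monthly storage plus egress at the flat rates above (no tiering)."""
    total = stored_gb * S3_STORAGE_PER_GB_MONTH + egress_gb * S3_EGRESS_PER_GB
    return round(total, 2)


if __name__ == "__main__":
    # Hypothetical team: 10 creators, 40 explorers, 200 viewers.
    print(tableau_annual_cost({"creator": 10, "explorer": 40, "viewer": 200}))
    # Hypothetical lake: 100 TB stored, 20 TB read out per month.
    print(s3_monthly_cost(100_000, 20_000))
```

Both numbers grow with headcount and data volume. Under fixed billing against owned hardware, neither input appears in the bill.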

04 · Stack

Bring your stack. Or pick from ours. Either way we operate it.

D4L runs any modern, permissively licensed OSS data application your team has standardized on, configured to your spec, on iron we own, billed at one fixed monthly number. The 18 components below are the canonical D4L stack: a Sample Deployment we ship when the customer does not have a strong preference. The Explorer that follows lets you filter the broader universe of permissively-licensed OSS data tools D4L will operate on request.

Apache NiFi

ETL · Data Pipelines

Visual, role-based pipelines with built-in lineage. Real-time and scheduled acquisition, transformation, and routing across heterogeneous systems.

REST · SiteToSite · Provenance · Apache 2.0

Trino

Distributed Query Engine

One SQL surface across PostgreSQL, Ceph (S3), OpenSearch, Cassandra, and Hive. Federated queries at petabyte scale without copying data.

JDBC · REST · SQL · Apache 2.0
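
Trino addresses every table as `catalog.schema.table`, which is what lets one statement join across systems. A minimal sketch of what a federated query could look like, built as a string in Python; the catalog names (`postgresql`, `cassandra`), schemas, tables, and columns are all hypothetical and depend on how the catalogs are configured in a given deployment.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a three-part Trino table name with each identifier double-quoted."""
    parts = (catalog, schema, table)
    return ".".join('"' + p.replace('"', '""') + '"' for p in parts)


def federated_join_sql() -> str:
    """Build one SQL statement that joins a Postgres table to a Cassandra table."""
    # Hypothetical catalogs and tables, for illustration only.
    customers = qualified("postgresql", "public", "customers")
    clicks = qualified("cassandra", "events", "clicks")
    return (
        "SELECT c.customer_id, c.region, count(*) AS click_count\n"
        f"FROM {customers} AS c\n"
        f"JOIN {clicks} AS k ON k.customer_id = c.customer_id\n"
        "GROUP BY c.customer_id, c.region"
    )


if __name__ == "__main__":
    print(federated_join_sql())
```

The resulting statement can be submitted through any Trino client (JDBC, REST, or CLI); no data is copied between the two stores beforehand.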

Apache Superset

BI · Visualization

A web-native, open-source replacement for Tableau and Looker. Charts, dashboards, geospatial. Backed by Trino and PostgreSQL.

HTTPS · SSO · Apache 2.0

JupyterHub

Notebooks · DS · ML

Per-user JupyterLab environments with Python 3, R, Julia, Octave, Bash kernels and the standard data-science stack pre-installed.

HTTPS · SSO · BSD-3

PostgreSQL

RDBMS · Warehouse

Twenty-plus years of community development. Full SQL, JSONB, extensions. The reliable spine of nearly every D4L deployment.

SQL · JSONB · PG_REST · PostgreSQL

Apache Cassandra

Wide-column · KV

Linear scalability, fault-tolerance proven on commodity hardware. The wide-column store under mission-critical write paths.

CQL · gRPC · Apache 2.0

OpenSearch

Search · OLAP · Logs

Apache 2.0 search and analytics for application search, log analytics, and observability. Lucene under the hood. No licensing trapdoors.

REST · DSL · SQL · Apache 2.0

Apache Hive

Warehouse · Metastore

SQL over distributed storage with a schema-on-read mindset. The catalog and metastore Trino reads to do its job.

JDBC · HMS · Apache 2.0

Ceph Object Gateway

S3-compatible Data Lake

Petabyte-scale, S3-API object storage for structured and unstructured data. Five-terabyte object cap; horizontally scalable to near-limitless capacity.

S3 · Swift · RGW · LGPL 2.1

DataHub

Catalog · Lineage · Discovery

The metadata platform across the D4L stack. Asset discovery, column-level lineage, ownership, glossary, and data-product modelling for everything we run (NiFi, Trino, Postgres, OpenSearch, dbt, Airflow). Open-source, originally built at LinkedIn, Apache 2.0 throughout.

REST · GraphQL · Kafka · Apache 2.0

Keycloak

IAM · SSO · OIDC

OpenID Connect, OAuth 2.0, SAML 2.0 with 2FA. One identity surface across the platform. Federate with your existing IdP if you have one.

OIDC · SAML · OAuth2 · Apache 2.0
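
Because Keycloak speaks standard OpenID Connect, a client needs only the realm's discovery document to find every endpoint. A minimal sketch in Python, assuming a recent Keycloak release that serves realms under `/realms/<name>` (older releases prefixed this path with `/auth`); the hostname and realm name are hypothetical.

```python
from urllib.parse import quote


def discovery_url(base_url: str, realm: str) -> str:
    """Build the OpenID Connect discovery URL for a Keycloak realm."""
    base = base_url.rstrip("/")
    return f"{base}/realms/{quote(realm, safe='')}/.well-known/openid-configuration"


def endpoints(discovery_doc: dict) -> dict:
    """Pick the standard OIDC discovery fields a client typically needs."""
    keys = ("issuer", "authorization_endpoint", "token_endpoint", "jwks_uri")
    return {key: discovery_doc[key] for key in keys}


if __name__ == "__main__":
    # Hypothetical host and realm; 8443 is the port shown in the sample deployment.
    print(discovery_url("https://sso.example.com:8443", "d4l"))
```

Any OIDC-capable application can be pointed at that URL; nothing in the client is specific to Keycloak.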

Apache Kafka

Event Streaming · Log

Durable event log behind everything that needs an audit trail or a real-time pipeline. Confluent Cloud bills per throughput and per partition; Apache Kafka itself does not.

Kafka · Connect · Streams · Apache 2.0

Apache Airflow

Workflow Orchestration

Python-defined DAGs for ETL and scheduled jobs. Astronomer and AWS MWAA charge per-environment and per-worker for what is otherwise a free Apache project.

HTTPS · gRPC · OIDC · Apache 2.0

OpenSearch Dashboards

Dashboards · Observability UI

The dashboard layer over OpenSearch and Prometheus. The Apache-2.0 fork of Kibana, kept permissive when Elastic moved Kibana to SSPL in 2021. Replaces Grafana in the canonical D4L stack because Grafana itself moved to AGPL the same year.

HTTPS · OIDC · PromQL · Apache 2.0

Prometheus

Metrics · Time-series

Pull-based metrics collection and alerting. The standard the entire CNCF ecosystem speaks. Cloud-managed equivalents charge per-metric per-month; the protocol is the same.

HTTPS · PromQL · OTLP · Apache 2.0

Apache Iceberg

Open Table Format

The open table format that won. Natively supported by AWS Athena, Glue, and EMR. Used here as the lakehouse table layer over Ceph so Trino, Spark, and Flink read the same bytes.

REST · JDBC · S3 · Apache 2.0

pgvector

Vector Search · RAG

PostgreSQL extension for vector similarity. Replaces Pinecone-class managed vector DBs with a column type on the database you already trust. No per-vector pricing, no separate cluster to operate.

SQL · IVFFLAT · HNSW · PostgreSQL
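
"A column type on the database you already trust" is meant literally. A minimal sketch in Python that prepares the SQL a pgvector-backed table might use; the table name, column names, and the toy dimension of 3 are hypothetical, and the `%s` placeholders assume a psycopg-style driver.

```python
# Hypothetical schema: one text column plus a 3-dimensional embedding.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    body text,
    embedding vector(3)
);
CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops);
"""


def vector_literal(values) -> str:
    """Format numbers as a pgvector text literal such as [1.0,2.5,3.0]."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def nearest_neighbours(query_embedding, k: int):
    """Return (sql, params) for a k-nearest-neighbour search by L2 distance."""
    sql = "SELECT id, body FROM documents ORDER BY embedding <-> %s::vector LIMIT %s"
    return sql, (vector_literal(query_embedding), k)


if __name__ == "__main__":
    print(SCHEMA_SQL)
    print(nearest_neighbours([0.1, 0.2, 0.3], 5))
```

The `<->` operator is pgvector's L2 distance; the same table still answers ordinary SQL, so filters and joins sit next to the similarity search.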

Kubernetes

Orchestration · Networking

The substrate everything else runs on. Deployments, scaling, secrets, networking, storage classes. Your platform with grown-up controls.

CRI · CNI · CSI · Apache 2.0
05 · Explorer

137 permissively-licensed OSS data tools. Filter, sort, pick.

Curated from the GitHub universe at >5,000 stars and a permissive or weak-copyleft license (Apache 2.0, MIT, BSD, MPL 2.0, LGPL 2.1, ISC, PostgreSQL). No SSPL, BUSL, RSAL, or AGPL. D4L will configure and operate any of them, on the same iron, under the same fixed monthly bill. Bring this list to a vendor and ask them to do the same.
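
The selection rule stated above is mechanical enough to express in a few lines. A minimal sketch in Python, assuming each project is a record with a license string and a star count; the record shape and the sample entries are hypothetical placeholders, not the Explorer's actual data.

```python
# Licenses the Explorer accepts, as listed above.
ALLOWED_LICENSES = {"Apache 2.0", "MIT", "BSD", "MPL 2.0", "LGPL 2.1", "ISC", "PostgreSQL"}
MIN_STARS = 5000  # the threshold is ">5,000 stars"


def eligible(project: dict) -> bool:
    """True when a project clears both the license and the star threshold."""
    return project["license"] in ALLOWED_LICENSES and project["stars"] > MIN_STARS


def curate(projects: list) -> list:
    """Keep eligible projects, most-starred first."""
    kept = [p for p in projects if eligible(p)]
    return sorted(kept, key=lambda p: p["stars"], reverse=True)


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    sample = [
        {"name": "project-a", "license": "Apache 2.0", "stars": 12000},
        {"name": "project-b", "license": "AGPL", "stars": 50000},
        {"name": "project-c", "license": "MIT", "stars": 4000},
    ]
    print([p["name"] for p in curate(sample)])
```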

06 · Architectures

The same parts, twelve different platforms.

The canonical components compose into very different production systems. Each diagram below is a real combination D4L customers run today.

07 · Footprint

Six facilities. One private fabric.

Three core sites and three secondary facilities, interconnected on private fibre and federated under one Kubernetes control plane. Egress is wiring, not a billable event.

D4L  ·  U.S. data center footprint
Private fibre · Kubernetes federation
  • LOS ANGELES · CA: Pacific peering · submarine cable landings
  • PASADENA · CA
  • LAS VEGAS · NV
  • PHOENIX · AZ
  • DALLAS / FORT WORTH · TX: Central hub · low-latency to both coasts
  • WASHINGTON · D.C.: Ashburn carrier-hotel access · east-coast ingress
Legend: Core facility · Edge facility · Private trunk
08 · Commitments

Three constraints. Self-imposed.

The rules D4L holds itself to so the customer doesn't have to carry the risk. Each one is the inverse of a way SaaS vendors have hurt their customers in the last decade.

C/01

We pick licenses you can leave with.

Apache 2.0, BSD, MIT, MPL 2.0, LGPL 2.1, PostgreSQL. Never SSPL, BUSL, RSAL, or AGPL. The license is what determines whether the platform you walk away with is yours to operate elsewhere. We constrain ourselves to permissive and weak-copyleft so portability is a property of the stack, not a promise from the vendor.

C/02

We do not profit from your usage.

Fixed monthly or annual billing against owned hardware. No per-seat licensing. No per-query metering. No per-GB egress. A backfill is free. The bill in March is the bill in December. We over-provision so you do not pay a tax for being efficient.

C/03

We do not invent protocols.

Every D4L surface is reachable through an industry-standard protocol: S3, JDBC, REST, OIDC, CQL, PromQL. There is no proprietary client to install. The day you decide to leave us, the platform you walk away with is recognisable to anyone who has read the Apache documentation.
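
In practice, "no proprietary client" means a connection is just a standard URL. A minimal sketch in Python that assembles a few of them from the ports listed in the sample deployment; the hostname, user, database, and catalog names are hypothetical, and the plain-HTTP scheme on the S3 endpoint is an assumption that will differ wherever TLS is terminated in front of the gateway.

```python
from urllib.parse import urlencode

HOST = "d4l.example.internal"  # hypothetical hostname


def postgres_uri(user: str, database: str, host: str = HOST, port: int = 5432) -> str:
    """Standard PostgreSQL connection URI."""
    return f"postgresql://{user}@{host}:{port}/{database}"


def trino_jdbc_url(catalog: str, host: str = HOST, port: int = 8080) -> str:
    """Trino JDBC URL pointing at one catalog."""
    return f"jdbc:trino://{host}:{port}/{catalog}"


def s3_endpoint(host: str = HOST, port: int = 7480) -> str:
    """Endpoint URL to hand to any S3 client (scheme is an assumption)."""
    return f"http://{host}:{port}"


def promql_query_url(expr: str, host: str = HOST, port: int = 9090) -> str:
    """Prometheus HTTP API instant-query URL for a PromQL expression."""
    return f"https://{host}:{port}/api/v1/query?" + urlencode({"query": expr})


if __name__ == "__main__":
    print(postgres_uri("analyst", "warehouse"))
    print(trino_jdbc_url("iceberg"))
    print(s3_endpoint())
    print(promql_query_url("up"))
```

Each of these strings works with the stock open-source client for its protocol, which is the point of the constraint.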

09 · Disclosure

We sell operations. The software belongs to its authors.

A fair question we have heard more than once: isn't D4L just reselling free software? Here is an honest accounting of what you are paying for, who actually wrote the code, and where the credit (and the money) belongs.

The honest answer is no. D4L charges for operational labour and the iron the software runs on. The OSS projects themselves are free, and they remain free for any customer who wants to take them off our hardware and run them somewhere else. What we sell is the on-call engineer at 4 a.m., the Kubernetes upgrade pathway, the Ceph rebalance under load, the lease at the data centre, the disks themselves, and more than 25 years of running these systems in production. Not the bits on disk.

Compare that to most enterprise SaaS. The vast majority of commercial data products either fully repackage open source — Confluent is Kafka, AWS RDS is Postgres / MySQL / MariaDB, Elastic Cloud is Elasticsearch, MongoDB Atlas is MongoDB, Datadog runs on a FOSS stack — or use OSS for major components, with little or no credit to the upstream projects on the marketing site. Nearly all commercial software is built on, with, or against open source: from compilers to kernels to TLS stacks to format parsers. The exception is the rare pure-proprietary green-field, and even that ships in a Linux container.

D4L does not re-brand, hide, or obfuscate the OSS we run. Every component is named on this page — NiFi, Trino, Postgres, Cassandra, OpenSearch, Kafka, Iceberg, DataHub, Keycloak, Kubernetes, plus the 130+ projects in the Explorer above. Every license is shown. Every upstream is one click away.

If you are profiting from the heavy use of any of these projects, donate to them. OSS thrives on three things: contributors, popularity, and money. D4L provides the second by name on every customer engagement. The third is yours.

Send us your last Snowflake, Tableau, or Datadog invoice. We will reply with an OSS replacement plan and a fixed monthly number.