D4L™ · By Deasil Works, Inc. · 6 U.S. facilities · Bare metal · K8s

Open-source-based private cloud. Operated by us. Owned by you.

D4L is a private cloud built entirely on permissively-licensed open source: Apache NiFi, Trino, Postgres, Cassandra, OpenSearch, Kafka, Iceberg, Ceph, Superset, DataHub, plus the broader catalog further down. We operate it on bare metal Deasil owns across 6 U.S. facilities. The same operational ease you bought Snowflake or Tableau for, at a fraction of the price, with no per-seat tax and no SaaS vendor between you and your data. The pipelines, dashboards, and SQL belong to you. Moving D4L off our iron is a configuration change, not a migration project.

~⅓ the cost of equivalent SaaS

per-seat, per-query, per-GB fees

Operating OSS data systems since 1996

D4L · Platform components

18 services · Sample Deployment

Service · Role · Protocol · Port

Apache NiFi · ETL · REST · 8443
Trino · FED. QUERY · JDBC · 8080
Apache Superset · BI · HTTPS · 443
JupyterHub · NOTEBOOKS · HTTPS · 443
PostgreSQL · RDBMS · SQL · 5432
Apache Cassandra · KV · CQL · 9042
OpenSearch · OLAP · REST · 9200
Apache Hive · WAREHOUSE · JDBC · 10000
Ceph Object Gateway · S3 · S3 · 7480
DataHub · CATALOG · GMS · 8080
Keycloak · IAM / SSO · OIDC · 8443
Apache Kafka · EVENTS · KAFKA · 9092
Apache Airflow · WORKFLOWS · HTTPS · 8080
OpenSearch Dashboards · DASHBOARDS · HTTPS · 5601
Prometheus · METRICS · HTTPS · 9090
Apache Iceberg · TABLES · CATALOG · REST
pgvector · VECTORS · SQL · 5432
Kubernetes · ORCH · API · 6443

02 · The Illusion

Black-box vendors sell stability. They deliver opacity.

Closed-source data products market on the promise of stability. What you actually buy is a vendor that hides its failures, deprecations, support backlog, and acquisition risk behind a status page. Deasil has been on the other side of those tickets for 25 years. We know what the box hides because we run the equivalent OSS stack ourselves, in the open, on-call.

The OSS projects in the canonical D4L stack run in production at companies whose engineering scale dwarfs any SaaS vendor's. Postgres runs Bloomberg, Reddit, Apple, and the largest Atlassian Jira tenant. Apache Cassandra runs Apple iCloud, Netflix, and Discord. Trino is what Netflix and Bloomberg query Iceberg with. OpenSearch is what AWS itself runs. Apache Kafka is what LinkedIn, Walmart, and Uber move events on. The notion that these projects need a SaaS vendor wrapped around them to be "production ready" is upside down: they are more battle-tested than the proprietary layers built on top of them.

What the SaaS box actually delivers is opacity. When Snowflake has an incident you read a status page. When Tableau deprecates a feature you read a release note. When Salesforce raises prices 9% you read a press release. When Looker is acquired by Google you read a blog post. None of these are inherent to the data system. They are inherent to the relationship.

And the features SaaS markets are usually the surface. Apache Superset ships more chart types and more native database connectors than Tableau. Trino federates more sources than Snowflake natively. OpenSearch's k-NN matches Elastic Cloud at typical RAG scale. NiFi's visual provenance is richer than Fivetran's lineage. Once the operations are taken care of, the OSS stack is usually the more capable one, not the compromise.

"Postgres can't run my workload."

It runs Bloomberg's, Reddit's, Apple's, and the largest Atlassian Jira tenant. Sub-millisecond p99 reads are a tuning problem, not a technology problem. We tune it.

"If something breaks, who do I call?"

Us. Same on-call rotation a SaaS vendor would have, with the engineers who actually fixed it last time, not a tier-one ticket queue.

"The vendor handles upgrades for me."

They handle upgrades on their schedule, at their breakage tolerance. We handle yours on yours, with the rollback the vendor wouldn't give you.

"What if Deasil goes out of business?"

The platform is OSS on hardware you can walk in and visit. Worst case you take it with you. That isn't true of your Snowflake account.

03 · Translations

What you pay for, and what we'd run instead.

The cost of "convenience" is line-by-line visible. For each commercial product on the left, the OSS project D4L would run on the right, with the portability and pricing trade you get when you switch.

If you pay for · D4L runs · What changes

Tableau

Salesforce

Creator $75 · Explorer $42 · Viewer $15 per user / month + 9% list-price hike Aug 2023, +6% Aug 2025

Apache Superset

Apache 2.0

No per-seat tax, no Salesforce shareholder. Dashboards are JSON in your git, not artifacts in a vendor cloud.

Snowflake

Snowflake Inc.

Storage $23–$40 / TB / month + per-credit compute. Mid-size enterprise bills routinely $10k–$50k+ / month.

Trino + Apache Iceberg + Hive on Ceph

Apache 2.0 / LGPL

Same SQL surface, federated across Postgres / Cassandra / OpenSearch. Iceberg is the format Snowflake itself now reads.

Databricks

Databricks Inc.

DBU credits + underlying cloud-instance bill, double-billed. Typically 20–40% above Snowflake on pure SQL.

Apache Spark + JupyterHub + MLflow on K8s

Apache 2.0 / BSD

The components Databricks repackages, run directly. Notebooks live in JupyterHub. Models live in MLflow. Pipelines live in Airflow.

Fivetran

Fivetran Inc.

$500 per million monthly active rows (M-MAR) on Standard, up to $1,067 / M-MAR on Business Critical, and as of 2025 billed per connector, not per account.

Apache NiFi

Apache 2.0

Visual pipelines with provenance, role-based access, and lineage. The flow definition is a file you keep (XML in NiFi 1.x, JSON in 2.x). Moving NiFi off D4L is moving a config.

Datadog / New Relic

Datadog Inc. / New Relic Inc.

Per-host + per-metric + per-log-GB. Bills can exceed engineering salaries once you get past one team.

Prometheus + OpenSearch + OpenSearch Dashboards

Apache 2.0

A fully Apache-2.0 observability stack: metrics, logs, dashboards. No per-host meter and no AGPL network-copyleft surprise (which is why Grafana and Loki are not in the canonical D4L pick).

AWS S3

Amazon Web Services

$0.023 / GB / month storage + $0.09 / GB egress. 37signals saved $10M over 5 years exiting it.

Ceph or SeaweedFS on owned disk

LGPL 2.1 / Apache 2.0

The S3 API, not the S3 bill. We mount disks once. Egress is a network wire, not a billable event.

Confluent Cloud

Confluent Inc.

Per-throughput, per-partition, per-connector. The Confluent platform components are no longer Apache 2.0.

Apache Kafka or Redpanda

Apache 2.0 / BSL (Redpanda)

Kafka itself is still Apache. We can swap to Redpanda for the same wire protocol with less ops surface.

Elastic Cloud

Elastic N.V.

Per-node monthly plus data-transfer. The Elasticsearch license changed in 2021. AWS forked it the same week.

OpenSearch

Apache 2.0

Same Lucene under the hood. As of 2024, OpenSearch has its own governance, foundation, and release cadence.

Auth0 / Okta

Okta Inc.

Per-MAU pricing tiers that step-function as you grow. Enterprise SSO is gated behind a separate contract.

Keycloak

Apache 2.0

OIDC, OAuth 2.0, SAML 2.0, 2FA. Federate with whatever IdP your org already runs. No MAU meter.

Pinecone / Weaviate Cloud

Pinecone Systems / Weaviate B.V.

Per-vector + per-pod + per-throughput. The category is six vendors deep and getting more crowded, not less.

pgvector / Qdrant / Milvus

PostgreSQL / Apache 2.0

pgvector is a column type on the database you already run. No second cluster, no separate billing rail.

Astronomer / AWS MWAA

Astronomer Inc. / Amazon Web Services

Per-environment + per-worker for what is otherwise a free Apache project.

Apache Airflow on K8s

Apache 2.0

The same DAGs, the same operators, the same UI. The bill is the cluster, not the project.

HashiCorp Cloud (Terraform, Vault)

HashiCorp / IBM

Per-resource-month for state, per-secret-month for Vault. The OSS license moved to BSL in 2023.

OpenTofu + OpenBao

MPL 2.0

The Linux Foundation forks of Terraform and Vault. Same HCL, same APIs, no BSL. The drop-in is real.

If you're paying for one of these, send us your last invoice. We'll come back within a week with a hardware spec, an OSS replacement plan, and a fixed monthly number.
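The list prices quoted in the table are enough to reproduce the arithmetic on a typical invoice. What follows is a minimal sketch, not a quote: the per-unit prices come from the Tableau and AWS S3 rows above, while the seat counts and storage volumes are hypothetical placeholders you would replace with the figures from your own bill.

```python
from decimal import Decimal

# Per-unit list prices as quoted in the table above (USD).
TABLEAU_PER_SEAT = {
    "creator": Decimal("75"),
    "explorer": Decimal("42"),
    "viewer": Decimal("15"),
}
S3_STORAGE_PER_GB = Decimal("0.023")  # per GB per month
S3_EGRESS_PER_GB = Decimal("0.09")    # per GB transferred out


def tableau_monthly(seats: dict[str, int]) -> Decimal:
    """Monthly seat cost: seat count times list price, summed over roles."""
    return sum(
        (TABLEAU_PER_SEAT[role] * count for role, count in seats.items()),
        Decimal("0"),
    )


def s3_monthly(stored_gb: int, egress_gb: int) -> Decimal:
    """Monthly S3 cost: storage plus egress, both metered per GB."""
    return stored_gb * S3_STORAGE_PER_GB + egress_gb * S3_EGRESS_PER_GB


# Hypothetical usage figures, for illustration only.
seats = {"creator": 10, "explorer": 40, "viewer": 200}
tableau = tableau_monthly(seats)
s3 = s3_monthly(stored_gb=500_000, egress_gb=50_000)

print(f"Tableau seats: ${tableau} / month, ${tableau * 12} / year")
print(f"S3 storage + egress: ${s3} / month")
```

Both functions grow linearly with usage, which is the point of the comparison: every added viewer or exported gigabyte moves the metered bill, whereas a fixed monthly number does not.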

04 · Stack

Bring your stack. Or pick from ours. Either way we operate it.

D4L runs any modern, permissively-licensed OSS data application your team has standardized on, configured to your spec, on iron we own, billed at one fixed monthly number. The 18 components below are the canonical D4L stack: a Sample Deployment we ship when the customer does not have a strong preference. The Explorer that follows lets you filter the broader universe of permissively-licensed OSS data tools D4L will operate on request.

Apache NiFi

ETL · Data Pipelines

Visual, role-based pipelines with built-in lineage. Real-time and scheduled acquisition, transformation, and routing across heterogeneous systems.

REST · SiteToSite · Provenance · Apache 2.0

Trino

Distributed Query Engine

One SQL surface across PostgreSQL, Ceph (S3), OpenSearch, Cassandra, and Hive. Federated queries at petabyte scale without copying data.

JDBC · REST · SQL · Apache 2.0

Apache Superset

BI · Visualization

A web-native, open-source replacement for Tableau and Looker. Charts, dashboards, geospatial. Backed by Trino and PostgreSQL.

HTTPS · SSO · Apache 2.0

JupyterHub

Notebooks · DS · ML

Per-user JupyterLab environments with Python 3, R, Julia, Octave, Bash kernels and the standard data-science stack pre-installed.

HTTPS · SSO · BSD-3

PostgreSQL

RDBMS · Warehouse

Twenty-plus years of community development. Full SQL, JSONB, extensions. The reliable spine of nearly every D4L deployment.

SQL · JSONB · PG_REST · PostgreSQL

Apache Cassandra

Wide-column · KV

Linear scalability, fault-tolerance proven on commodity hardware. The wide-column store under mission-critical write paths.

CQL · gRPC · Apache 2.0

OpenSearch

Search · OLAP · Logs

Apache 2.0 search and analytics for application search, log analytics, and observability. Lucene under the hood. No licensing trapdoors.

REST · DSL · SQL · Apache 2.0

Apache Hive

Warehouse · Metastore

SQL over distributed storage with a schema-on-read mindset. The catalog and metastore Trino reads to do its job.

JDBC · HMS · Apache 2.0

Ceph Object Gateway

S3-compatible Data Lake

Petabyte-scale, S3-API object storage for structured and unstructured data. Five-terabyte object cap; horizontally scalable to near-limitless capacity.

S3 · Swift · RGW · LGPL 2.1

DataHub

Catalog · Lineage · Discovery

The metadata platform across the D4L stack. Asset discovery, column-level lineage, ownership, glossary, and data-product modelling for everything we run (NiFi, Trino, Postgres, OpenSearch, dbt, Airflow). Open-source, originally built at LinkedIn, Apache 2.0 throughout.

REST · GraphQL · Kafka · Apache 2.0

Keycloak

IAM · SSO · OIDC

OpenID Connect, OAuth 2.0, SAML 2.0 with 2FA. One identity surface across the platform. Federate with your existing IdP if you have one.

OIDC · SAML · OAuth2 · Apache 2.0

Apache Kafka

Event Streaming · Log

Durable event log behind everything that needs an audit trail or a real-time pipeline. Confluent Cloud bills per throughput and per partition; Apache Kafka itself does not.

Kafka · Connect · Streams · Apache 2.0

Apache Airflow

Workflow Orchestration

Python-defined DAGs for ETL and scheduled jobs. Astronomer and AWS MWAA charge per-environment and per-worker for what is otherwise a free Apache project.

HTTPS · gRPC · OIDC · Apache 2.0

OpenSearch Dashboards

Dashboards · Observability UI

The dashboard layer over OpenSearch and Prometheus. The Apache-2.0 fork of Kibana, kept permissive when Elastic moved Kibana to SSPL in 2021. Replaces Grafana in the canonical D4L stack because Grafana itself moved to AGPL the same year.

HTTPS · OIDC · PromQL · Apache 2.0

Prometheus

Metrics · Time-series

Pull-based metrics collection and alerting. The standard the entire CNCF ecosystem speaks. Cloud-managed equivalents charge per-metric per-month; the protocol is the same.

HTTPS · PromQL · OTLP · Apache 2.0

Apache Iceberg

Open Table Format

The open table format that won. Default for AWS Athena, Glue, and EMR by 2024. Used here as the lakehouse table layer over Ceph so Trino, Spark, and Flink read the same bytes.

REST · JDBC · S3 · Apache 2.0

pgvector

Vector Search · RAG

PostgreSQL extension for vector similarity. Replaces Pinecone-class managed vector DBs with a column type on the database you already trust. No per-vector pricing, no separate cluster to operate.

SQL · IVFFLAT · HNSW · PostgreSQL

Kubernetes

Orchestration · Networking

The substrate everything else runs on. Deployments, scaling, secrets, networking, storage classes. Your platform with grown-up controls.

CRI · CNI · CSI · Apache 2.0

06 · Architectures

The same parts, twelve different platforms.

The canonical components compose into very different production systems. Each diagram below is a real combination D4L customers run today.

01 / 12

Real-time analytics

Sub-second analytics on live event streams. Snowflake replaced by Druid; Looker replaced by Superset.

flowchart LR
    A([App · IoT · Logs]) --> B[Apache NiFi]
    B --> C[Apache Kafka]
    C --> D[Apache Flink]
    D --> E[(Apache Druid)]
    E --> F[Apache Superset]

02 / 12

Lakehouse for ML

Iceberg tables on Ceph, queried by Trino, fed into JupyterHub notebooks with MLflow tracking.

flowchart LR
    A([Sources]) --> B[Apache NiFi]
    B --> C[(Ceph + Iceberg)]
    C --> D[Trino]
    D --> E[JupyterHub]
    E --> F[MLflow]

03 / 12

CDC into open lakehouse

Change-data-capture from production Postgres into an Iceberg-on-Ceph warehouse, queried with Trino.

flowchart LR
    A[(PostgreSQL)] --> B[Debezium]
    B --> C[Apache Kafka]
    C --> D[(Ceph + Iceberg)]
    D --> E[Trino]
    E --> F[Apache Superset]
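To make the CDC step concrete, here is a minimal sketch of how a consumer might replay Debezium change events into a replica table. It assumes a simplified event payload carrying only the `op`, `before`, and `after` fields of the Debezium envelope (real events also carry `source`, `ts_ms`, and usually a schema wrapper). The `apply_change` helper, the `id` key column, and the sample rows are hypothetical.

```python
from typing import Any

Row = dict[str, Any]


def apply_change(table: dict[Any, Row], event: dict[str, Any], key: str = "id") -> None:
    """Apply one simplified Debezium-style change event to an in-memory table.

    op codes: "c" = create, "r" = snapshot read, "u" = update, "d" = delete.
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        row = event["after"]
        table[row[key]] = row                    # upsert the new row image
    elif op == "d":
        table.pop(event["before"][key], None)    # delete by the old row's key
    else:
        raise ValueError(f"unsupported op: {op!r}")


# Hypothetical change stream for an orders table.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "status": "new"}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "new"}},
    {"op": "u", "before": {"id": 1, "status": "new"}, "after": {"id": 1, "status": "paid"}},
    {"op": "d", "before": {"id": 2, "status": "new"}, "after": None},
]

replica: dict[Any, Row] = {}
for event in events:
    apply_change(replica, event)

print(replica)
```

In the pipeline itself the same upsert-and-delete logic is what a sink performs against Iceberg tables; the dictionary here only stands in for the table so the ordering semantics are easy to see.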

04 / 12

Vector RAG platform

pgvector for embeddings. JupyterHub for the application surface. vLLM for inference. No Pinecone.

flowchart LR
    A([Documents]) --> B[Apache NiFi]
    B --> C[(pgvector)]
    C --> D[Trino]
    D --> E[JupyterHub]
    E --> F[vLLM]
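The retrieval step of this platform is ordinary SQL. The sketch below shows the kind of pgvector DDL and query involved, plus a pure-Python reference for the cosine distance that pgvector's `<=>` operator computes, so the ranking can be checked without a database. The `docs` table, its columns, the 3-dimensional toy embeddings, and the `%s` placeholder style are assumptions for illustration.

```python
import math

# pgvector DDL and query, shown as strings; the table and column names are hypothetical.
DDL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE docs (id bigserial PRIMARY KEY, body text, embedding vector(3));
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);
"""
QUERY = "SELECT id, body FROM docs ORDER BY embedding <=> %s::vector LIMIT %s;"


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity, the quantity the <=> operator orders by."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query: list[float], rows: list[tuple[int, list[float]]], k: int):
    """Brute-force equivalent of ORDER BY embedding <=> query LIMIT k."""
    return sorted(rows, key=lambda row: cosine_distance(query, row[1]))[:k]


# Toy embeddings; real ones would come from an embedding model.
rows = [
    (1, [1.0, 0.0, 0.0]),
    (2, [0.0, 1.0, 0.0]),
    (3, [0.9, 0.1, 0.0]),
]
top = nearest([1.0, 0.0, 0.0], rows, k=2)
print([doc_id for doc_id, _ in top])
```

The brute-force scan returns exact results; the HNSW index in the DDL trades a little recall for speed once the table is too large to scan.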

05 / 12

Permissive observability stack

OpenTelemetry into Prometheus + OpenSearch; visualised by OpenSearch Dashboards. Datadog replaced.

flowchart LR
    A([Applications]) --> B[OpenTelemetry]
    B --> C[(Prometheus)]
    B --> D[(OpenSearch)]
    C --> E[OpenSearch Dashboards]
    D --> E

06 / 12

IoT telemetry pipeline

MQTT into NiFi into Cassandra for hot writes; Trino + Superset for analyst-facing dashboards.

flowchart LR
    A([MQTT Devices]) --> B[Apache NiFi]
    B --> C[(Apache Cassandra)]
    C --> D[Trino]
    D --> E[Apache Superset]

07 / 12

Catalog and lineage layer

DataHub indexes everything we run. OpenLineage emits events from NiFi, dbt, Airflow into Marquez.

flowchart LR
    A[Apache NiFi] --> D[DataHub]
    B[dbt] --> D
    C[Trino] --> D
    D --> E[OpenLineage]
    E --> F[Marquez]

08 / 12

Streaming with quality gates

Great Expectations gates the stream before it lands in the lakehouse. Bad rows quarantined, not silently absorbed.

flowchart LR
    A([Sources]) --> B[Debezium]
    B --> C[Apache Kafka]
    C --> D[Apache Flink]
    D --> E[Great Expectations]
    E --> F[(Ceph + Iceberg)]
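The quarantine behaviour can be illustrated without any framework. The sketch below is a plain-Python stand-in for the gate, not the Great Expectations API: it runs each row through named checks and routes failures, together with the reasons, to a quarantine list instead of the lakehouse. The field names `device_id` and `temp_c` and the accepted range are hypothetical.

```python
from typing import Any, Callable

Row = dict[str, Any]
Check = Callable[[Row], bool]

# Named expectations; the fields and bounds are hypothetical examples.
CHECKS: dict[str, Check] = {
    "device_id is present": lambda r: bool(r.get("device_id")),
    "temp_c is a number in [-50, 150]": lambda r: (
        isinstance(r.get("temp_c"), (int, float)) and -50 <= r["temp_c"] <= 150
    ),
}


def gate(rows: list[Row]) -> tuple[list[Row], list[tuple[Row, list[str]]]]:
    """Split rows into accepted and quarantined, keeping the failed check names."""
    accepted: list[Row] = []
    quarantined: list[tuple[Row, list[str]]] = []
    for row in rows:
        failed = [name for name, check in CHECKS.items() if not check(row)]
        if failed:
            quarantined.append((row, failed))
        else:
            accepted.append(row)
    return accepted, quarantined


batch = [
    {"device_id": "a1", "temp_c": 21.5},
    {"device_id": "", "temp_c": 19.0},
    {"device_id": "b2", "temp_c": 900},
    {"device_id": "c3", "temp_c": None},
]
accepted, quarantined = gate(batch)
print(len(accepted), "accepted;", len(quarantined), "quarantined")
```

Keeping the failed check names next to each quarantined row is the design choice that matters: a bad row becomes a reviewable record rather than a silent drop.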

09 / 12

Federated query layer

Trino as the single SQL surface across Postgres, Cassandra, OpenSearch, and Ceph. No data movement required.

flowchart TD
    A[(PostgreSQL)] --> T[Trino]
    B[(Apache Cassandra)] --> T
    C[(OpenSearch)] --> T
    D[(Ceph + Iceberg)] --> T
    T --> E[Apache Superset]
    T --> F[JupyterHub]
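In Trino, federation comes down to naming: every table is addressed as `catalog.schema.table`, and a single statement can join across catalogs. The sketch below only builds such a statement as a string. The catalog names (`postgresql`, `cassandra`), schemas, tables, and columns are hypothetical and depend on how the catalogs are configured in a given deployment.

```python
import re

# Conservative identifier check so generated SQL never needs quoting.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def table_ref(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Trino table name: catalog.schema.table."""
    for part in (catalog, schema, table):
        if not _IDENT.match(part):
            raise ValueError(f"invalid identifier: {part!r}")
    return f"{catalog}.{schema}.{table}"


def federated_join() -> str:
    """Join a Postgres table to a Cassandra table in one statement."""
    orders = table_ref("postgresql", "public", "orders")
    events = table_ref("cassandra", "telemetry", "device_events")
    return (
        "SELECT o.customer_id, count(*) AS event_count\n"
        f"FROM {orders} AS o\n"
        f"JOIN {events} AS e ON e.customer_id = o.customer_id\n"
        "GROUP BY o.customer_id"
    )


sql = federated_join()
print(sql)
```

Superset and JupyterHub would submit this statement to Trino like any other query; neither client needs to know that the two tables live in different systems.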

10 / 12

Data-science workbench

Per-user JupyterHub fronted by Keycloak SSO. Trino for SQL. pgvector for embeddings. MLflow tracks every run.

flowchart LR
    K[Keycloak SSO] -.->|OIDC| B[JupyterHub]
    A[Apache Airflow] --> B
    B --> C[Trino]
    B --> D[(pgvector)]
    B --> E[MLflow]

11 / 12

Modern BI without Tableau

NiFi loads Postgres. dbt models. Iceberg-on-Ceph stores. Trino queries. Superset and Lightdash both consume.

flowchart LR
    A([Sources]) --> B[Apache NiFi]
    B --> C[(PostgreSQL)]
    C --> D[dbt]
    D --> E[(Ceph + Iceberg)]
    E --> F[Trino]
    F --> G[Apache Superset]
    F --> H[Lightdash]

12 / 12

Identity-fronted multi-tenant

Keycloak fronts every UI in the platform. One identity, one audit trail, one place to revoke.

flowchart LR
    K[Keycloak SSO]
    K --> A[Apache Superset]
    K --> B[JupyterHub]
    K --> C[OpenSearch Dashboards]
    K --> D[Apache NiFi]
    K --> E[DataHub]

07 · Footprint

Six facilities. One private fabric.

Three core sites and three secondary facilities, interconnected on private fibre and federated under one Kubernetes control plane. Egress is wiring, not a billable event.

Map: D4L · U.S. data center footprint (private fibre · Kubernetes federation). Legend: core facility, edge facility, private trunk.

08 · Commitments

Three constraints. Self-imposed.

The rules D4L holds itself to so the customer doesn't have to carry the risk. Each one is the inverse of a way SaaS vendors have hurt their customers in the last decade.

C/01

We pick licenses you can leave with.

Apache 2.0, BSD, MIT, MPL 2.0, LGPL 2.1, PostgreSQL. Never SSPL, BUSL, RSAL, or AGPL. The license is what determines whether the platform you walk away with is yours to operate elsewhere. We constrain ourselves to permissive and weak-copyleft so portability is a property of the stack, not a promise from the vendor.
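A license rule like this is simple enough to check mechanically. The sketch below treats the two lists in the paragraph above as an allow set and a reject set and classifies a component's license field. The `verdict` helper is hypothetical, `BSD-3` is added to the allow set as an assumed spelling of BSD, and the component licenses are the ones stated elsewhere on this page.

```python
# Allow and reject sets taken from the commitment above.
# "BSD-3" is an assumed alias for the BSD family.
ALLOWED = {"Apache 2.0", "BSD", "BSD-3", "MIT", "MPL 2.0", "LGPL 2.1", "PostgreSQL"}
NEVER = {"SSPL", "BUSL", "RSAL", "AGPL"}


def verdict(license_field: str) -> str:
    """Classify a license field such as "Apache 2.0" or "Apache 2.0 / BSD"."""
    parts = [part.strip() for part in license_field.split("/")]
    if any(part in NEVER for part in parts):
        return "rejected"
    if all(part in ALLOWED for part in parts):
        return "allowed"
    return "needs review"   # unknown license: a human decides


# Licenses as stated on this page.
components = {
    "Apache Superset": "Apache 2.0",
    "Ceph Object Gateway": "LGPL 2.1",
    "JupyterHub": "BSD-3",
    "pgvector": "PostgreSQL",
    "Grafana": "AGPL",
}
for name, license_field in components.items():
    print(f"{name}: {verdict(license_field)}")
```

Unknown licenses fall through to "needs review" rather than "allowed", so a new or relicensed project cannot enter the stack by default.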

C/02

We do not profit from your usage.

Fixed monthly or annual billing against owned hardware. No per-seat licensing. No per-query metering. No per-GB egress. A backfill is free. The bill in March is the bill in December. We over-provision so you do not pay a tax for being efficient.

C/03

We do not invent protocols.

Every D4L surface is reachable through an industry-standard protocol: S3, JDBC, REST, OIDC, CQL, PromQL. There is no proprietary client to install. The day you decide to leave us, the platform you walk away with is recognisable to anyone who has read the Apache documentation.
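One way to picture "no proprietary client" is that every surface reduces to a standard connection string. The sketch below assembles a few, using the ports from the Sample Deployment list; the host name, the database name, the URL schemes, and the choice of the `up` PromQL query are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

HOST = "d4l.example.internal"   # hypothetical host name


def endpoints(host: str) -> dict[str, str]:
    """Standard-protocol entry points, using the Sample Deployment ports."""
    promql = urlencode({"query": "up"})   # PromQL instant query via the HTTP API
    return {
        "trino_jdbc": f"jdbc:trino://{host}:8080",
        "postgres": f"postgresql://{host}:5432/analytics",   # database name is hypothetical
        "s3_endpoint": f"http://{host}:7480",                # Ceph Object Gateway
        "prometheus": f"https://{host}:9090/api/v1/query?{promql}",
    }


urls = endpoints(HOST)
for name, url in urls.items():
    print(f"{name}: {url}")
```

Any stock JDBC driver, `psql`, S3 SDK, or HTTP client can consume these as they are, and the same strings keep working if the stack is moved to other hardware and only the host changes.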

09 · Disclosure

We sell operations. The software belongs to its authors.

A fair question we have heard more than once: isn't D4L just reselling free software? What follows is an honest accounting of what you are paying for, who actually wrote the code, and where the credit (and the money) belongs.

The honest answer is no. D4L charges for operational labour and the iron the software runs on. The OSS projects themselves are free, and they remain free for any customer who wants to take them off our hardware and run them somewhere else. What we sell is the on-call engineer at 4 a.m., the Kubernetes upgrade pathway, the Ceph rebalance under load, the lease at the data centre, the disks themselves, and 25 years of running these systems in production. Not the bits on disk.

Compare that to most enterprise SaaS. The vast majority of commercial data products either fully repackage open source — Confluent is Kafka, AWS RDS is Postgres / MySQL / MariaDB, Elastic Cloud is Elasticsearch, MongoDB Atlas is MongoDB, Datadog runs on a FOSS stack — or use OSS for major components, with little or no credit to the upstream projects on the marketing site. Nearly all commercial software is built on, with, or against open source: from compilers to kernels to TLS stacks to format parsers. The exception is the rare pure-proprietary green-field, and even that ships in a Linux container.

D4L does not re-brand, hide, or obfuscate the OSS we run. Every component is named on this page — NiFi, Trino, Postgres, Cassandra, OpenSearch, Kafka, Iceberg, DataHub, Keycloak, Kubernetes, plus the 130+ projects in the Explorer above. Every license is shown. Every upstream is one click away.

If you are profiting from the heavy use of any of these projects, donate to them. OSS thrives on three things: contributors, popularity, and money. D4L provides the second by name on every customer engagement. The third is yours.

Send us your last Snowflake, Tableau, or Datadog invoice. We will reply with an OSS replacement plan and a fixed monthly number.

info@deasil.works (818) 945-0821

137 permissively-licensed OSS data tools. Filter, sort, pick.