Cohort Retention Engineer
Build cohort retention logic and churn views that survive product evolution and messy subscription edge cases.
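To make the brief concrete, here is a minimal sketch of the core computation, assuming pandas and a toy activity table with hypothetical user_id, signup_month, and active_month columns; real subscription data layers pauses, plan changes, and refunds on top of this.

```python
import pandas as pd

# Toy input: one row per user per month in which that user was active.
activity = pd.DataFrame({
    "user_id":      [1, 1, 1, 2, 2, 3],
    "signup_month": ["2024-01", "2024-01", "2024-01", "2024-01", "2024-01", "2024-02"],
    "active_month": ["2024-01", "2024-02", "2024-04", "2024-01", "2024-02", "2024-02"],
})

# Months elapsed between signup and each month of activity.
signup = pd.PeriodIndex(activity["signup_month"], freq="M")
active = pd.PeriodIndex(activity["active_month"], freq="M")
activity["month_offset"] = [(a - s).n for a, s in zip(active, signup)]

# Retention matrix: share of each signup cohort still active N months later.
cohort_size = activity.groupby("signup_month")["user_id"].nunique()
retained = (
    activity.groupby(["signup_month", "month_offset"])["user_id"]
    .nunique()
    .unstack(fill_value=0)
)
print(retained.div(cohort_size, axis=0))
```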
Collection
Analytics models, pipelines, measurement rigor, and data quality under change.
19 skills in this lane
Implements enterprise data catalogs with DataHub or Amundsen for data discovery, governance, and collaboration
Implements comprehensive data contracts with schemas, SLAs, and quality guarantees between data producers and consumers
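One way to encode the schema-and-guarantees half of such a contract is a typed model the producer validates against before publishing. This sketch uses pydantic; the field names and freshness threshold are hypothetical.

```python
from datetime import datetime, timezone
from pydantic import BaseModel, Field

class OrderEvent(BaseModel):
    """Hypothetical contract: types and ranges the producer must satisfy."""
    order_id: str = Field(min_length=1)
    user_id: int = Field(ge=1)
    amount_cents: int = Field(ge=0)   # quality guarantee: never negative
    occurred_at: datetime

def within_freshness_sla(event: OrderEvent, max_lag_seconds: int = 3600) -> bool:
    """Consumer-side SLA check: reject events older than the agreed lag."""
    lag = (datetime.now(timezone.utc) - event.occurred_at).total_seconds()
    return lag <= max_lag_seconds

# Validation fails loudly before a malformed record ever ships downstream.
event = OrderEvent.model_validate({
    "order_id": "o-1", "user_id": 42, "amount_cents": 1999,
    "occurred_at": "2024-06-01T12:00:00+00:00",
})
```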
Implements column-level data lineage tracking across the entire data pipeline for impact analysis and debugging
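At its simplest, column-level lineage is a directed graph from source columns to derived columns. A toy sketch with networkx, using made-up table and column names, shows how impact analysis and debugging fall out as graph traversals.

```python
import networkx as nx

# Edges point from an upstream column to each column derived from it.
lineage = nx.DiGraph()
lineage.add_edges_from([
    ("raw.events.user_id",    "stg.events.user_id"),
    ("stg.events.user_id",    "marts.retention.user_id"),
    ("raw.events.ts",         "stg.events.event_date"),
    ("stg.events.event_date", "marts.retention.cohort_month"),
])

def impact_of(column: str) -> set[str]:
    """Impact analysis: everything downstream that breaks if this column changes."""
    return nx.descendants(lineage, column)

def sources_of(column: str) -> set[str]:
    """Debugging: everything upstream to inspect when this column looks wrong."""
    return nx.ancestors(lineage, column)

print(impact_of("raw.events.ts"))
# {'stg.events.event_date', 'marts.retention.cohort_month'}
```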
Facilitates data mesh adoption with domain-oriented ownership, self-serve platforms, and federated governance
Implements comprehensive data pipeline monitoring, anomaly detection, and incident response for data reliability
Implements the Great Expectations data quality framework with validation, profiling, and automated quality gates
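A sketch of the in-memory validation workflow, using Great Expectations' long-standing pandas interface; newer releases reorganize these entry points, so treat the exact calls as assumptions.

```python
import great_expectations as ge
import pandas as pd

df = pd.DataFrame({
    "user_id": [1, 2, 2, 4],
    "amount":  [10.0, 25.5, 3.0, 9.99],
})

# Wrap the frame, declare expectations, then validate as a quality gate.
gdf = ge.from_pandas(df)
gdf.expect_column_values_to_not_be_null("user_id")
gdf.expect_column_values_to_be_between("amount", min_value=0, max_value=10_000)

results = gdf.validate()
if not results.success:   # gate: block the load instead of shipping bad data
    raise ValueError(f"Data quality gate failed: {results}")
```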
Instrument data pipelines with freshness, completeness, and anomaly detection checks that fail usefully.
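"Fail usefully" means the error message carries the diagnosis. A minimal sketch, assuming a pandas batch with a hypothetical event_ts column and made-up thresholds:

```python
from datetime import datetime, timedelta, timezone
import pandas as pd

def check_batch(df: pd.DataFrame, expected_rows: int, max_age: timedelta) -> None:
    # Freshness: how stale is the newest event relative to the SLA?
    newest = pd.to_datetime(df["event_ts"], utc=True).max()
    lag = datetime.now(timezone.utc) - newest
    if lag > max_age:
        raise RuntimeError(
            f"Freshness: newest event is {lag} old (SLA {max_age}); "
            f"check the upstream extractor before rerunning."
        )
    # Completeness: is this batch plausibly whole?
    ratio = len(df) / expected_rows
    if ratio < 0.9:  # hypothetical tolerance
        raise RuntimeError(
            f"Completeness: got {len(df)} rows, expected ~{expected_rows} "
            f"({ratio:.0%}); a partial extract upstream is the likely cause."
        )
```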
Shape analytics data into warehouse models that remain queryable, explainable, and maintainable as the product grows.
Designs production-grade dbt data transformation pipelines with optimal model layering, testing, and documentation
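dbt models are usually SQL, but dbt's Python-model variant keeps this sketch in one language: a staging ref feeding a mart, with layering and materialization declared in code. The model name, columns, and the to_pandas() adapter call are assumptions about the project and warehouse.

```python
# models/marts/fct_monthly_revenue.py (hypothetical mart in a dbt project)
def model(dbt, session):
    dbt.config(materialized="table")

    # Layering: read from the staging layer, never from raw sources directly.
    orders = dbt.ref("stg_orders").to_pandas()  # assumes a Snowpark-style adapter

    monthly = (
        orders.assign(order_month=orders["ordered_at"].dt.to_period("M").astype(str))
              .groupby("order_month", as_index=False)["amount"]
              .sum()
    )
    return monthly  # dbt materializes the returned frame as the model's table
```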
Design trustworthy experimentation infrastructure with sound randomization, sizing, and interpretation defaults.
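Two defaults worth baking in: deterministic unit-level randomization, and sample sizing done before launch. A sketch using hashlib and the standard two-proportion normal approximation (scipy for the z-quantiles); the experiment and metric values are illustrative.

```python
import hashlib
from scipy.stats import norm

def assign_arm(user_id: str, experiment: str, arms: int = 2) -> int:
    """Same user, same experiment, same arm: hashing beats random() for replays."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % arms

def n_per_arm(p_control: float, p_treat: float,
              alpha: float = 0.05, power: float = 0.8) -> int:
    """Two-proportion sample size via the normal approximation."""
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    var = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return int((z_a + z_b) ** 2 * var / (p_control - p_treat) ** 2) + 1

print(n_per_arm(0.10, 0.12))  # roughly 3,800 users per arm for a 2pp lift
```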
Designs production-grade feature stores with Feast or Tecton for ML feature management, serving, and monitoring
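The serving half in miniature, assuming a configured Feast repository; the feature view, feature names, and entity key are hypothetical.

```python
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # points at a configured Feast repo

# Low-latency online lookup for one entity, e.g. at model-scoring time.
features = store.get_online_features(
    features=[
        "user_activity:sessions_7d",
        "user_activity:days_since_last_order",
    ],
    entity_rows=[{"user_id": 42}],
).to_dict()
print(features)
```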
Design product event funnels that reveal meaningful drop-off and activation patterns instead of vanity charts.
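A toy version of the difference: counting distinct users through ordered steps rather than raw event volume. Assumes pandas and hypothetical event names; a production funnel would also enforce per-user step ordering and a time window.

```python
import pandas as pd

events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3],
    "event":   ["signup", "create_project", "invite_teammate",
                "signup", "create_project", "signup"],
})

steps = ["signup", "create_project", "invite_teammate"]
users_at_step = [
    events.loc[events["event"] == step, "user_id"].nunique() for step in steps
]

# Step-over-step conversion is where the drop-off story lives.
for step, n, prev in zip(steps, users_at_step, [None] + users_at_step[:-1]):
    rate = f"{n / prev:.0%} of previous step" if prev else "entry point"
    print(f"{step}: {n} users ({rate})")
```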
Designs semantic metrics layers with dbt metrics or Transform for consistent business metric definitions across tools
Implements privacy-preserving data techniques including differential privacy, k-anonymity, and data masking for GDPR/CCPA compliance
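Two of the named techniques in miniature: the Laplace mechanism for differentially private counts, and salted hashing as a simple masking primitive. The epsilon and salt here are illustrative, not recommendations.

```python
import hashlib
import numpy as np

def dp_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Laplace mechanism: one user changes a count by at most `sensitivity`,
    so noise with scale sensitivity/epsilon yields an epsilon-DP release."""
    return true_count + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

def mask(value: str, salt: str) -> str:
    """Deterministic pseudonym: joinable across tables, not reversible by eye."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:12]

print(dp_count(1_204, epsilon=0.5))   # smaller epsilon, stronger privacy, more noise
print(mask("alice@example.com", salt="per-environment-secret"))
```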
Design analytics flows that preserve useful product insight while reducing privacy and re-identification risk.
Designs high-performance real-time analytics systems using ClickHouse, Druid, and Pinot for sub-second query latency
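The shape of the workload, as a sketch against ClickHouse with the clickhouse-driver package; the events table and database name are hypothetical.

```python
from clickhouse_driver import Client

client = Client(host="localhost")

# Time-bucketed rollup: the columnar engine keeps this interactive
# even over very large event tables.
rows = client.execute("""
    SELECT toStartOfMinute(event_time) AS minute,
           count() AS events,
           uniq(user_id) AS users
    FROM analytics.events
    WHERE event_time >= now() - INTERVAL 1 HOUR
    GROUP BY minute
    ORDER BY minute
""")
for minute, n_events, n_users in rows:
    print(minute, n_events, n_users)
```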
Build sub-second analytics pipelines over streaming events without turning the system into an operational mystery.
Builds complex stream processing pipelines using ksqlDB and Flink SQL with windowing, joins, and stateful operations
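A sketch of the windowed, stateful side, submitting a ksqlDB statement over its REST API (default port 8088); the pageviews stream and table name are hypothetical.

```python
import requests

# Tumbling-window aggregation: a stateful, per-key count maintained by ksqlDB.
statement = """
    CREATE TABLE pageviews_per_minute AS
      SELECT user_id, COUNT(*) AS views
      FROM pageviews
      WINDOW TUMBLING (SIZE 1 MINUTE)
      GROUP BY user_id
      EMIT CHANGES;
"""

resp = requests.post(
    "http://localhost:8088/ksql",
    json={"ksql": statement, "streamsProperties": {}},
)
resp.raise_for_status()
print(resp.json())
```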