Hacker News
- Apache Hudi 1.0 released with secondary indexes for data lakehouses https://hudi.apache.org/blog/2024/12/16/announcing-hudi-1-0-0/ 9 comments
Linking pages
- Databases in 2024: A Year in Review // Blog // Andy Pavlo - Carnegie Mellon University https://www.cs.cmu.edu/~pavlo/blog/2025/01/2024-databases-retrospective.html 295 comments
- Amazon’s Exabyte-Scale Migration from Apache Spark to Ray on Amazon EC2 | AWS Open Source Blog https://aws.amazon.com/blogs/opensource/amazons-exabyte-scale-migration-from-apache-spark-to-ray-on-amazon-ec2/ 90 comments
- Trino | A decade of query engine innovation https://trino.io/blog/2022/08/04/decade-innovation.html 66 comments
- From Data Lake to Lakehouse | by Robert Kossendey | claimsforce https://medium.com/claimsforce/lakehouse-the-journey-unifying-data-lake-and-data-warehouse-bef7629c143a 63 comments
- Open Table Formats Are Inevitable For Analytical Datasets | Ensemble https://ensembleanalytics.io/blog/open-table-formats-inevitable 58 comments
- Zendesk Moves from DynamoDB to MySQL and S3 to Save over 80% in Costs https://www.infoq.com/news/2023/12/zendesk-dynamodb-mysql-s3-cost/ 50 comments
- Query Serving Systems http://petereliaskraft.net/blog/query-serving-systems 14 comments
- Hydrating a Data Lake using Log-based Change Data Capture (CDC) with Debezium, Apicurio, and Kafka Connect on AWS | by Gary A. Stafford | ITNEXT https://garystafford.medium.com/hydrating-a-data-lake-using-log-based-change-data-capture-cdc-with-debezium-apicurio-and-kafka-799671e0012f?amp%3Bsk=20eb3f53b3ed2a2366d6494317a1bed0&source=friends_link 7 comments
- Apache Iceberg Reduced Our Amazon S3 Cost by 90% | Insider Engineering https://medium.com/insiderengineering/apache-iceberg-reduced-our-amazon-s3-cost-by-90-997cde5ce931 5 comments
- GitHub - apache/hudi: Upserts, Deletes And Incremental Processing on Big Data. https://github.com/apache/hudi 5 comments
- Data Manageability: The revolution that is turning Data Trust into the New North Star https://lakefs.io/data-manageability/ 2 comments
- The State of Data Engineering 2023 https://lakefs.io/blog/the-state-of-data-engineering-2023/ 2 comments
- A First Look at S3 (Iceberg) Tables | Meltware https://meltware.com/2024/12/04/s3-tables.html 2 comments
- ML Infrastructure Doesn’t Have To Suck https://techblog.citystoragesystems.com/p/ml-infrastructure-doesnt-have-to 2 comments
- Announcing Icechunk! | Earthmover https://earthmover.io/blog/icechunk/ 1 comment
- Data Lake vs Data Warehouse - Blog | luminousmen https://luminousmen.com/post/data-lake-vs-data-warehouse 0 comments
- The Apache Software Foundation Announces Apache® Hudi™ as a Top-Level Project - The Apache Software Foundation Blog https://blogs.apache.org/foundation/entry/the-apache-software-foundation-announces64 0 comments
- MLOps with a Feature Store. If AI is to become embedded in the DNA… | by Jim Dowling | Towards Data Science https://towardsdatascience.com/mlops-with-a-feature-store-816cfa5966e9 0 comments
- With $8M seed, Onehouse builds open source data lakehouse, eyes managed service | TechCrunch https://techcrunch.com/2022/02/02/with-8m-seed-onehouse-builds-open-source-data-lake-house-eyes-managed-service/ 0 comments
- What I Learned From Tecton's apply() 2022 Conference — James Le https://jameskle.com/writes/tecton-apply2022 0 comments