Hacker News
- DBRX: A new open LLM https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm 343 comments
- DBRX: 135B MoE LLM trained and open sourced by Databricks https://www.databricks.com/blog/announcing-dbrx-new-standard-efficient-open-source-customizable-llms 2 comments
- LLM Training and Inference with Intel Gaudi 2 AI Accelerators https://www.databricks.com/blog/llm-training-and-inference-intel-gaudi2-ai-accelerators 8 comments
- Databricks acquires serverless Postgres vendor bit.io https://www.databricks.com/blog/welcoming-bit-io-databricks-investing-developer-experience 117 comments
- Free Dolly: First truly open instruction-tuned LLM https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm 49 comments
- Hello Dolly: Democratizing the magic of ChatGPT with open models https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html 173 comments
- Databricks Announces General Availability of Model Serving https://www.databricks.com/blog/2023/03/07/announcing-general-availability-databricks-model-serving.html 5 comments
- Speeding up LXC container pull by up to 3x https://databricks.com/blog/2022/01/26/creating-a-faster-tar-extractor.html 22 comments
- Scala at scale at Databricks https://databricks.com/blog/2021/12/03/scala-at-scale-at-databricks.html 166 comments
- Databricks Raises $1B Series G Investment at $28B Valuation https://databricks.com/company/newsroom/press-releases/databricks-raises-1-billion-series-g-investment-at-28-billion-valuation 3 comments
- Uncovering performance regressions in the TCP SACKs vulnerability fixes https://databricks.com/blog/2019/09/16/adventures-in-the-tcp-stack-performance-regressions-vulnerability-fixes.html 3 comments
- Network performance regressions from TCP SACK vulnerability fixes https://databricks.com/blog/2019/08/01/network-performance-regressions-from-tcp-sack-vulnerability-fixes.html 5 comments
- Writing a Faster Jsonnet Compiler (2018) https://databricks.com/blog/2018/10/12/writing-a-faster-jsonnet-compiler.html 8 comments
- MLflow: An Open Source Machine Learning Platform https://databricks.com/blog/2018/06/05/introducing-mlflow-an-open-source-machine-learning-platform.html 28 comments
- Spark Summit Is Becoming the Spark and AI Summit https://databricks.com/blog/2017/12/06/spark-summit-is-becoming-the-spark-ai-summit.html 2 comments
- Cost-Based Optimizer in Apache Spark 2.2 https://databricks.com/blog/2017/08/31/cost-based-optimizer-in-apache-spark-2-2.html 5 comments
- Benchmarking Big Data SQL Platforms in the Cloud https://databricks.com/blog/2017/07/12/benchmarking-big-data-sql-platforms-in-the-cloud.html 2 comments
- Debugging a failing test case caused by query running “too fast” https://databricks.com/blog/2017/02/16/processing-trillion-rows-per-second-single-machine-can-nested-loop-joins-fast.html 7 comments
- Databricks Community Edition Is Now Generally Available https://databricks.com/blog/2016/06/07/dce-ga.html 2 comments
- Spark as a Compiler: Joining a Billion Rows per Second on a Laptop https://databricks.com/blog/2016/05/23/apache-spark-as-a-compiler-joining-a-billion-rows-per-second-on-a-laptop.html 53 comments
- Spark 2.0 Technical Preview https://databricks.com/blog/2016/05/11/spark-2-0-technical-preview-easier-faster-and-smarter.html 36 comments
- The Unreasonable Effectiveness of Deep Learning on Spark https://databricks.com/blog/2016/04/01/unreasonable-effectiveness-of-deep-learning.html 14 comments
- Announcing Spark 1.6 https://databricks.com/blog/2016/01/04/announcing-spark-1-6.html 21 comments
- Project Tungsten: Bringing Spark Closer to Bare Metal https://databricks.com/blog/2015/04/28/project-tungsten-bringing-spark-closer-to-bare-metal.html 11 comments
- Introducing DataFrames in Spark for Large Scale Data Science http://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html 41 comments
- Spark Breaks Previous Large-Scale Sort Record http://databricks.com/blog/2014/10/10/spark-breaks-previous-large-scale-sort-record.html 56 comments
- Databricks introduces the World's First Truly Open Instruction-Tuned LLM https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm 4 comments programming
- Databricks announces General Availability of Model Serving https://www.databricks.com/blog/2023/03/07/announcing-general-availability-databricks-model-serving.html 4 comments machinelearningnews
- Does trunk-based development still work for mlops and data science / AI heavy teams? https://www.databricks.com/explore/data-science-machine-learning/big-book-of-MLOps#page=1 7 comments devops
- Databricks/AWS Cluster Pricing - GPU vs Memory optimized cluster https://databricks.com/product/aws-pricing/instance-types 2 comments aws
- Pyspark now provides a native Pandas API https://databricks.com/blog/2021/10/04/pandas-api-on-upcoming-apache-spark-3-2.html 50 comments python
- Scala at Scale at Databricks https://databricks.com/blog/2021/12/03/scala-at-scale-at-databricks.html 4 comments programming
- Scala at Scale at Databricks https://databricks.com/blog/2021/12/03/scala-at-scale-at-databricks.html 4 comments scala
- First ever virtual Spark+AI summit. Access to sessions, keynotes, and virtual events is free for the first time too https://databricks.com/sparkaisummit/north-america-2020 2 comments scala
- Fast Parallel Testing at Databricks with Bazel https://databricks.com/blog/2019/07/23/fast-parallel-testing-at-databricks-with-bazel.html 3 comments scala
- Speedy Scala Builds with Bazel at Databricks [Li Haoyi and Ahir Reddy] https://databricks.com/blog/2019/02/27/speedy-scala-builds-with-bazel-at-databricks.html 12 comments scala
- Benefits of distributed processing on a cloud platform https://databricks.com/ 3 comments datascience
- Introducing Click: The Command Line Interactive Controller for Kubernetes https://databricks.com/blog/2018/03/27/introducing-click-the-command-line-interactive-controller-for-kubernetes.html 14 comments rust
- Are Spark's "Encoders" fast because they are macro based serializers? https://databricks.com/blog/2016/01/04/introducing-apache-spark-datasets.html 7 comments scala
- The next release of Apache Spark (1.4) will include support for R, supporting data frames & scaling across many cores/machines. http://databricks.com/blog/2015/06/09/announcing-sparkr-r-on-spark.html 3 comments rstats
Linking pages
- Membership | BSA | The Software Alliance http://www.bsa.org/about-bsa/bsa-members 171 comments
- GitHub - tobymao/sqlglot: Python SQL Parser and Transpiler https://github.com/tobymao/sqlglot 135 comments
- Databricks Is an RDBMS | Blog | Fivetran https://fivetran.com/blog/databricks-is-an-rdbms 89 comments
- dolly/data at master · databrickslabs/dolly · GitHub https://github.com/databrickslabs/dolly/tree/master/data 89 comments
- GitHub - ipyflow/ipyflow: A reactive Python kernel for Jupyter notebooks https://github.com/ipyflow/ipyflow 73 comments
- Replit - How to train your own Large Language Models https://blog.replit.com/llm-training 60 comments
- Databricks open-sources Delta Lake to make data lakes more reliable | TechCrunch https://techcrunch.com/2019/04/24/databricks-open-sources-delta-lake-to-make-data-lakes-more-reliable/ 54 comments
- GitHub - devinpleuler/analytics-handbook: Getting started with soccer analytics https://github.com/devinpleuler/analytics-handbook 43 comments
- GitHub - deepset-ai/haystack: Haystack is an open source NLP framework to interact with your data using Transformer models and LLMs (GPT-4, ChatGPT and alike). Haystack offers production-ready tools to quickly build complex question answering, semantic search, text generation applications, and more. https://github.com/deepset-ai/haystack 35 comments
- How companies make millions on Open Source – Palark | Blog https://blog.palark.com/open-source-business-models/ 33 comments
- Building an open data pipeline in 2024 - by Dan Goldin https://blog.twingdata.com/p/building-an-open-data-pipeline-in 32 comments
- GitHub - graphistry/pygraphistry: PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer https://github.com/graphistry/pygraphistry 27 comments
- Databricks acquires Redash, a visualizations service for data scientists | TechCrunch https://techcrunch.com/2020/06/24/databricks-acquires-redash-a-visualizations-service-for-data-scientists/ 26 comments
- How open-source software took over the world | TechCrunch https://techcrunch.com/2019/01/12/how-open-source-software-took-over-the-world/ 18 comments
- Membership | BSA | The Software Alliance https://bsa.org/membership 12 comments
- Startups That Will Be Huge in 2016 http://www.businessinsider.com/startups-that-will-be-huge-in-2016-2015-12 8 comments
- High-performance Inferencing with Transformer Models on Spark | by Dannie Sim | Towards Data Science https://towardsdatascience.com/high-performance-inferencing-with-large-transformer-models-on-spark-beb82e71ecc9 8 comments
- Polyaxon, Argo and Seldon for model training, package and deployment in Kubernetes https://danielfrg.com/blog/2018/10/model-management-polyaxon-argo-seldon/ 7 comments
- Meet VC Jeremy Fiance, UC Berkeley's 24-year-old superconnector • TechCrunch http://techcrunch.com/2016/04/18/meet-vc-jeremy-fiance-uc-berkeleys-24-year-old-superconnector/ 6 comments
- Dr Alex Ioannides – Building a Data Science Platform for R&D, Part 1 - Setting-Up AWS https://alexioannides.com/2016/08/16/building-a-data-science-platform-for-rd-part-1-setting-up-aws/ 5 comments