Big Data Engineering — Declarative Data Flows (General): This is part 3 of a series on data engineering in a big data environment.… KupferschmidtAdmin, 22 October 2020
Big Data Engineering — Apache Spark (Big Data, PySpark, Spark): This is part 2 of a series on data engineering in a big data environment.… KupferschmidtAdmin, 17 October 2020
Big Data Engineering — Best Practices (Big Data, Spark): This is part 1 of a series on data engineering in a big data environment.… KupferschmidtAdmin, 16 October 2020
Running Jupyter with Spark in Docker: Most attendees of dimajix Spark workshops seem to like the hands-on approach I am offering… KupferschmidtAdmin, 2 October 2017
Jupyter Notebooks with PySpark in AWS: Amazon Elastic MapReduce (EMR) is something wonderful if you need compute capacity on demand. I… KupferschmidtAdmin, 22 May 2017
Running Spark and Hadoop with S3: Traditionally HDFS was the primary storage for Hadoop (and therefore also for Apache Spark). Naturally… KupferschmidtAdmin, 5 May 2017
Running PySpark on Anaconda in PyCharm: Working with PySpark. Currently, Apache Spark with its bindings PySpark and SparkR is the processing… KupferschmidtAdmin, 15 April 2017
Building Druid for Cloudera 5.4.x: So the other day I wanted to look into using Druid as a reporting backend… dominik_adm1n, 23 March 2016