Repository containing Demos and Labs for Apache Spark and ML with Databricks.
- Demos/: instructor-led notebooks with `<TODO>` placeholders for live coding
- Labs/: student exercise notebooks with `<TODO>` placeholders
- Soluciones/: completed reference solutions for all demos and labs
- Datasets/: data files used across notebooks
- Docker/: local Spark + JupyterLab environment
Notebooks are designed to run sequentially. Demo 2 (Data Cleansing) generates the cleaned Airbnb dataset that all subsequent notebooks (3–9) depend on.
The output of Demo 2 (Data Cleansing), which is the input for the following notebooks, is already available in Datasets/output/airbnb/, so you can jump straight to any later notebook.
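Before skipping Demo 2, it can help to confirm that the precomputed output is actually present. The following is a minimal sketch, not part of the repository: the helper name `has_cleaned_output` is hypothetical, and the path is assumed to be relative to the repository root. It only checks that the folder exists and is non-empty; it makes no assumption about the file format inside.

```python
from pathlib import Path

# Assumption: the script is run from the repository root.
CLEANED_DIR = Path("Datasets/output/airbnb")


def has_cleaned_output(directory: Path = CLEANED_DIR) -> bool:
    """Return True if the directory exists and contains at least one entry."""
    return directory.is_dir() and any(directory.iterdir())


if __name__ == "__main__":
    if has_cleaned_output():
        print("Cleaned Airbnb data found; later notebooks can run directly.")
    else:
        print("Cleaned Airbnb data missing; run Demo 2 (Data Cleansing) first.")
```

Inside the container the same check would need the mounted path instead of the local one.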

    cd Docker
    docker-compose up --build

JupyterLab will be available at http://localhost:8888 (no token or password required).
The following folders are mounted into the container and available under /home/jovyan/work/:
| Local folder | Container path |
|---|---|
| Demos/ | work/demos/ |
| Labs/ | work/labs/ |
| Soluciones/ | work/solutions/ |
| Datasets/ | work/datasets/ |
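
Because the container folder names differ from the local ones (for example, Soluciones/ becomes work/solutions/), paths written for the local checkout need translating when used inside JupyterLab. The sketch below illustrates the mapping from the table above; the helper `to_container_path` and the `MOUNTS` dictionary are hypothetical names introduced for illustration and are not part of the repository.

```python
from pathlib import PurePosixPath

# Root under which the folders are mounted, as stated above.
CONTAINER_ROOT = PurePosixPath("/home/jovyan/work")

# Local top-level folder -> folder name inside the container (from the table).
MOUNTS = {
    "Demos": "demos",
    "Labs": "labs",
    "Soluciones": "solutions",
    "Datasets": "datasets",
}


def to_container_path(local_path: str) -> str:
    """Translate a repo-relative path into its path inside the container."""
    parts = PurePosixPath(local_path).parts
    if not parts or parts[0] not in MOUNTS:
        raise ValueError(f"{local_path!r} is not inside a mounted folder")
    return str(CONTAINER_ROOT.joinpath(MOUNTS[parts[0]], *parts[1:]))


if __name__ == "__main__":
    print(to_container_path("Datasets/output/airbnb"))
```

Paths under folders that are not listed in the table, such as Docker/, raise a `ValueError` in this sketch, since the table does not list them as mounted.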