Course DP-600: Microsoft Fabric Analytics Engineer

 

Delivery Hints

The slide decks do not align with the learn.microsoft.com content.

To be honest, the ordering of the slide topics and the learn.microsoft.com topics is a bit of a mess. In particular, the last section in day 4 (Administer Microsoft Fabric) really needs to be taught at the start of the course!

 

General Notes

Terminology

https://learn.microsoft.com/en-us/fabric/get-started/fabric-terminology

Experience, Workload

"Experience: A collection of capabilities targeted to a specific functionality. The Fabric experiences include Fabric Data Warehouse, Fabric Data Engineering, Fabric Data Science, Real-Time Intelligence, Data Factory, and Power BI."

"Workloads expand functionality in Fabric. Users with the relevant permissions can add workloads and make them available either to the entire tenant or to a specific capacity."

The key point is that while an Experience is similar to a Workload, and many have the same name, the two things are not the same. Sadly, the tools and documentation often misuse the two terms.

Workloads are collections of frontend and backend functions. The workloads installed in your Fabric account determine what items are available to create. In the Fabric admin center, use the Workloads item in the Navbar to see the Workloads that are currently installed. The Create item in the Navbar groups items by Workload (the +New Item button on the menubar does not do this). In item lists (for example in a Workspace) the Filter option at the top-right allows filtering by Workload. However, some items (for example, Notebooks) are listed under one Workload in the create screen and under a different Workload in the filter screen.

Experiences are marketing terms for functionality in Microsoft Fabric. As far as I can see, the term "Experience" is not used by the current Fabric admin center, just by the learn.microsoft.com articles.

Admin Centers

The Fabric admin center (app.fabric.microsoft.com) and the Power BI admin center (app.powerbi.com) are the same tool. The bottom button on the Navbar allows you to change between the two UIs.

Data Stores

https://learn.microsoft.com/en-nz/fabric/get-started/decision-guide-lakehouse-warehouse

https://learn.microsoft.com/en-nz/fabric/get-started/decision-guide-data-store

Note that Microsoft have a number of other data services, including Azure SQL Database (as well as PaaS services for HorizonDB, PostgreSQL, MySQL, Cassandra, MongoDB, and others), Azure Cosmos DB, Databricks, Azure Synapse Analytics (which is no longer receiving feature updates), Microsoft Dataverse (which hardly anyone knows about), Azure Managed Redis, Azure Analysis Services, and probably others I've missed.

 

Day 1

OneLake

Note that there is only one OneLake per tenant. This is to avoid data silos.
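
A quick way to show the "one OneLake" point in class is to build a couple of OneLake paths and note that they all share a single endpoint, with the workspace acting as the container. The sketch below is only an illustration: the workspace and lakehouse names are made up, the helper function is my own, and it assumes names without spaces or special characters (otherwise use the workspace and item GUIDs instead).

```python
# Every workspace in the tenant hangs off the same OneLake endpoint.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_abfss_path(workspace: str, item: str, item_type: str,
                       relative_path: str = "") -> str:
    """Build an abfss:// URI for a folder or file inside a Fabric item.

    Hypothetical helper for illustration; it does no validation or
    escaping of the names passed in.
    """
    base = f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}"
    relative_path = relative_path.strip("/")
    return f"{base}/{relative_path}" if relative_path else base


# Two made-up workspaces, same tenant, same OneLake endpoint.
sales = onelake_abfss_path("SalesWorkspace", "SalesLakehouse", "Lakehouse", "Tables/orders")
hr = onelake_abfss_path("HRWorkspace", "HRLakehouse", "Lakehouse", "Files/raw")

print(sales)
print(hr)
```

In a Fabric notebook, a path like this can be passed to spark.read.load(...) to read across workspaces, subject to permissions.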

Scala

I don't think I am being picky when I say that Scala is not a "Java-based scripting language". It's a statically-typed, compiled, object-oriented programming language that targets the Java VM.

"In practice, most data engineering and analytics workloads are accomplished using a combination of PySpark and Spark SQL." [citation required]
:-)

Module: Work with Delta Lake tables in Microsoft Fabric

Review question 1

The answer they give is a horrible description of Delta Lake.

From delta.io:
"Delta Lake is an open-source storage framework that enables building a format agnostic Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, Hive, Snowflake, Google BigQuery, Athena, Redshift, Databricks, Azure Fabric and APIs for Scala, Java, Rust, and Python. With Delta Universal Format aka UniForm, you can read now Delta tables with Iceberg and Hudi clients."

Optimize delta tables

https://delta.io/blog/2023-01-25-delta-lake-small-file-compaction-optimize/
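
For a demo of the compaction described in that post, a minimal sketch like the one below may be enough. It only builds the Delta Lake OPTIMIZE statement (optionally with ZORDER BY) and hands it to a Spark session. The function names and the table and column names are my own placeholders, and the table name is not sanitised, so treat it as a classroom illustration rather than production code.

```python
from typing import Optional, Sequence


def optimize_statement(table: str, zorder_by: Optional[Sequence[str]] = None) -> str:
    """Return a Delta Lake OPTIMIZE statement for the given table.

    zorder_by is an optional list of columns to co-locate with ZORDER BY.
    """
    stmt = f"OPTIMIZE {table}"
    if zorder_by:
        stmt += " ZORDER BY (" + ", ".join(zorder_by) + ")"
    return stmt


def compact_table(spark, table: str, zorder_by: Optional[Sequence[str]] = None):
    """Run the statement on an existing SparkSession (for example, the
    `spark` variable that a Fabric notebook provides)."""
    return spark.sql(optimize_statement(table, zorder_by))


# 'sales', 'region' and 'order_date' are placeholder names.
print(optimize_statement("sales"))
print(optimize_statement("sales", ["region", "order_date"]))
```

In a notebook attached to a lakehouse, compact_table(spark, "sales") would then rewrite the table's small files into fewer, larger ones.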

 

Day 2

 

Day 3

 

Day 4