This guide provides step-by-step instructions on using Copilot and other AI tools within Microsoft Fabric to build an end-to-end flight delay prediction solution. Visitors can follow these instructions to learn how to effectively leverage AI capabilities in Microsoft Fabric.
First, you need to create a Microsoft Fabric workspace:

fabric-flight-ai-demo (or your preferred name)

Step 1: Select Lakehouse from New Item

Step 2: Name the Lakehouse

<prefix>_flightdelay_lakehouse (e.g., demo_flightdelay_lakehouse)

Step 1: Navigate to Files and Upload

Step 2: Upload the CSV File

Upload the flights_sample_3m.csv file.

Note: This step can be skipped. You can now convert CSV files directly to tables using the Load to Tables functionality. Simply right-click the uploaded file in your Lakehouse and select Load to Tables > New table.
Step 1: Create New Dataflow Gen2

Step 2: Rename the Dataflow

dfg2_flightdelay_prep (using underscores per naming convention)

Step 1: Get Data from Lakehouse

Step 2: Create Connection to Lakehouse

Step 3: Choose the CSV File

flights_sample_3m.csv

Load the CSV file from Lakehouse named `flights_sample_3m.csv`
This step demonstrates how Copilot can be used to clean data before analysis — it is an example of what’s possible when preparing datasets for ML workflows in Fabric.
Remove all rows where the value in the column `ARR_DELAY` is null.
IS_DELAYED:
Create a new column `IS_DELAYED`.
Set its value to 1 if `ARR_DELAY` > 15, otherwise 0.

DEP_HOUR:
Create a column `DEP_HOUR` by extracting hour from `CRS_DEP_TIME` (e.g. 1530 → 15).

FL_DAYOFWEEK:
Extract day of the week from `FL_DATE` and store in `FL_DAYOFWEEK`. Monday = 1.

FL_MONTH:
Extract month from `FL_DATE` and store in `FL_MONTH`. Use Date.Month([FL_DATE]).
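
To make the expected result of the prompts above concrete, here is a minimal plain-Python sketch of the same logic. It is only an illustration and not what Copilot generates (in Dataflow Gen2 the steps are Power Query M). The function name `add_features`, the dict-based row format, the sample rows, and the assumptions that `FL_DATE` is an ISO date string and `CRS_DEP_TIME` an integer in HHMM form are all hypothetical.

```python
from datetime import date


def add_features(rows):
    """Return cleaned rows with the four derived columns added."""
    result = []
    for row in rows:
        # Remove rows where ARR_DELAY is null.
        if row["ARR_DELAY"] is None:
            continue
        fl_date = date.fromisoformat(row["FL_DATE"])  # assumes ISO strings
        enriched = dict(row)
        # 1 if the arrival delay exceeds 15 minutes, otherwise 0.
        enriched["IS_DELAYED"] = 1 if row["ARR_DELAY"] > 15 else 0
        # Scheduled departure in HHMM form: 1530 -> 15.
        enriched["DEP_HOUR"] = row["CRS_DEP_TIME"] // 100
        # isoweekday() numbers Monday as 1 and Sunday as 7.
        enriched["FL_DAYOFWEEK"] = fl_date.isoweekday()
        enriched["FL_MONTH"] = fl_date.month
        result.append(enriched)
    return result


# Made-up sample rows, not taken from flights_sample_3m.csv.
sample = [
    {"FL_DATE": "2023-12-15", "CRS_DEP_TIME": 1530, "ARR_DELAY": 22.0},
    {"FL_DATE": "2023-12-16", "CRS_DEP_TIME": 605, "ARR_DELAY": None},
    {"FL_DATE": "2023-12-18", "CRS_DEP_TIME": 45, "ARR_DELAY": 15.0},
]
features = add_features(sample)
```

Note that a delay of exactly 15 minutes yields `IS_DELAYED = 0`, because the prompt uses a strict `> 15` comparison.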
The Warehouse is chosen for storing processed data to facilitate efficient data retrieval and seamless integration with Power BI for analytics and reporting purposes.
flightdelay-features

Step 1: Create New Data Agent

Step 2: Define the Name for Data Agent

agent-flightdelay-qna

Step 3: Connect Data Agent to Lakehouse

Step 4: Enable the Table for Data Agent

Simple Queries:
- What percentage of flights were delayed?
- Which hour of the day has the highest number of delays?
- How often do flights get delayed on each day of the week?
- Show me the top 5 airports with the most delays.
- Which airlines have the highest average arrival delay?
- What is the monthly trend of delayed flights?

More Complex Queries:
- Compare delay rates between weekdays and weekends.
- Are longer flights more likely to be delayed?
- What is the most common reason for delays among delayed flights?
The Data Agent can answer natural language questions about your flight data and provides:
This demonstrates how the Data Agent translates business questions into SQL queries and returns actionable insights from your flight delay dataset.
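
As a rough illustration of that translation, the sketch below answers two of the simple questions with hand-written SQL. It uses Python's built-in `sqlite3` module as a stand-in for the Fabric SQL endpoint. The queries are assumptions about what such SQL could look like, not the statements the Data Agent actually produces, and the rows are made up.

```python
import sqlite3

# In-memory database standing in for the real data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "flightdelay-features" (DEP_HOUR INTEGER, IS_DELAYED INTEGER)'
)
conn.executemany(
    'INSERT INTO "flightdelay-features" VALUES (?, ?)',
    [(7, 0), (7, 0), (18, 1), (18, 0), (18, 1), (21, 1), (9, 0), (9, 0)],
)

# "What percentage of flights were delayed?"
pct_delayed = conn.execute(
    "SELECT ROUND(100.0 * SUM(IS_DELAYED) / COUNT(*), 1) "
    'FROM "flightdelay-features"'
).fetchone()[0]

# "Which hour of the day has the highest number of delays?"
worst_hour = conn.execute(
    'SELECT DEP_HOUR FROM "flightdelay-features" '
    "WHERE IS_DELAYED = 1 "
    "GROUP BY DEP_HOUR "
    "ORDER BY COUNT(*) DESC, DEP_HOUR "
    "LIMIT 1"
).fetchone()[0]

conn.close()
print(pct_delayed, worst_hour)
```

The table name contains a hyphen, so it has to be quoted as an identifier in the SQL text.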
nb-flightdelay-model

Load the `flightdelay-features` table into a Spark DataFrame.
Prepare the data for binary classification on `IS_DELAYED`.
Apply cleaning and encoding automatically.
Train a binary classification model to predict `IS_DELAYED`.
Split into train/test sets, fit the model, and evaluate its performance.
During this step, you will discover which features most influence flight delays. For example, in the test scenario, the hour of departure (DEP_HOUR) had the strongest predictive power — flights later in the day are generally more prone to delays. In contrast, features like distance or specific origin/destination airports had much less influence.
This helps demonstrate how machine learning can uncover non-obvious patterns in flight data and guide operational improvements or forecasting strategies.
Show feature importance (bar chart), a confusion matrix (heatmap), and delay rate by `DEP_HOUR` (line chart).
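
The numbers behind two of those charts are simple to compute by hand. The following sketch, with hypothetical helper names and made-up labels, shows how a confusion matrix and the delay rate per `DEP_HOUR` could be derived before plotting. It does not reproduce the code Copilot writes in the notebook.

```python
from collections import defaultdict


def confusion_matrix(actual, predicted):
    """Return [[TN, FP], [FN, TP]] for binary labels 0/1."""
    matrix = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        # Row index = actual class, column index = predicted class.
        matrix[a][p] += 1
    return matrix


def delay_rate_by_hour(hours, delayed):
    """Return {hour: share of flights with IS_DELAYED = 1}, ordered by hour."""
    totals = defaultdict(int)
    delays = defaultdict(int)
    for hour, flag in zip(hours, delayed):
        totals[hour] += 1
        delays[hour] += flag
    return {hour: delays[hour] / totals[hour] for hour in sorted(totals)}


# Made-up labels and departure hours for six flights.
actual = [0, 0, 1, 1, 1, 0]
predicted = [0, 1, 1, 0, 1, 0]
dep_hour = [7, 7, 18, 18, 18, 9]

cm = confusion_matrix(actual, predicted)
rates = delay_rate_by_hour(dep_hour, actual)
```

The heatmap would display `cm` as a 2×2 grid, and the line chart would plot the values of `rates` against the hour.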
Create a DataFrame with a flight:
- DEP_HOUR = 18
- FL_DAYOFWEEK = 5
- FL_MONTH = 12
- AIRLINE_CODE = "UA"
- ORIGIN = "ORD"
- DEST = "LGA"
- DISTANCE = 733
Apply preprocessing, predict `IS_DELAYED`, and print:
- Class (0 or 1)
- Probability
- Message ("likely to be delayed" or not)
- Confidence %
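
The four printed values all follow from a single predicted probability. Below is a minimal sketch of that last formatting step, assuming a 0.5 decision threshold. The function `describe_prediction`, the threshold, and the probability 0.72 are hypothetical placeholders, not output from the model trained in this lab.

```python
def describe_prediction(prob_delayed, threshold=0.5):
    """Turn P(IS_DELAYED = 1) into class, message, and confidence."""
    predicted_class = 1 if prob_delayed >= threshold else 0
    # Confidence is the probability of whichever class was predicted.
    confidence = prob_delayed if predicted_class == 1 else 1.0 - prob_delayed
    message = (
        "likely to be delayed"
        if predicted_class == 1
        else "not likely to be delayed"
    )
    return {
        "class": predicted_class,
        "probability": prob_delayed,
        "message": message,
        "confidence_pct": round(confidence * 100, 1),
    }


# The example flight from the prompt above (not fed to a model here).
flight = {
    "DEP_HOUR": 18, "FL_DAYOFWEEK": 5, "FL_MONTH": 12,
    "AIRLINE_CODE": "UA", "ORIGIN": "ORD", "DEST": "LGA", "DISTANCE": 733,
}

# Placeholder probability standing in for a real model's output.
report = describe_prediction(0.72)
print(f"Class: {report['class']}")
print(f"Probability: {report['probability']:.2f}")
print(f"This flight is {report['message']}.")
print(f"Confidence: {report['confidence_pct']}%")
```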
From the flightdelay-features table, create a semantic model named sm_flightdelay_prediction.

The semantic model provides a structured layer over your data to simplify building Power BI reports. It helps streamline data access, organize fields for analysis, and enable self-service reporting experiences.
Step 1: Access Auto-Create Report Option

sm_flightdelay_prediction

Step 2: Review Generated Report

The hour of departure (DEP_HOUR) had the strongest impact on delays, while other features like distance or route had minimal influence.

All these steps can be conducted without writing a single line of code. This lab shows how you can use Copilot to streamline your analytical work and boost productivity.
This lab showcases how Microsoft Fabric Copilot helps reduce friction across the full data-to-insight workflow.
Notebook & Automation:
Semantic Model:
Power BI Copilot:
Power BI MCP (Model Context Protocol):
I will be happy to hear your feedback or answer any questions. You can contact me via LinkedIn: aka.ms/taras.