Introduction to practical Query folding

One of the most powerful capabilities of Power Query and the M Language is Query Folding (also referred to as query delegation, and predicate push-down). Query Folding allows the Power Query Mashup Engine to push the transformations expressed in an M (mashup) query to the data source, in the data source’s query language, resulting in more efficient data processing.

That's So Meta

I don't know about you, but my goodness, wouldn't it be great to press Close & Apply on Power Query and instantly have all of your columns documented in your model based on metadata definitions in your query? And shoot, why stop there? Let's just leave the column titles exactly as they are from the original source system for easier tracking with our lineage but update to friendlier display names while in the semantic model. It almost sounds too good to be true!

Cleaning the (Staging) Lakeside

Dirty lakes need cleaning, simple as that - and it's no different when it comes to Dataflow Gen2's staging feature. You might be wondering, "What's the big deal?" Well, let's dive into and dissect the backend staging setup of dataflows in the Lakehouse. Close your eyes and imagine countless system tables, each tagged with lengthy identifiers that intertwine with your actual query names—think some-random-crazystring_Address_002E.parquet. And within these auxiliary tables? YOUR DATA alongside generic column titles like Column1, Column2, Column3 (metadata too! don't worry your column names aren't lost!). But every single query leveraging staging, is using this backend implementation approach - ultimately contributing to data sprawl in my opinion.

From Pipelines to Table: The Outputs We Ingest

If it’s important, you should probably be logging it. With KQL Databases and the newly introduced Semantic model refresh activity within Data pipelines, its just way too easy (2EZ) now. Before we dive in, let’s look at a few prerequisite items needed for this article:

CHANGE (IN THE HOUSE OF LAKES)

Data Factory in Microsoft Fabric is an AMAZING tool that allows us to combine the flexibility of Data pipelines and ease of a Dataflow Gen2 to create some nifty solutions with a tiny smidge of code. One of the most common use cases is the ability to load new data into a destination for a select period (incrementally). Below is a high-level diagram that we’ll start breaking down to copy data from a SQL Database into a Lakehouse.

It's a LongType() to the top (If you want to rock and roll)

I’ll always be the first to admit that the things that interest me may be of little (technical) interest to you. However, I figure “Hey, why not just start documenting my random gibberish findings?!” and well, here we are with a short but sweet article on Dataflows Gen2 in Microsoft Fabric.

5 Tips for Learning Data Types in Power Query Formula Language

  1. Understand the basics of data types in Power Query
    • Data types in Power Query are used to classify values to have a more structured dataset. Data types are defined at the field level—values inside a field are set to conform to the data type of the field. The data type of a column is displayed on the left side of the column heading with an icon that symbolizes the data type.

Yo, Listen!

There is a bombardment of information constantly vying for our attention - mobile phones, smart TVs, that 1996 Space Jam website and of course endless cat videos, which is why I believe the advances in low-code automation have been an incredible benefit for sharing (the most relevant) information with others.

I Always Feel Like SUMPRODUCT()'s Watching Me

As the war wages on between #TEAMVLOOKUP and #TEAMINDEXMATCH, a new challenger has arisen. #SUMPRODUCT, a function often associated with math and trig, has now taken on a new purpose - Boolean logic. But can it compete? Or, will it continue to lurk in the shadows, only to be utilized by Power Users...