Have you had issues importing data from an OData service? Has your OData table taken too long to load into Power Query? Have you had data size restrictions from an OData service that made it impossible to load the entire data? Then this blog is for you.
In the era of big data, OData services are storing more and more data to optimize data extraction from the OData service. Pagination is a strategy to ensure smooth and seamless data extraction.
In this blog, we will walk through how to dynamically paginate data from an OData Service using Power Query.
Sections
- What Is Pagination?
- The Data
- Step 1 โ Get Total Record Count
- Step 2 โ Create Base Query Function
- Step 3 โ Create Pagination Function Using List.generate
- Step 4 โ Test Functions
- Step 5 โ Parameterize the Base Query
- Conclusion
What Is Pagination?
Pagination for data engineers is the process of dividing large data into manageable data sizes. In the context of this blog, pagination returns data based on user-defined parameters.
For example, if your data contains 10,000 records, and you want to present it in two tables, you will define each table size as 5000.
Pagination helps conserve resources, such as bandwidth or memory, and improve the response time for the OData request and end-user experience.
There are different types of pagination, the one covered in this blog will be offset-based pagination. Offset-based pagination allows the client to define the number of records to skip and the maximum number of records to return in a data query.
Offset-based pagination achieves this by using the $top (maximum record to return) and $skip (record to skip) system query option in OData. Learn more about OData and its query options here.
The Data
This blog will show you how to perform offset-based pagination on a dataset. For this blog, the data we will be connecting to is the Microsoft AdventureWorks sample database via its OData URL.
This URL exposes a limited subset of the AdventureWorks database. Tables available in this OData service are:
- CompanySales
- ManufacturingInstructions
- ProductCatalog
- TerritorySalesDrilldown
- WorkOrderRouting
This walk-through example focuses on the TerritorySalesDrilldown table. You can access the OData query to this table here.
From here on, this will be referred to as the base query.
Step 1 โ Get Total Record Count
The first step to paginate your data is to determine the number of rows in your dataset. You can achieve this by using the $inlinecount system query option that OData provides.
This stipulates that the response must include the total record count in the table request. To learn more about how $inlinecount works please visit this link.
The query used to get the total record count can be found here.
The above query would return the total record count in the table metadata and all the records. This is inefficient, so we will modify the query to return the total record count with only the first row of data.
This is done by adding the $top system query option to our base query found here.
To retrieve the row count in Power Query, use the Web connector.
In Power Query, click New Source and select Web in the list of Most Common.
Paste the total record count query into the From Web pop-up box and click OK.
If the Access Web content pop up box shows up, select Anonymous and click Connect
Double click on service.odata.org 721 bytes to open it. A table that contains information about the data is created.
Under the column titled http://schemas.microsoft.com/ado/2007/08/dataservices/metadata, click Table. The table contains the total record count for the TerritorySalesDrilldown table.
The new table provides the total row count of the table as 3703.
To use the data for pagination, we need to convert the table into a number.
Right-click on row 1 and select Drill Down.
In the Query Settings pane, navigate to Properties and under the subsection Name change the table name from TerritorySalesDrilldown?$inlinecount=allpage to TerritorySalesDrilldown
We have now successfully converted our table to a number, which can be used for pagination. The next step is to create the base query function.
Step 2 โ Create Base Query Function
To create the function you must first create a blank query. In Power Query, click New Source and select Blank Query in the list of Most Common.
In the Query Setting pane, change the Name under Properties from Query1 to Offset Base Query
Before editing in the advance editor, modify the base query using $top, $skip, and $orderby system query option.
Base Query:
https://services.odata.org/AdventureWorksV3/AdventureWorks.svc/TerritorySalesDrilldown
Modified Base Query:
https://services.odata.org/AdventureWorksV3/AdventureWorks.svc/TerritorySalesDrilldown?$top=&$skip=&$orderby=
Now that the base query has been modified, select the Offset Base Query in the query pane, then click on Home, then Advanced Editor.
With the Advanced Editor open, connect to the modified base query with the OData connector.
Source = OData.Feed(“https://services.odata.org/AdventureWorksV3/AdventureWorks.svc/TerritorySalesDrilldown?$top=”&Number.ToText()&”&$skip=”&Number.ToText()&”&$orderby=”)
Create parameters that would be used as input values for $top, $skip, and $orderby
Source = OData.Feed("https://services.odata.org/AdventureWorksV3/AdventureWorks.svc/TerritorySalesDrilldown?$top="&Number.ToText()&"&$skip="&Number.ToText()&"&$orderby=")
Create parameters that would be used as input values for $top, $skip, and $orderby
(top_number as number, skip_number as number, orderby as text)=>
Type the newly created parameters as input values for $top, $skip, and $orderby system query option
Source = OData.Feed("https://services.odata.org/AdventureWorksV3/AdventureWorks.svc/TerritorySalesDrilldown?$top="&Number.ToText(top_number)&"&$skip="&Number.ToText(skip_number)&"&$orderby="&orderby)
Offset Based Query M Code:
(top_number as number,skip_number as number,orderby as text)=> let Source = OData.Feed("https://services.odata.org/AdventureWorksV3/AdventureWorks.svc/TerritorySalesDrilldown?$top="&Number.ToText(top_number)&"&$skip="&Number.ToText(skip_number)&"&$orderby="&orderby) in Source
Offset Base Query now becomes a function that accepts input parameters top_number, skip_number, and orderby and retrieves data specified by the entered parameters.
Step 3 โ Create Pagination Function Using List.generate
To create the pagination function, the List.Generate Power Query iterative function will be used. This function will iterate through the initial input values and stop when it reaches the total record count. A simple explanation of List.Generate function and the parameters it accepts are below:
List.Generate (
initial, Initial condition of the function
condition, Condition evaluated for each iteration
next, the next row to be evaluated
optional selector, if the condition is evaluated to true, take that result and do something additional with it)
For a more detailed explanation on List.Generate please visit the Microsoft documentation here.
Create a blank query as described earlier and name it Paginate Base Query
Open Advanced Editor and write the M Code below:
(top_rows_to_return as number, number_of_rows_to_skips as number,column_to_order_by as text)=> let Source = List.Generate(()=> [Top = top_rows_to_return, Skip = number_of_rows_to_skips,Orderby = column_to_order_by, Table = #"Offset Base Query"(Top,Skip,Orderby)], each [Skip] <= TerritorySalesDrilldown, each [Top = [Top], Skip = [Skip] + top_rows_to_return,Orderby = [Orderby], Table = #"Offset Base Query"(Top, Skip,Orderby)]) in Source
- Line 1: creates parameters for $top, $skip, and $orderby system query option.
- Line 4: start of list.generate function.
- Line 5: defines the initial conditions for the function which will be entered via the parameters in line 1. These input values are passed through to the function Offset Base Query which was created in step 2.
- Line 6: defines the condition to stop the iterative process. It will stop when it reaches the referenced total record count query (TerritorySalesDrilldown) created in step 1.
- Line 7: defines the condition to be evaluated if line 6 evaluates to false.
Step 4 โ Test Functions
Test the Paginate Base Query function by entering parameters and clicking invoke
- top_row_to_return = 1000
- number_of_rows_to_skips = 0
- column_to_order_by = ID
If the Edit Credentials window pops up, click Edit Credentials
Select Anonymous and click Connect on the Access Web content pop up
If you get the error message below
Click on File, navigate to Options and settings, then select Options
In the Options pop up, click Privacy, then select Always ignore Privacy Level settings. Once selected, click OK
Now that the privacy settings have been modified, click the Refresh Preview button on the Home tab
A new list with four rows of the record is created. Each record contains a table with 1000 rows each based on the conditions we specified in the function. To access this record, convert the list to a table by clicking To Table
Click OK, when the To Table Dialog box pops up
Expand the records in the column by the expand icon
In the new drop-down box, only select Table. Then uncheck Use original column name as prefix. Click OK.
A single table is created that contains all the records in the four tables. Letโs rename the table Invoked Function to Territory Sales Drilldown
If you reached this point, you have successfully paginated your data. Step 5 will show you how to parameterize the base query in the code above.
Step 5 โ Parameterize the Base Query
To make the functions more flexible, create a parameter for the base query.
Click on Manage Parameters then New Parameters
In the Manage Parameters box, click New
Fill the Manage Parameters pop up box with
- Name: Base Query
- Type: Text
- Suggest Value: Any value
- Current Value: https://services.odata.org/AdventureWorksV3/AdventureWorks.svc/TerritorySalesDrilldown
Once filled, click OK.
Now replace the base query in TerritorySalesDrilldown, and Offset Base Query with the new parameter Base Query.
Modified TerritorySalesDrilldown Query:
let Source = Xml.Tables(Web.Contents(#"Base Query"&"?$inlinecount=allpages&$top=1 ")), #"Changed Type" = Table.TransformColumnTypes(Source,{{"id", type text}, {"updated", type datetime}}), #"http://schemas microsoft com/ado/2007/08/dataservices/metadata" = #"Changed Type"{0}[#"http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"], #"Changed Type1" = Table.TransformColumnTypes(#"http://schemas microsoft com/ado/2007/08/dataservices/metadata",{{"count", Int64.Type}}), count = #"Changed Type1"{0}[count] in count
Modified Offset Base Query:
(top_number as number,skip_number as number,orderby as text)=> let Source = OData.Feed(#"Base Query"&"?$top="&Number.ToText(top_number)&"&$skip="&Number.ToText(skip_number)&"&$orderby="&orderby) in Source
Now the codes and functions can be reused for different OData services without changing the M Code.
Conclusion
In this era of big data, pagination has become necessary to save resources, improve data accessibility, improve request response time for the OData request, and improve the end-user experience.
This blog has provided a walk-through example of how to paginate data. We presented three simple steps to follow, with two bonus steps to show you how to test your code and make your code more dynamic.
Feel free to follow the walk-through example and test out the benefits of pagination for yourself.