PowerShell script example

Prev Next

In this example, we propose a simple script that takes the following arguments:

  • Instance URL

  • credentials,

  • Output CSV filename

  • and query ID.

The script extracts data from an existing query and saves the results to a CSV file.

It is only provided as an example. It must be adapted for production purposes.

Code example

# POWERSHELL SCRIPT: API Export to CSV (Handling Pagination)

# --- CONFIGURATION ---
$BASE_URL = "your_url" #test.zetaly.local:8443
$USERNAME = "your_username"
$PASSWORD = "your_password"
$QUERY_NAME = "your_queryname"

$OUTPUT_FILE = "api_data_export.csv"
$PAGE_SIZE = 5000 # The size of the data packets (pages)

# URL Constants
$LOGIN_URL = "https://$($BASE_URL)/zhb/api/v1/users/public/login"
$QUERIES_LIST_URL = "https://$($BASE_URL)/zde/api/v2/queries" # Nouvelle URL pour lister les requêtes

# Standard Headers
$HEADERS = @{
    "Content-Type" = "application/json"
}

Write-Host "======================================================="
Write-Host "  PowerShell Script: CSV Export with Pagination"
Write-Host "======================================================="

# --- Step 1: Authentication and Token Retrieval ---
Write-Host "`n--- Step 1: Retrieving Access Token ---"
$LoginBody = @{ username = $USERNAME; password = $PASSWORD } | ConvertTo-Json
try {
    $LoginResponse = Invoke-RestMethod -Uri $LOGIN_URL -Method Post -Headers $HEADERS -Body $LoginBody
    $ACCESS_TOKEN = $LoginResponse.token
    if ([string]::IsNullOrEmpty($ACCESS_TOKEN)) { Write-Error "❌ ERROR: Failed to extract the token."; exit 1 }
} catch {
    Write-Error "ERROR during authentication: $($_.Exception.Message)"; exit 1
}
$HEADERS["token"] = "$ACCESS_TOKEN"
Write-Host "Access token retrieved."
# -----------------------------------------------------------------


# --- Step 2: Retrieve QUERY_ID from the list of queries (using $QUERY_NAME) ---
Write-Host "`n--- Step 2: Searching for QUERY_ID for name '$QUERY_NAME' ---"
$QUERY_ID = $null

try {
    # 1. Retrieve the list of all available queries
    $QueriesListResponse = Invoke-RestMethod -Uri $QUERIES_LIST_URL -Method Get -Headers $HEADERS

    # 2. Filter the array of objects on the 'label' field
    $MatchingQuery = $QueriesListResponse | Where-Object { $_.label -eq $QUERY_NAME } | Select-Object -First 1

    if (-not $MatchingQuery) {
        Write-Error "ERROR: No query found with the label '$QUERY_NAME'. Check the name in the API."
        exit 1
    }
    
    # 3. Extract the ID from the matching result
    $QUERY_ID = $MatchingQuery.id

    if ([string]::IsNullOrEmpty($QUERY_ID)) {
        Write-Error "ERROR: The query ID is empty for the label '$QUERY_NAME'."
        exit 1
    }
} catch {
    Write-Error "ERROR retrieving the list of queries: $($_.Exception.Message)"
    exit 1
}

Write-Host "QUERY_ID ($QUERY_ID) found for label '$QUERY_NAME'."
# -----------------------------------------------------------------


$QUERY_INFO_URL = "https://$($BASE_URL)/zde/api/v2/queries/$($QUERY_ID)"

# Construct dependent URLs
#$FIELDS_URL = "https://$($BASE_URL)/zde/api/v2/datasets/$($DATASET_ID)/related/$($DATASET_ID)/fields"
$FIELDS_URL = "https://$($BASE_URL)/zde/api/v2/queries/$($QUERY_ID)"
# MODIFIED: Creating a URL "Template" for pagination
$DATA_URL_TEMPLATE = "https://$($BASE_URL)/zde/api/v2/queries/$($QUERY_ID)/data?size=$($PAGE_SIZE)&format=OBJECT"
# -----------------------------------------------------------------

# --- Step 3: Retrieving Field Names (Schema) ---
Write-Host "`n--- Step 4: Retrieving Field Schema ---"
try {
    $FieldsResponse = Invoke-RestMethod -Uri $FIELDS_URL -Method Get -Headers $HEADERS
    $FIELD_NAMES_ARRAY = $FieldsResponse | Select-Object -ExpandProperty fields 
    if ($FIELD_NAMES_ARRAY.Count -eq 0) { Write-Error "❌ ERROR: Failed to extract field names."; exit 1 }
} catch { Write-Error "❌ ERROR retrieving fields: $($_.Exception.Message)"; exit 1 }
Write-Host "Fields extracted: $($FIELD_NAMES_ARRAY -join ', ' | Select-Object -First 50)..."
# -----------------------------------------------------------------

# --- Step 4: Data Retrieval (PAGINATED) and CSV EXPORT ---
Write-Host "`n--- Step 5: Data Retrieval (PAGINATED) and CSV Export ---"

# Use an ArrayList to efficiently collect data from all pages
$AllData = [System.Collections.ArrayList]::new()
$Page = 0
$TotalItems = 1 # Fictional initial value to start the loop
$ItemsReceived = 0

try {
    # Loop as long as we haven't received all items
    while ($ItemsReceived -lt $TotalItems) {
        
        $CurrentDataUrl = "$($DATA_URL_TEMPLATE)&page=$($Page)"
		
        Write-Host "Retrieving page $Page..." -ForegroundColor Cyan
		Write-Host $CurrentDataUrl

        # Use Invoke-WebRequest to access response headers
        $Response = Invoke-WebRequest -UseBasicParsing -Uri $CurrentDataUrl -Headers $HEADERS -Method Get

        # Convert the JSON content (which is a string) into PowerShell objects
        $PageData = $Response.Content | ConvertFrom-Json
        
        if (-not $PageData -or $PageData.Count -eq 0) {
            Write-Host "  ... no more data found (empty response). Ending loop."
            break # Exit the while loop
        }

        # Add the page's data to the total list
        $AllData.AddRange($PageData)
        $ItemsReceived += $PageData.Count

        # On the first request (Page=1), read the 'content-range' header to determine the total
        if ($Page -eq 0) {
            $ContentRangeHeader = $Response.Headers["content-range"]
            
            # Use regex to extract the total (e.g., 1000-1999/103761)
            if ($null -ne $ContentRangeHeader -and $ContentRangeHeader -match "(\d+)-(\d+)/(\d+)") {
                $TotalItems = [int]$matches[3]
                Write-Host "  ... total items to retrieve: $TotalItems"
            } else {
                Write-Warning "'content-range' header not found or in unknown format ('$ContentRangeHeader')."
                Write-Warning "Assuming a single page of data."
                $TotalItems = $ItemsReceived # Forces loop exit after this page
            }
        }
        
        Write-Host "  ... $($PageData.Count) items received. Cumulative total: $ItemsReceived / $TotalItems"
        $Page++ # Increment for the next page
    }

    # --- Final Export (after the loop) ---
    Write-Host "`nExporting $($AllData.Count) total items to $OUTPUT_FILE..."
    if ($AllData.Count -gt 0) {
        # Export the complete collection at once
        $AllData | Select-Object -Property $FIELD_NAMES_ARRAY | Export-Csv -Path $OUTPUT_FILE -NoTypeInformation -Delimiter ";"
        
        Write-Host "Data successfully exported to file: $OUTPUT_FILE"
    } else {
        Write-Warning "No data was retrieved in total."
    }

} catch { 
    # This error will trigger if a page fails (e.g., page 5 of 100)
    Write-Error "ERROR while retrieving data (Page $Page): $($_.Exception.Message)" 
    exit 1 
}

Write-Host "`n======================================================="
Write-Host "  Script finished."

Documentation

Here is a complete documentation for this PowerShell script.

This documentation explains what each part of the script does, why it's built that way, and how to use it.


Script Documentation: API Export to CSV (Handling Pagination)

This script automates the process of exporting data from a Zetaly API query to a local CSV file.

Its main features are:

  • Secure Authentication: It first logs in to get a temporary access token.

  • Dynamic ID Lookup: You provide a human-readable $QUERY_NAME, and the script automatically finds the required QUERY_ID and DATASET_ID from the API.

  • Schema-Aware Export: It fetches the list of field (column) names from the API to build a clean and correctly ordered CSV header.

  • Pagination: It's designed to handle very large datasets by downloading data in "pages" (chunks of 5000 rows at a time) and combining them before saving.


Configuration Block (--- CONFIGURATION ---)

This section at the top of the script contains all the variables you need to change to match your environment.

  • $BASE_URL: The root address of your Zetaly API instance (e.g., my-instance.zetaly.com).

  • $USERNAME: The username for the API account.

  • $PASSWORD: The password for the API account.

  • $QUERY_NAME: The display name (or "label") of the query you want to export. The script will use this name to find the query's internal ID.

  • $OUTPUT_FILE: The name and path for the final CSV file (e.g., C:\exports\my_data.csv).

  • $PAGE_SIZE: The number of rows to fetch in a single API call. 5000 is a good default.

Script Breakdown

Here is a step-by-step explanation of the script's logic.

Step 1: Authentication and Token Retrieval

  • Goal: Log in to the API and get an access token.

  • How it works:

    1. $LoginBody = ... | ConvertTo-Json: It creates a PowerShell object with the $USERNAME and $PASSWORD and converts it into the JSON format the API expects.

    2. Invoke-RestMethod: This command sends the JSON body to the $LOGIN_URL (an HTTP POST request).

    3. $ACCESS_TOKEN = $LoginResponse.token: It assumes the API returns a JSON object with a token property and saves its value.

    4. $HEADERS["token"] = "$ACCESS_TOKEN": It adds the new token to the $HEADERS variable. All future API calls will now be authenticated.

  • Error Handling: A try...catch block wraps the API call. If the login fails (wrong password, server down), the script will stop and print an error.

Step 2: Retrieve QUERY_ID

  • Goal: Find the internal QUERY_ID (like a GUID) that corresponds to the human-readable $QUERY_NAME you provided.

  • How it works:

    1. Invoke-RestMethod -Method Get: It calls the $QUERIES_LIST_URL to get a list of all available queries.

    2. $MatchingQuery = ... | Where-Object { $_.label -eq $QUERY_NAME }: This is the key part. It pipes the list of queries into the Where-Object cmdlet.

      • $_ is a special variable meaning "the current item being processed."

      • It filters the list to find the one query where the label property exactly matches your $QUERY_NAME.

    3. | Select-Object -First 1: This takes the first match (just in case of duplicates) to be safe.

    4. $QUERY_ID = $MatchingQuery.id: It extracts the id property from the query object it found.

Step 3: Retrieving Field Names (Schema)

  • Goal: Get the list of all column names in the dataset. This is essential for creating a correct CSV header.

  • How it works:

    1. Invoke-RestMethod: It calls the $FIELDS_URL to get a list of field objects.

    2. $FIELD_NAMES_ARRAY = ... | Select-Object -ExpandProperty name: This command is very useful. Instead of a list of complex objects (each with name, type, id, etc.), -ExpandProperty name plucks only the name value from each object and creates a simple array of strings (e.g., ["column_a", "column_b", "column_c"]).

Step 4: Data Retrieval (PAGINATED) and CSV EXPORT

This is the most complex and important step.

  • Goal: Download all data, page by page, and then export the complete collection to a single CSV file.

  • Initialization:

    • $AllData = [System.Collections.ArrayList]::new(): It creates an ArrayList. This is a special type of list that is much more efficient for adding thousands of items in a loop compared to a standard PowerShell array (@()).

    • $Page = 0: Sets the starting page index to 0.

    • $TotalItems, $ItemsReceived: These are counters to control the loop.

  • The while Loop:

    • The loop continues as long as $ItemsReceived is less than $TotalItems.

    • Invoke-WebRequest: Note the switch! This step uses Invoke-WebRequest instead of Invoke-RestMethod. This is done on purpose because Invoke-WebRequest gives full access to the response headers, which is necessary for pagination.

    • $PageData = $Response.Content | ConvertFrom-Json: It takes the raw JSON text from the response (.Content) and converts it into PowerShell objects.

    • $AllData.AddRange($PageData): It adds the entire array of data from the current page to the main $AllData list.

  • Pagination Logic:

    • if ($Page -eq 0): This block only runs one time, on the very first request.

    • $ContentRangeHeader = $Response.Headers["content-range"]: It reads the content-range header from the API's response. This header typically looks like items 0-4999/87500.

    • if ($ContentRangeHeader -match "(\d+)-(\d+)/(\d+)"): It uses a regular expression (regex) to parse this header.

    • $TotalItems = [int]$matches[3]: It extracts the third matched group ($matches[3]), which is the total number of items (87500 in the example), and saves it.

    • This is how the script learns the total size of the dataset and knows when to stop looping.

  • Loop Control:

    • $Page++: After processing a page, it increments the page counter to ask for the next one (page 1, page 2, etc.).

    • if (-not $PageData): If the API returns an empty page (meaning no more data), the break command exits the loop early.

  • Final Export:

    • After the while loop finishes, $AllData contains all the data from all pages.

    • $AllData | Select-Object -Property $FIELD_NAMES_ARRAY: This command is crucial. It ensures that only the properties listed in $FIELD_NAMES_ARRAY (from Step 4) are selected, and that they are in the correct order.

    • | Export-Csv ...: The final, ordered data is piped to Export-Csv.

      • -NoTypeInformation: This removes the #TYPE System.Management.Automation.PSCustomObject header that PowerShell adds by default.

      • -Delimiter ";": This sets the CSV delimiter to a semicolon, which is common for opening files in European versions of Excel.


How to Use

  1. Save the script: Save the code as a .ps1 file (e.g., export.ps1).

  2. Edit the Configuration: Open the file and modify the variables in the --- CONFIGURATION --- block at the top to match your API, credentials, and desired query.

  3. Run the script: Open a PowerShell terminal, navigate to the folder where you saved the script, and run it by typing:

    PowerShell

    .\export.ps1
    
  4. Check the output: The script will print its progress for each step. When it's finished, you will find the exported data in the $OUTPUT_FILE you specified (e.g., api_data_export.csv).