PowerShell script example

Prev Next

In this example, we propose a simple script that takes the following arguments:

  • Instance URL

  • credentials,

  • Output CSV filename

  • and query ID.

The script extracts data from an existing query and saves the results to a CSV file.

It is only provided as an example. It must be adapted for production purposes.

Code example

# POWERSHELL SCRIPT: API Export to CSV (Handling Pagination)

# --- CONFIGURATION ---
$BASE_URL = "your_url" #test.zetaly.local:8443
$USERNAME = "your_username"
$PASSWORD = "your_password"
$QUERY_NAME = "your_queryname"

$OUTPUT_FILE = "api_data_export.csv"
$PAGE_SIZE = 5000 # The size of the data packets (pages)

# URL Constants
$LOGIN_URL = "https://$($BASE_URL)/zhb/api/v1/users/public/login"
$QUERIES_LIST_URL = "https://$($BASE_URL)/zde/api/v2/queries" # Nouvelle URL pour lister les requêtes

# Standard Headers
$HEADERS = @{
    "Content-Type" = "application/json"
}

Write-Host "======================================================="
Write-Host "  PowerShell Script: CSV Export with Pagination"
Write-Host "======================================================="

# --- Step 1: Authentication and Token Retrieval ---
Write-Host "`n--- Step 1: Retrieving Access Token ---"
$LoginBody = @{ username = $USERNAME; password = $PASSWORD } | ConvertTo-Json
try {
    $LoginResponse = Invoke-RestMethod -Uri $LOGIN_URL -Method Post -Headers $HEADERS -Body $LoginBody
    $ACCESS_TOKEN = $LoginResponse.token
    if ([string]::IsNullOrEmpty($ACCESS_TOKEN)) { Write-Error "❌ ERROR: Failed to extract the token."; exit 1 }
} catch {
    Write-Error "ERROR during authentication: $($_.Exception.Message)"; exit 1
}
$HEADERS["token"] = "$ACCESS_TOKEN"
Write-Host "Access token retrieved."
# -----------------------------------------------------------------


# --- Step 2: Retrieve QUERY_ID from the list of queries (using $QUERY_NAME) ---
Write-Host "`n--- Step 2: Searching for QUERY_ID for name '$QUERY_NAME' ---"
$QUERY_ID = $null

try {
    # 1. Retrieve the list of all available queries
    $QueriesListResponse = Invoke-RestMethod -Uri $QUERIES_LIST_URL -Method Get -Headers $HEADERS

    # 2. Filter the array of objects on the 'label' field
    $MatchingQuery = $QueriesListResponse | Where-Object { $_.label -eq $QUERY_NAME } | Select-Object -First 1

    if (-not $MatchingQuery) {
        Write-Error "ERROR: No query found with the label '$QUERY_NAME'. Check the name in the API."
        exit 1
    }
    
    # 3. Extract the ID from the matching result
    $QUERY_ID = $MatchingQuery.id

    if ([string]::IsNullOrEmpty($QUERY_ID)) {
        Write-Error "ERROR: The query ID is empty for the label '$QUERY_NAME'."
        exit 1
    }
} catch {
    Write-Error "ERROR retrieving the list of queries: $($_.Exception.Message)"
    exit 1
}

Write-Host "QUERY_ID ($QUERY_ID) found for label '$QUERY_NAME'."
# -----------------------------------------------------------------


$QUERY_INFO_URL = "https://$($BASE_URL)/zde/api/v2/queries/$($QUERY_ID)"

# Construct dependent URLs
#$FIELDS_URL = "https://$($BASE_URL)/zde/api/v2/datasets/$($DATASET_ID)/related/$($DATASET_ID)/fields"
$FIELDS_URL = "https://$($BASE_URL)/zde/api/v2/queries/$($QUERY_ID)"
# MODIFIED: Creating a URL "Template" for pagination
$DATA_URL_TEMPLATE = "https://$($BASE_URL)/zde/api/v2/queries/$($QUERY_ID)/data?size=$($PAGE_SIZE)&format=OBJECT"
# -----------------------------------------------------------------

# --- Step 3: Retrieving Field Names (Schema) ---
Write-Host "`n--- Step 4: Retrieving Field Schema ---"
try {
    $FieldsResponse = Invoke-RestMethod -Uri $FIELDS_URL -Method Get -Headers $HEADERS
    $FIELD_NAMES_ARRAY = $FieldsResponse | Select-Object -ExpandProperty fields 
    if ($FIELD_NAMES_ARRAY.Count -eq 0) { Write-Error "❌ ERROR: Failed to extract field names."; exit 1 }
} catch { Write-Error "❌ ERROR retrieving fields: $($_.Exception.Message)"; exit 1 }
Write-Host "Fields extracted: $($FIELD_NAMES_ARRAY -join ', ' | Select-Object -First 50)..."
# -----------------------------------------------------------------

# --- Step 4: Data Retrieval (PAGINATED) and CSV EXPORT ---
Write-Host "`n--- Step 5: Data Retrieval (PAGINATED) and CSV Export ---"

# Use an ArrayList to efficiently collect data from all pages
$AllData = [System.Collections.ArrayList]::new()
$Page = 0
$TotalItems = 1 # Fictional initial value to start the loop
$ItemsReceived = 0

try {
    # Loop as long as we haven't received all items
    while ($ItemsReceived -lt $TotalItems) {
        
        $CurrentDataUrl = "$($DATA_URL_TEMPLATE)&page=$($Page)"
		
        Write-Host "Retrieving page $Page..." -ForegroundColor Cyan
		Write-Host $CurrentDataUrl

        # Use Invoke-WebRequest to access response headers
        $Response = Invoke-WebRequest -UseBasicParsing -Uri $CurrentDataUrl -Headers $HEADERS -Method Get

        # Convert the JSON content (which is a string) into PowerShell objects
        $PageData = $Response.Content | ConvertFrom-Json
        
        if (-not $PageData -or $PageData.Count -eq 0) {
            Write-Host "  ... no more data found (empty response). Ending loop."
            break # Exit the while loop
        }

        # Add the page's data to the total list
        $AllData.AddRange($PageData)
        $ItemsReceived += $PageData.Count

        # On the first request (Page=1), read the 'content-range' header to determine the total
        if ($Page -eq 0) {
            $ContentRangeHeader = $Response.Headers["content-range"]
            
            # Use regex to extract the total (e.g., 1000-1999/103761)
            if ($null -ne $ContentRangeHeader -and $ContentRangeHeader -match "(\d+)-(\d+)/(\d+)") {
                $TotalItems = [int]$matches[3]
                Write-Host "  ... total items to retrieve: $TotalItems"
            } else {
                Write-Warning "'content-range' header not found or in unknown format ('$ContentRangeHeader')."
                Write-Warning "Assuming a single page of data."
                $TotalItems = $ItemsReceived # Forces loop exit after this page
            }
        }
        
        Write-Host "  ... $($PageData.Count) items received. Cumulative total: $ItemsReceived / $TotalItems"
        $Page++ # Increment for the next page
    }

    # --- Final Export (after the loop) ---
    Write-Host "`nExporting $($AllData.Count) total items to $OUTPUT_FILE..."
    if ($AllData.Count -gt 0) {
        # Export the complete collection at once
        $AllData | Select-Object -Property $FIELD_NAMES_ARRAY | Export-Csv -Path $OUTPUT_FILE -NoTypeInformation -Delimiter ";"
        
        Write-Host "Data successfully exported to file: $OUTPUT_FILE"
    } else {
        Write-Warning "No data was retrieved in total."
    }

} catch { 
    # This error will trigger if a page fails (e.g., page 5 of 100)
    Write-Error "ERROR while retrieving data (Page $Page): $($_.Exception.Message)" 
    exit 1 
}

Write-Host "`n======================================================="
Write-Host "  Script finished."

Documentation

Here is a complete documentation for this PowerShell script.

This documentation explains what each part of the script does, why it's built that way, and how to use it.


Script Documentation: API Export to CSV (Handling Pagination)

This script automates the process of exporting data from a Zetaly API query to a local CSV file.

Its main features are:

  • Secure Authentication: It first logs in to get a temporary access token.

  • Dynamic ID Lookup: You provide a human-readable $QUERY_NAME, and the script automatically finds the required QUERY_ID and DATASET_ID from the API.

  • Schema-Aware Export: It fetches the list of field (column) names from the API to build a clean and correctly ordered CSV header.

  • Pagination: It's designed to handle very large datasets by downloading data in "pages" (chunks of 5000 rows at a time) and combining them before saving.


Configuration Block (--- CONFIGURATION ---)

This section at the top of the script contains all the variables you need to change to match your environment.

  • $BASE_URL: The root address of your Zetaly API instance (e.g., my-instance.zetaly.com).

  • $USERNAME: The username for the API account.

  • $PASSWORD: The password for the API account.

  • $QUERY_NAME: The display name (or "label") of the query you want to export. The script will use this name to find the query's internal ID.

  • $OUTPUT_FILE: The name and path for the final CSV file (e.g., C:\exports\my_data.csv).

  • $PAGE_SIZE: The number of rows to fetch in a single API call. 5000 is a good default.

Script Breakdown

Here is a step-by-step explanation of the script's logic.

Step 1: Authentication and Token Retrieval

  • Goal: Log in to the API and get an access token.

  • How it works:

    1. $LoginBody = ... | ConvertTo-Json: It creates a PowerShell object with the $USERNAME and $PASSWORD and converts it into the JSON format the API expects.

    2. Invoke-RestMethod: This command sends the JSON body to the $LOGIN_URL (an HTTP POST request).

    3. $ACCESS_TOKEN = $LoginResponse.token: It assumes the API returns a JSON object with a token property and saves its value.

    4. $HEADERS["token"] = "$ACCESS_TOKEN": It adds the new token to the $HEADERS variable. All future API calls will now be authenticated.

  • Error Handling: A try...catch block wraps the API call. If the login fails (wrong password, server down), the script will stop and print an error.

Step 2: Retrieve QUERY_ID

  • Goal: Find the internal QUERY_ID (like a GUID) that corresponds to the human-readable $QUERY_NAME you provided.

  • How it works:

    1. Invoke-RestMethod -Method Get: It calls the $QUERIES_LIST_URL to get a list of all available queries.

    2. $MatchingQuery = ... | Where-Object { $_.label -eq $QUERY_NAME }: This is the key part. It pipes the list of queries into the Where-Object cmdlet.

      • $_ is a special variable meaning "the current item being processed."

      • It filters the list to find the one query where the label property exactly matches your $QUERY_NAME.

    3. | Select-Object -First 1: This takes the first match (just in case of duplicates) to be safe.

    4. $QUERY_ID = $MatchingQuery.id: It extracts the id property from the query object it found.

Step 3: Retrieving Field Names (Schema)

  • Goal: Get the list of all column names in the dataset. This is essential for creating a correct CSV header.

  • How it works:

    1. Invoke-RestMethod: It calls the $FIELDS_URL to get a list of field objects.

    2. $FIELD_NAMES_ARRAY = ... | Select-Object -ExpandProperty name: This command is very useful. Instead of a list of complex objects (each with name, type, id, etc.), -ExpandProperty name plucks only the name value from each object and creates a simple array of strings (e.g., ["column_a", "column_b", "column_c"]).

Step 4: Data Retrieval (PAGINATED) and CSV EXPORT

This is the most complex and important step.

  • Goal: Download all data, page by page, and then export the complete collection to a single CSV file.

  • Initialization:

    • $AllData = [System.Collections.ArrayList]::new(): It creates an ArrayList. This is a special type of list that is much more efficient for adding thousands of items in a loop compared to a standard PowerShell array (@()).

    • $Page = 0: Sets the starting page index to 0.

    • $TotalItems, $ItemsReceived: These are counters to control the loop.

  • The while Loop:

    • The loop continues as long as $ItemsReceived is less than $TotalItems.

    • Invoke-WebRequest: Note the switch! This step uses Invoke-WebRequest instead of Invoke-RestMethod. This is done on purpose because Invoke-WebRequest gives full access to the response headers, which is necessary for pagination.

    • $PageData = $Response.Content | ConvertFrom-Json: It takes the raw JSON text from the response (.Content) and converts it into PowerShell objects.

    • $AllData.AddRange($PageData): It adds the entire array of data from the current page to the main $AllData list.

  • Pagination Logic:

    • if ($Page -eq 0): This block only runs one time, on the very first request.

    • $ContentRangeHeader = $Response.Headers["content-range"]: It reads the content-range header from the API's response. This header typically looks like items 0-4999/87500.

    • if ($ContentRangeHeader -match "(\d+)-(\d+)/(\d+)"): It uses a regular expression (regex) to parse this header.

    • $TotalItems = [int]$matches[3]: It extracts the third matched group ($matches[3]), which is the total number of items (87500 in the example), and saves it.

    • This is how the script learns the total size of the dataset and knows when to stop looping.

  • Loop Control:

    • $Page++: After processing a page, it increments the page counter to ask for the next one (page 1, page 2, etc.).

    • if (-not $PageData): If the API returns an empty page (meaning no more data), the break command exits the loop early.

  • Final Export:

    • After the while loop finishes, $AllData contains all the data from all pages.

    • $AllData | Select-Object -Property $FIELD_NAMES_ARRAY: This command is crucial. It ensures that only the properties listed in $FIELD_NAMES_ARRAY (from Step 4) are selected, and that they are in the correct order.

    • | Export-Csv ...: The final, ordered data is piped to Export-Csv.

      • -NoTypeInformation: This removes the #TYPE System.Management.Automation.PSCustomObject header that PowerShell adds by default.

      • -Delimiter ";": This sets the CSV delimiter to a semicolon, which is common for opening files in European versions of Excel.


How to Use

  1. Save the script: Save the code as a .ps1 file (e.g., export.ps1).

  2. Edit the Configuration: Open the file and modify the variables in the --- CONFIGURATION --- block at the top to match your API, credentials, and desired query.

  3. Run the script: Open a PowerShell terminal, navigate to the folder where you saved the script, and run it by typing:

    PowerShell

    .\export.ps1
    
  4. Check the output: The script will print its progress for each step. When it's finished, you will find the exported data in the $OUTPUT_FILE you specified (e.g., api_data_export.csv).

We are pleased to announce the release of Zetaly Suite version 5.1. This major update focuses on improving data ingestion, facilitating data exploration, and strengthening performance to manage even higher volumes, particularly for mainframe data (SMF).