Skip to main content

Command Palette

Search for a command to run...

Automate DAX User-Defined Functions with Semantic Link Labs

Published
10 min read

If you've ever copied the same DAX measure across multiple reports and then kicked yourself when you had to update it in twenty different places, this article is for you.

DAX User Defined Function announced at FabCon was one of the biggest updates to the DAX language in recent years. You can read more about it on official Microsoft docs and SQLBI. In this blog, I share how you can use Semantic Link Labs (SLL) to automate the process of defining and centralizing the UDFs. Note DAX UDF is still in preview so read official documentation for all the details and limitations.

What Are DAX User-Defined Functions Anyway?

A DAX UDF looks like this:

FUNCTION CalculateMargin = (
    revenue : NUMERIC,
    cost : NUMERIC
) => 
    DIVIDE(revenue - cost, revenue)

You define it once in your semantic model, and then you can use CalculateMargin(revenue, cost) anywhere you'd use a regular DAX function. Your measures get cleaner, your logic stays consistent, and when the CFO decides margin should be calculated differently, you change it in one place.

Understanding How Parameters Work

This is important, and I learned it the hard way: DAX UDFs have two different ways of handling parameters, and picking the wrong one will give you incorrect results.

VAL parameters evaluate once before your function runs. Use these for simple scalar values:

FUNCTION AddTax = (amount :VAL NUMERIC) => amount * 1.1

EXPR parameters evaluate within your function's filter context. Use these when you're passing in expressions that need to respect the current row or filter state:

FUNCTION GrowthRate = (currentYear :EXPR, priorYear :EXPR) => 
    DIVIDE([currentYear] - [priorYear], [priorYear])

The difference? If you pass a measure as a VAL parameter, it calculates once using the outer context. As an EXPR parameter, it recalculates for each row context inside the function. This matters a lot when you're working with measures in your calculations.

SLL is a Python library that extends Microsoft's base Semantic Link with automation tools. The library gives you Python functions that let you read and write to your semantic models, including creating and updating UDFs.

First, install it in your Fabric notebook:

%pip install semantic-link-labs --upgrade --quiet

Then import what you need:

import sempy_labs as labs
from sempy_labs.tom import connect_semantic_model

Your First Automated UDF

Let's walk through a real example. Say you want to add a standard tax calculation function to one of your semantic models:

# Define your connection details
dataset = 'Sales_Model'
workspace = None  # Uses your current workspace
function_name = 'AddTax'
function_expression = "(amount : NUMERIC) => amount * 1.1"

# Connect to the model and add the function
with connect_semantic_model(dataset=dataset, readonly=False, workspace=workspace) as tom:
    tom.set_user_defined_function(name=function_name, expression=function_expression)

print("Function created successfully!")

# Verify it worked
functions_df = labs.list_user_defined_functions(dataset=dataset, workspace=workspace)
display(functions_df)

You've just programmatically added a UDF to your semantic model. The connect_semantic_model function opens a connection to your model's Tabular Object Model (TOM), which is essentially the behind-the-scenes structure of your semantic model. The with statement ensures the connection closes properly when you're done.

Deploying to Multiple Models

Here's where things get interesting. Once you have a UDF working in one model, you probably want it in all your models. Let's automate that:

import sempy.fabric as fabric

# Define your standard functions
standard_functions = {
    'CalculateMargin': "(revenue : NUMERIC, cost : NUMERIC) => DIVIDE(revenue - cost, revenue)",
    'ConvertToUSD': "(amount : NUMERIC, rate : NUMERIC) => amount * rate",
    'FormatCurrency': "(value : NUMERIC) => FORMAT(value, '$#,##0')"
}

# Get all semantic models in your workspace
workspace = 'Finance_Analytics'
all_models = fabric.list_datasets(workspace=workspace)

# Loop through each model and add the functions
for model_name in all_models['Dataset Name']:
    print(f"Updating {model_name}...")

    try:
        with connect_semantic_model(dataset=model_name, workspace=workspace, readonly=False) as tom:
            for func_name, func_expr in standard_functions.items():
                tom.set_user_defined_function(name=func_name, expression=func_expr)

        print(f"  ✓ {model_name} updated successfully")

    except Exception as e:
        print(f"  ✗ Failed to update {model_name}: {str(e)}")

print("\nDeployment complete!")

I usually run this in a Fabric notebook, and it takes maybe 30 seconds to update a dozen models. Compare that to opening each model in the service, manually creating functions, and hoping you didn't make any typos.

Building a Central Function Library

Here's the pattern I use in production: Create one "master" semantic model that serves as your function library. Define all your standard business logic there, then use Python to sync it to all your other models.

# Step 1: Extract functions from your master library
master_dataset = 'Corporate_Function_Library'
master_workspace = 'IT_Standards'

master_functions = labs.list_user_defined_functions(
    dataset=master_dataset, 
    workspace=master_workspace
)

print(f"Found {len(master_functions)} functions in the library")
display(master_functions)

# Step 2: Apply them to all models in a target workspace
target_workspace = 'Sales_Department'
target_models = fabric.list_datasets(workspace=target_workspace)

for model_name in target_models['Dataset Name']:
    with connect_semantic_model(dataset=model_name, workspace=target_workspace, readonly=False) as tom:
        for index, row in master_functions.iterrows():
            tom.set_user_defined_function(
                name=row['Function Name'],
                expression=row['Expression']
            )

    print(f"Synced functions to {model_name}")

This approach means your finance team defines their calculations once, and every model automatically gets the same logic. When something changes, you update the master library and re-run your sync script.

Handling Errors

Models might be offline, compatibility levels might be wrong, or someone might have locked a model you're trying to update. Here's how I handle that:

def apply_function_safely(dataset, workspace, function_name, expression):
    """
    Applies a UDF with proper error handling and reporting
    """
    try:
        # First check if the model is at the right compatibility level
        with connect_semantic_model(dataset=dataset, workspace=workspace) as tom:
            compat_level = tom.model.Database.CompatibilityLevel

            if compat_level < 1702:
                return {
                    'dataset': dataset,
                    'status': 'skipped',
                    'reason': f'Compatibility level {compat_level} too low'
                }

        # Now try to apply the function
        with connect_semantic_model(dataset=dataset, readonly=False, workspace=workspace) as tom:
            tom.set_user_defined_function(name=function_name, expression=expression)

        # Verify it actually worked
        check = labs.list_user_defined_functions(dataset=dataset, workspace=workspace)
        if function_name in check['Function Name'].values:
            return {'dataset': dataset, 'status': 'success'}
        else:
            return {'dataset': dataset, 'status': 'failed', 'reason': 'Function not found after creation'}

    except Exception as e:
        return {'dataset': dataset, 'status': 'failed', 'reason': str(e)}

# Use it with multiple models
results = []
for model in my_models:
    result = apply_function_safely(model, workspace, 'CalculateMargin', margin_expression)
    results.append(result)

# Print a summary
success_count = sum(1 for r in results if r['status'] == 'success')
print(f"\nResults: {success_count}/{len(results)} models updated successfully")

for result in results:
    if result['status'] != 'success':
        print(f"  Issue with {result['dataset']}: {result.get('reason', 'Unknown error')}")

This function checks compatibility first, tries to apply the UDF, verifies it worked, and gives you detailed feedback about what went wrong. Trust me, future you will thank past you for this level of error handling.

Setting Up Scheduled Syncs

One of my favorite tricks is scheduling a notebook to run weekly, keeping all my models in sync with the latest function definitions. Fabric makes this easy, you can schedule notebooks just like you'd schedule a data pipeline.

Here's a notebook I run every Monday morning:

from datetime import datetime
import pandas as pd

print(f"=== UDF Sync Job Started: {datetime.now()} ===\n")

# Configuration
MASTER_LIBRARY = 'UDF_Master_Library'
MASTER_WORKSPACE = 'Corporate_Standards'
TARGET_WORKSPACE = 'Sales_Analytics'

# Get the current function definitions
print("Loading master function library...")
master_functions = labs.list_user_defined_functions(
    dataset=MASTER_LIBRARY, 
    workspace=MASTER_WORKSPACE
)
print(f"Found {len(master_functions)} functions to sync\n")

# Get all target models
target_models = fabric.list_datasets(workspace=TARGET_WORKSPACE)
print(f"Found {len(target_models)} models to update\n")

# Sync each model
sync_results = []

for model_name in target_models['Dataset Name']:
    print(f"Processing {model_name}...")

    try:
        with connect_semantic_model(dataset=model_name, workspace=TARGET_WORKSPACE, readonly=False) as tom:
            for _, func in master_functions.iterrows():
                tom.set_user_defined_function(
                    name=func['Function Name'],
                    expression=func['Expression']
                )

        sync_results.append({'Model': model_name, 'Status': 'Success', 'Functions Synced': len(master_functions)})
        print(f"  ✓ Success\n")

    except Exception as e:
        sync_results.append({'Model': model_name, 'Status': 'Failed', 'Error': str(e)})
        print(f"  ✗ Failed: {str(e)}\n")

# Generate summary report
results_df = pd.DataFrame(sync_results)
success_count = len(results_df[results_df['Status'] == 'Success'])

print("=" * 50)
print(f"SYNC COMPLETE: {success_count}/{len(target_models)} models updated successfully")
print("=" * 50)

display(results_df)

Schedule this to run automatically, and you'll never worry about models getting out of sync again.

Example: Finance Department Rollout

Let me walk you through how I recently deployed this at a company with 15 different financial reports, each owned by different teams.

The finance director wanted three standard calculations available everywhere: Gross Margin, YoY Growth, and Budget Variance. Here's what I did:

# Define the three corporate standard functions
corporate_standards = {
    'GrossMargin': """
        (revenue : NUMERIC, cogs : NUMERIC) => 
        DIVIDE(revenue - cogs, revenue)
    """,

    'YoYGrowth': """
        (currentYear :EXPR, priorYear :EXPR) => 
        DIVIDE([currentYear] - [priorYear], [priorYear])
    """,

    'BudgetVariance': """
        (actual :EXPR, budget :EXPR) => 
        [actual] - [budget]
    """
}

# Get all finance workspace models
finance_workspace = 'Finance_Reporting'
finance_models = fabric.list_datasets(workspace=finance_workspace)

print(f"Deploying to {len(finance_models)} finance models...\n")

deployment_log = []

for model_name in finance_models['Dataset Name']:
    print(f"Deploying to: {model_name}")

    with connect_semantic_model(dataset=model_name, workspace=finance_workspace, readonly=False) as tom:
        for func_name, func_expr in corporate_standards.items():
            tom.set_user_defined_function(name=func_name, expression=func_expr)
            print(f"  → Added {func_name}")

    deployment_log.append({
        'Model': model_name,
        'Deployment Time': datetime.now(),
        'Functions Added': list(corporate_standards.keys())
    })

    print(f"  ✓ Complete\n")

# Save deployment log
log_df = pd.DataFrame(deployment_log)
log_df.to_csv('/lakehouse/default/Files/udf_deployment_log.csv', index=False)

print("All models updated! Teams can now use GrossMargin(), YoYGrowth(), and BudgetVariance() in their measures.")

After deployment, I sent an email to the finance teams showing them how to use these functions:

// Old way - calculate margin in every measure
Margin by Product = 
DIVIDE(
    SUM(Sales[Revenue]) - SUM(Sales[COGS]),
    SUM(Sales[Revenue])
)

// New way - use the standard function
Margin by Product = GrossMargin(SUM(Sales[Revenue]), SUM(Sales[COGS]))

The feedback was immediate. Not only did measures become more readable, but when the CFO decided margin should handle negative values differently, I updated one function and re-ran the deployment script. Five minutes later, all 15 reports reflected the new logic.

Testing Functions

Before rolling out a new UDF to production, test it thoroughly.

import sempy.fabric as fabric

# Create test cases
test_cases = [
    {
        'function': 'GrossMargin',
        'test_name': 'Basic margin calculation',
        'dax': 'GrossMargin(1000, 600)',
        'expected': 0.4
    },
    {
        'function': 'GrossMargin',
        'test_name': 'Zero revenue handling',
        'dax': 'GrossMargin(0, 100)',
        'expected': None  # Should return blank
    },
    {
        'function': 'YoYGrowth',
        'test_name': '50% growth',
        'dax': 'YoYGrowth(150, 100)',
        'expected': 0.5
    }
]

test_dataset = 'Function_Test_Model'

print("Running function tests...\n")

for test in test_cases:
    dax_query = f"EVALUATE {{ {test['dax']} }}"

    try:
        result = fabric.evaluate_dax(dax_string=dax_query, dataset=test_dataset)
        actual = result.iloc[0, 0]

        if test['expected'] is None:
            passed = pd.isna(actual)
        else:
            passed = abs(actual - test['expected']) < 0.001

        status = "✓ PASS" if passed else "✗ FAIL"
        print(f"{status}: {test['test_name']}")

        if not passed:
            print(f"  Expected: {test['expected']}, Got: {actual}")

    except Exception as e:
        print(f"✗ ERROR: {test['test_name']}")
        print(f"  {str(e)}")

    print()

Run this before every major deployment. Catching calculation errors in test is much better than explaining to your CFO why Q3 numbers were wrong.

Practices I've Learned

After working with this for several months, here are some things I wish I'd known earlier:

Use clear naming conventions. I prefix all my UDFs with my company abbreviation (ACME_CalculateMargin) so they're immediately recognizable and don't clash with future DAX functions.

Document everything. Add comments in your master library explaining what each function does, what parameters it takes, and any gotchas. Your future self will thank you.

Version your function library. I keep a JSON file with version numbers and change history:

import json

version_info = {
    'version': '2.1.0',
    'release_date': '2024-12-15',
    'changes': [
        'Added GrossMargin function',
        'Updated YoYGrowth to handle negative prior year values',
        'Fixed BudgetVariance decimal precision'
    ]
}

with open('/lakehouse/default/Files/udf_version.json', 'w') as f:
    json.dump(version_info, f, indent=2)

Test with real data. Don't just test with nice round numbers. Throw in nulls, negatives, zeros, and edge cases.

Set up monitoring. I have a monthly notebook that checks which models are using which functions:

# Audit function usage across all models
workspace = 'Finance_Analytics'
all_models = fabric.list_datasets(workspace=workspace)

usage_report = []

for model_name in all_models['Dataset Name']:
    functions = labs.list_user_defined_functions(dataset=model_name, workspace=workspace)

    usage_report.append({
        'Model': model_name,
        'Function Count': len(functions),
        'Functions': ', '.join(functions['Function Name'].tolist())
    })

usage_df = pd.DataFrame(usage_report)
display(usage_df)

This helps me understand adoption and spot models that might have missed updates.

Common Pitfalls to Avoid

Forgetting to upgrade compatibility level. Always check before deploying. I learned this the hard way when I tried to deploy to 30 models and 10 failed because they were at level 1600.

Using VAL when you need EXPR. If your function takes measures as parameters and those measures need to evaluate in row context, use EXPR. This one bit me when I was working with time intelligence.

Not handling errors in batch operations. Always wrap your loops in try-except blocks. One bad model shouldn't crash your entire deployment.

Ignoring performance. UDFs are just DAX expressions. If you write a slow function, it'll be slow everywhere you use it. Profile your functions with DAX Studio before deploying widely.

What About Security?

One thing to keep in mind: Object-level security doesn't automatically apply to UDFs. If you have a secured measure and create a UDF that references it, the UDF itself isn't automatically secured.

My approach is to avoid putting sensitive logic directly in UDFs. Instead, I keep sensitive measures as measures and use UDFs for general-purpose calculations that anyone can use.

In conclusion, automating DAX User-Defined Functions with Semantic Link Labs has fundamentally changed how I manage semantic models. What used to take hours of manual work—opening models, copying measures, fixing typos, hoping I didn't miss anything—now takes minutes with a Python script.

The key insight is this: Your business logic is code, and code should be reusable, version-controlled, and deployed automatically. DAX UDFs make your logic reusable. Semantic Link Labs makes deployment automatic.

Now go automate something!

Documentation

DAX UDF + Semantic Link Labs = 🚀🚀🚀