Where your calculated field is executed matters

By | June 9, 2014

I had an interesting question on one of my older blog posts Filtering results to only show top x results in Power View chart last week:

Your logic above makes perfect sense, but I can’t crack it for some reason. This is making me Sad!!!! There is obviously something missing in my logic. I have created a measure as follows

=RANKX(ALL(Table2[City]), SUM(Table2[Sales]))

The measure is called Ranking. However when I put this on my powerview chart the only value is 1. I want to sum the data by city and rank by the cities.
This is my data

Country City Sales
Germany Munich 1
Germany T 3
Germany D 2
Germany Munich 7
Germany A 8
Germany Munich 6
Germany P 9

This is actually a topic that I spend quite some time in one of my recent DAX session that I did at the PASS BA conference and TechEd. You can watch the session online right here: Solving Complex Business Problems with DAX

Lets take a closer look at this problem, Here I created the table into Excel and loaded it into the model:

image

I then created the measures as mentioned in the comment above:

=RANKX(ALL(Table1[City]), SUM(Table1[Sales]))

Now adding city and the measure in the following PivotTable gives us the following results:

image

So what is going on here? Why do we get 1 for all rows? Lets start by examining the documentation of the RANKX function: “Returns the ranking of a number in a list of numbers for each row in the table argument.” If we then look at the the three arguments (the third argument is optional, when left empty the second argument will be repeated, as is the case with the measure used for our example):

table

Any DAX expression that returns a table of data over which the expression is evaluated.

expression

Any DAX expression that returns a single scalar value. The expression is evaluated for each row of table, to generate all possible values for ranking. See the remarks section to understand the function behavior when expression evaluates to BLANK.

value

(Optional) Any DAX expression that returns a single scalar value whose rank is to be found. See the remarks section to understand the function’s behavior when value is not found in the expression.

When the value parameter is omitted, the value of expression at the current row is used instead.

Now what does that mean? Lets look at what this ranking measure does in the Pivot Table:

image

Now lets take a look at what is being evaluated for the red highlighted cell inside the engine. The RANKX will compare the measure in the RANKX parameter for each row on the table provided in the first argument to the current cell. What happens is that the measure “SUM(Table1[Sales])” gets executed twice, once to get the value for this measure in the current context (the highlighted cell above) and once for every row in the table supplied with the first argument. This is the part where you start getting unexpected results.

If you visualize what is being calculated here you get the following table, a row for each city in the table and its comparing rank:

image

As you will see is that each row returns the same results, it gives us values for the “SUM(Table1[Sales])” filtered by the values for the current cell (the outside context) instead of calculating the “SUM(Table1[Sales])” for each row of the table. What is going on here?

What is happening here is actually very similar to creating a calculated column and putting the same measure in that column:

image

Again what happens is that “SUM(Table1[Sales])” doesn’t reflect the filters for each row but only the filter from the outside context, in case of a calculated column there is no outside context as there is no Pivot Table.

If we want the row context to be applied to the calculation we need to change the measure so the row context will be promoted to filter context, to do this we need to simply wrap it around with a CALCULATE. Instead of writing:

=RANKX(ALL(Table1[City]), SUM(Table1[Sales]))

you need to write:

=RANKX(ALL(Table1[City]), CALCULATE(SUM(Table1[Sales])))

CALCULATE will do two things:

1 Evaluate the expression with the filters you applied as the arguments of CALCULATE

2 If an expression is calculated in a certain row context, it automatically adds that row context as filter

The second part is what helps us in this case as it will filter the measure with the city for the current row. Let’s take a look at what is going on with the new measure for the highlighted red cell of the Pivot Table above:

image

It might be easier to understand when thinking about the measure written this way (this is not really what happens). For the first row this is what happens, notice the filter set to City = “A”:

=RANKX(ALL(Table1[City]), CALCULATE(SUM(Table1[Sales]),[Cities]=”A”), CALCULATE(SUM(Table1[Sales])))

Now putting this measure on the Pivot Table you see we get the right results:

image

The same happens when we use this measure in the calculated column:

image

As you might notice this is quite a long and complex blog post for such a simple measure but is quite interesting to see what is going on here.  Most people will actually never encounter this problem because of a small “hack” the DAX formula engine does when you use a measure instead of a formula. When using a measure the formula engine automatically wraps a CALCULATE around the measure. So instead of:

=RANKX(ALL(Table1[City]), [Sum of Sales]))

we actually do:

=RANKX(ALL(Table1[City]), CALCULATE([Sum of Sales])))

This was you will never see the same behavior when you use a measure instead of writing the actual calculation.

Hope this gave you some insights in the inner workings of DAX.

  • Kasper, this was a very interesting and insightful article. However, I think the explanation for what gets evaluated when the Ranking measure returns 1 for all values is not completely accurate. The illustration shows the [Expression for each row] and [Sales for context] both having a value of 8 for the row with city A. I believe the value is really 36 for both for all cities, as shown in your calculated column. Since RANKX is an iterator, when SUM(Table1[Sales]) gets evaluated it is only subject to row context, but not filter context. But when you SUM a raw column in a table, as in SUM(Table1[Sales]), it ignores row context and instead sums that column for all rows in the table — in this case yielding 36. Then, as you mentioned, wrapping it in a CALCULATE converts the row context to a filter context so the 36 becomes an 8 for City A, and the rest of the cities evaluate to their specific sums, yielding a correct ranking.

    • Kasper de Jonge

      it is yielding 8 as the context for the cell that you are in is the filter that is being applied.

  • The only context in play is row context which gets ignored by the SUM function.
    Try the following DAX query on the model.
    EVALUATE
    ADDCOLUMNS (
    ALL( Table1[City] ),
    “SUM Function”, SUM( Table1[Sales] )
    )

    It returns 36 for all rows.

    In fact, even if you filter to only city A you still get 36 from SUM(Table1[Sales]).
    EVALUATE
    ADDCOLUMNS (
    FILTER( ALL( Table1[City] ),
    Table1[City] = “A”
    ),
    “SUM Function”, SUM( Table1[Sales] )
    )

  • Kasper de Jonge

    Reuven try this measure:
    =RANKX(ALL(Table1[City]), SUM(Table1[Sales]),4)
    This will show you that it will keep the outside filter from the PivotTable when calculating the measure.

    Also the queries you show are not representative of the problem and is not the same as what is happening in the Pivot Table, it acts like it is a calculated column, The filter you have in your query doesn’t get reflected in the measure, it only filters the table but does not get propagated to the measure, again this example is more like a calc. column

  • OK. Your measure with 4 as the third parameter clearly shows that I am wrong and you are right. Thanks for clarifying what is really going on. This discussion was quite helpful in understanding how RANKX operates under the hood.

    • Kasper de Jonge

      It took me some time to understand this as well, I am fortunate that I can go and check with the developers :). Thanks for the great questions.

  • This is why I like to build iteratively and hide intermediate measures.