Discussion:
current-group()[1] within xsl:for-each-group
Kevin Rodgers
2005-03-02 02:15:29 UTC
This is a much harder problem than my previous question :-/

I've got this in a template:

<xsl:for-each-group select="mb3e:document" group-by="mb3e:fam_id">
  <xsl:sort select="mb3e:prim_sort_key"/>
  <xsl:variable name="first-structured-number"
                select="esd:structured-number(current-group()[1])"/>
  <xsl:apply-templates mode="mil" select="current-group()">
    <xsl:with-param name="family-structured-number"
                    select="$first-structured-number"/>
    <xsl:sort select="mb3e:date_list/mb3e:date[@type='PUBL']"
              order="descending"/>
  </xsl:apply-templates>
</xsl:for-each-group>

The idea is to process the mb3e:document elements in groups, where a
group is the set of documents whose mb3e:fam_id subelements have the
same content. Within each group, the documents are processed in reverse
order of their publication date (in YYYY-MM-DD format), in each case
using the result of calling the esd:structured-number function on the
most recent document. (The groups themselves are processed in
mb3e:prim_sort_key order.)

The esd:structured-number function is slow, so I only want to call it
once per group. Actually, the template that is applied within the
xsl:for-each-group instruction has to call it for each mb3e:document
element, but I want to avoid calling it more than once for any of those
elements (including the first, i.e. most recent):

<xsl:variable name="structured-number"
select="if (position() = 1)
then $family-structured-number
else esd:structured-number(.)"/>

I think this works most of the time, but sometimes the
family-structured-number parameter value (i.e. $first-structured-number)
is not the value that would be returned if the esd:structured-number
function were called on the most recent mb3e:document element in the
group. For example, in one case it was the 7th most recent of a group
of 58.

Is there something inherently wrong with the code above? I know from
the output that the mb3e:document elements are grouped and sorted as I
intend. The output also indicates that the position() = 1 test in the
mb3e:document template is working. But for some reason the
current-group()[1] expression within the xsl:for-each-group instruction
isn't always returning the element I expect.

Just in case, here's an abbreviated example showing all the referenced
elements and attributes above:

<document>
  <prim_sort_key>MIL PRF 0000001</prim_sort_key>
  <date_list>
    <date type="PUBL">2003-06-11</date>
    <date type="PUBL_MOD">2003-06-17</date>
    <date type="IHS_MOD">2003-07-22</date>
  </date_list>
  <fam_id>AOUBEAAAAAAAAAAA</fam_id>
</document>

Thanks,
--
Kevin Rodgers


--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-***@lists.mulberrytech.com>
--~--
Michael Kay
2005-03-02 15:25:46 UTC
Post by Kevin Rodgers
<xsl:for-each-group select="mb3e:document"
group-by="mb3e:fam_id">
<xsl:sort select="mb3e:prim_sort_key"/>
<xsl:variable name="first-structured-number"
select="esd:structured-number(current-group()[1])"/>
<xsl:apply-templates mode="mil" select="current-group()">
<xsl:with-param name="family-structured-number"
select="$first-structured-number"/>
<xsl:sort select="mb3e:date_list/mb3e:date[@type='PUBL']"
order="descending"/>
</xsl:apply-templates>
</xsl:for-each-group>
The esd:structured-number function is slow, so I only want to call it
once per group. Actually, the template that is applied within the
xsl:for-each-group instruction has to call it for each mb3e:document
element, but I want to avoid calling it more than once for
any of those
I'm not sure what you mean by "the most recent document". current-group()[1]
selects the first item in the group currently being processed. You haven't
really shown enough of your source data for me to understand what you are
doing.
Post by Kevin Rodgers
<xsl:variable name="structured-number"
select="if (position() = 1)
then $family-structured-number
else esd:structured-number(.)"/>
Where is this variable declared, and where is it used? The reference to
position() makes it highly context-sensitive.
Post by Kevin Rodgers
I think this works most of the time, but sometimes the
family-structured-number parameter value (i.e.
$first-structured-number)
is not the value that would be returned if the esd:structured-number
function were called on the most recent mb3e:document element in the
group. For example, in one case it was the 7th most recent
of a group of 58.
As I say, I don't understand what you mean by "most recent", and I don't see
anything in your code that relates to the concept. You seem to be basing the
calculation on the first item in the current group.

Michael Kay
http://www.saxonica.com/





Kevin Rodgers
2005-03-02 20:15:19 UTC
Post by Michael Kay
Post by Kevin Rodgers
<xsl:for-each-group select="mb3e:document" group-by="mb3e:fam_id">
<xsl:sort select="mb3e:prim_sort_key"/>
<xsl:variable name="first-structured-number"
select="esd:structured-number(current-group()[1])"/>
<xsl:apply-templates mode="mil" select="current-group()">
<xsl:with-param name="family-structured-number"
select="$first-structured-number"/>
<xsl:sort select="mb3e:date_list/mb3e:date[@type='PUBL']"
order="descending"/>
</xsl:apply-templates>
</xsl:for-each-group>
The esd:structured-number function is slow, so I only want to call it
once per group. Actually, the template that is applied within the
xsl:for-each-group instruction has to call it for each mb3e:document
element, but I want to avoid calling it more than once for
I'm not sure what you mean by "the most recent document". current-group()[1]
selects the first item in the group currently being processed.
This is the crux of the matter. My mistake was in thinking that
xsl:sort within xsl:apply-templates would somehow affect the order of
the element nodes within the group -- of course it can't.

Now if the unintended output had come from applying the matching
template to the group's first node (in terms of document order), I would
have immediately realized that. But for some reason the nodes in the
group are not in document order. Why is that?

Here is a more concrete question: Would it make a difference
semantically or performance-wise if I changed this

<xsl:for-each-group select="mb3e:document" group-by="mb3e:fam_id">
to this
<xsl:for-each-group select="mb3e:document" group-by="text(mb3e:fam_id)">

or is that effectively what is done when each node's grouping key
sequence is atomized and the resulting values compared?
Post by Michael Kay
Post by Kevin Rodgers
<xsl:variable name="structured-number"
select="if (position() = 1)
then $family-structured-number
else esd:structured-number(.)"/>
Where is this variable declared, and where is it used? The reference to
position() makes it highly context-sensitive.
As I tried to express, it is declared within the xsl:template that is
applied above (within xsl:for-each-group). position() seems to work as
I assumed, returning the matched node's position within the group.

But again I wonder: Would it make a difference if I changed . (dot) to
current():

<xsl:variable name="structured-number"
select="if (position() = 1)
then $family-structured-number
else esd:structured-number(current())"/>

Thanks for all your help!
--
Kevin Rodgers


Michael Kay
2005-03-02 20:45:18 UTC
The nodes within each group should be in "population order", that is, the
order of the original sequence, which in this case is document order.

I notice that you are doing something a little unusual: you are sorting the
groups using something other than the grouping key. The sort key (for
sorting the groups) mb3e:prim_sort_key is evaluated against the first item
in each group - if its value differs from one member of the group to another
this could be quite confusing.

The xsl:sort within the apply-templates should affect the order in which the
items within each group are processed, but it doesn't affect the result of
current-group() - at least, it shouldn't!
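
One way to make the variable reflect the most recent document is to sort the group once, up front, and take the first item of the sorted sequence. The following is only a sketch: it assumes the same namespace declarations and the same esd:structured-number function as the original template, and the variable name sorted-group is illustrative rather than taken from the original post.

```xml
<xsl:for-each-group select="mb3e:document" group-by="mb3e:fam_id">
  <xsl:sort select="mb3e:prim_sort_key"/>

  <!-- Sort the group once. The as attribute keeps the variable a sequence
       of the original nodes instead of copying them into a temporary tree.
       "sorted-group" is a hypothetical name. -->
  <xsl:variable name="sorted-group" as="element()*">
    <xsl:perform-sort select="current-group()">
      <xsl:sort select="mb3e:date_list/mb3e:date[@type='PUBL']"
                order="descending"/>
    </xsl:perform-sort>
  </xsl:variable>

  <!-- $sorted-group[1] is now the most recently published document,
       whereas current-group()[1] is the first in population order. -->
  <xsl:variable name="first-structured-number"
                select="esd:structured-number($sorted-group[1])"/>

  <!-- No xsl:sort here: the nodes are processed in the order of the
       already sorted sequence, so position() = 1 is the same document. -->
  <xsl:apply-templates mode="mil" select="$sorted-group">
    <xsl:with-param name="family-structured-number"
                    select="$first-structured-number"/>
  </xsl:apply-templates>
</xsl:for-each-group>
```

With this arrangement the position() = 1 test in the applied template and the [1] predicate refer to the same node, because both are evaluated against the same sorted sequence.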

Michael Kay
http://www.saxonica.com/
Post by Kevin Rodgers
Now if the unintended output had come from applying the matching
template to the group's first node (in terms of document
order), I would
have immediately realized that. But for some reason the nodes in the
group are not in document order. Why is that?
As I say, I think they should be in document order as far as current-group()
is concerned; but not processed in document order, because of the xsl:sort
within apply-templates.
Post by Kevin Rodgers
Here is a more concrete question: Would it make a difference
semantically or performance-wise if I changed this
<xsl:for-each-group select="mb3e:document"
group-by="mb3e:fam_id">
to this
<xsl:for-each-group select="mb3e:document"
group-by="text(mb3e:fam_id)">
There's no text() function - you probably meant string() or data() - but
either way, you're only doing explicitly what the system is doing anyway.
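
As a sketch of that equivalence, assuming a non-schema-aware processor (so that mb3e:fam_id atomizes to an untyped value compared as a string) and exactly one mb3e:fam_id per document, the following three forms should produce the same groups. The empty apply-templates bodies are placeholders.

```xml
<!-- Implicit atomization of the selected element -->
<xsl:for-each-group select="mb3e:document" group-by="mb3e:fam_id">
  <xsl:apply-templates mode="mil" select="current-group()"/>
</xsl:for-each-group>

<!-- Explicit atomization with data() -->
<xsl:for-each-group select="mb3e:document" group-by="data(mb3e:fam_id)">
  <xsl:apply-templates mode="mil" select="current-group()"/>
</xsl:for-each-group>

<!-- Explicit string value with string() -->
<xsl:for-each-group select="mb3e:document" group-by="string(mb3e:fam_id)">
  <xsl:apply-templates mode="mil" select="current-group()"/>
</xsl:for-each-group>
```

The forms diverge only in edge cases: string() of an empty sequence is the zero-length string, so a document with no mb3e:fam_id would land in a group of its own kind instead of being left out, and string() applied to more than one mb3e:fam_id is a type error.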
Post by Kevin Rodgers
or is that effectively what is done when each node's grouping key
sequence is atomized and the resulting values compared?
Post by Michael Kay
Post by Kevin Rodgers
<xsl:variable name="structured-number"
select="if (position() = 1)
then $family-structured-number
else esd:structured-number(.)"/>
Where is this variable declared, and where is it used? The reference to
position() makes it highly context-sensitive.
As I tried to express, it is declared within the xsl:template that is
applied above (within xsl:for-each-group). position() seems to work as
I assumed, returning the matched node's position within the group.
OK, I understand now. It should return the position in the actual order of
processing, that is, the sorted order.
Post by Kevin Rodgers
But again I wonder: Would it make a difference if I changed . (dot) to
current():
<xsl:variable name="structured-number"
select="if (position() = 1)
then $family-structured-number
else esd:structured-number(current())"/>
No, in this context . and current() are synonyms.

Michael Kay
http://www.saxonica.com/



Kevin Rodgers
2005-03-02 23:30:14 UTC
Post by Michael Kay
The nodes within each group should be in "population order", that is,
the order of the original sequence, which in this case is document
order.
Hmmm, I'm pretty sure that wasn't holding true. But it was only for a
fraction of a very large input document. I've fixed the bug that
resulted from my fundamental misunderstanding, and I'm trying to
complete a project, so I'm not inclined to report a possible bug to the
implementor right now.
Post by Michael Kay
I notice that you are doing something a little unusual, you are
sorting the groups using something other than the grouping key. The
sort key (for sorting the groups) mb3e:prim_sort_key is evaluated
against the first item in each group - if its value differs from one
member of the group to another this could be quite confusing.
Right. The sort key (mb3e:prim_sort_key) is constant for the examples
I've seen, and I understand that should be true across the board. But
that key is not the definitive grouping key (mb3e:fam_id), which is just
a machine-generated identifier not useful for sorting.
Post by Michael Kay
The xsl:sort within the apply-templates should affect the order in
which the items within each group are processed, but it doesn't affect
the result of current-group() - at least, it shouldn't!
At least in Saxon 8.3, it's clearly conformant! :-)
Post by Michael Kay
Post by Kevin Rodgers
Now if the unintended output had come from applying the matching
template to the group's first node (in terms of document order), I
would have immediately realized that. But for some reason the nodes
in the group are not in document order. Why is that?
As I say, I think they should be in document order as far as
current-group() is concerned; but not processed in document order,
because of the xsl:sort within apply-templates.
Things that make you go "Hmmm"...
Post by Michael Kay
Post by Kevin Rodgers
Here is a more concrete question: Would it make a difference
semantically or performance-wise if I changed this
<xsl:for-each-group select="mb3e:document" group-by="mb3e:fam_id">
to this
<xsl:for-each-group select="mb3e:document" group-by="text(mb3e:fam_id)">
or is that effectively what is done when each node's grouping key
sequence is atomized and the resulting values compared?
There's no text() function - you probably meant string() or data() - but
either way, you're only doing explicitly what the system is doing anyway.
Good, it's nice to guess right once in a while.

...
Post by Michael Kay
Post by Kevin Rodgers
But again I wonder: Would it make a difference if I changed . (dot) to
current():
<xsl:variable name="structured-number"
select="if (position() = 1)
then $family-structured-number
else esd:structured-number(current())"/>
No, in this context . and current() are synonyms.
Good. I just wasn't sure because the XSLT 2.0 spec says

The current function, used within an XPath expression, returns
the item that was the context item at the point where the
expression was invoked from the XSLT stylesheet. This is
referred to as the current item. For an outermost expression (an
expression not occurring within another expression), the current
item is always the same as the context item. Thus,

<xsl:value-of select="current()"/>

means the same as

<xsl:value-of select="."/>

However, within square brackets, or on the right-hand side of
the / operator, the current item is generally different from the
context item.

and even though . is not "within square brackets, or on the right-hand
side of the / operator", it is within if ... then ... else ... which I
thought might qualify as "within another expression".
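
To illustrate the distinction that passage draws, here is a small sketch. The variable names and the ../mb3e:document path are illustrative assumptions, not part of the stylesheet under discussion. An if/then/else does not change the context item, so . and current() agree in both branches, whereas inside a predicate . is rebound to each node being tested.

```xml
<xsl:template match="mb3e:document" mode="mil">
  <!-- if/then/else leaves the context item alone: both branches see the
       matched mb3e:document, so . and current() are interchangeable. -->
  <xsl:variable name="self"
                select="if (position() = 1) then . else current()"/>

  <!-- Inside the predicate, . is each sibling mb3e:document being tested,
       while current() is still the mb3e:document matched by this template. -->
  <xsl:variable name="same-family"
                select="../mb3e:document[mb3e:fam_id = current()/mb3e:fam_id]"/>
</xsl:template>
```

Writing ./mb3e:fam_id in place of current()/mb3e:fam_id in the second variable would compare each sibling with itself and select every sibling that has an mb3e:fam_id.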

Thanks again for all your help! I definitely owe you a beverage of your
choice if I ever get the opportunity.
--
Kevin

