Javascript XPath

Latest revision as of 13:12, 12 September 2024

General

An XPath describes a path in an XML element.
XPath items (steps) are nodes in the XML tree.
Each item selects one or more XML nodes (element, attribute, text), starting from the result of the previous item.
An absolute path starts with a slash (/). Its first item selector applies to the root element.
A relative path (not starting with a slash) behaves as if the XML root element was already selected. Its first item selector applies to it.

String constants (used with operators or in function parameters) MUST be enclosed in ' or " characters, e.g. 'some_string' or "some_string".
Delimiters MUST be escaped when present inside the string value (e.g. some'_''string):

String literal (used with operators). Delimiter characters MUST be repeated: 'some''_''''string'
Function parameters. Delimiter MUST be XML escaped: 'some&apos;_&apos;&apos;string'
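
The delimiter doubling rule for string literals can be sketched in plain JavaScript. The helper name quoteLiteral is hypothetical and only illustrates the rule; when building real paths, XPath.escapeString (see Static Methods) is the documented way to do this.

```javascript
// Hypothetical helper, not part of the XPath API: it only illustrates
// the "repeat the delimiter" rule for string literals.
function quoteLiteral(value, quot) {
    if (quot !== "'" && quot !== '"')
        quot = '"';
    // Double every occurrence of the delimiter inside the value
    var escaped = String(value).split(quot).join(quot + quot);
    return quot + escaped + quot;
}

var lit = quoteLiteral("some'_''string", "'");
// lit: 'some''_''''string'
```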

Node selector

Tag. E.g. book. Reserved char * matches any element tag name
Attribute (starting with '@'). E.g. @type
Text: text()
Child text: child::text()

Predicates

Node selector can be followed by one or more predicates enclosed in square brackets: [].
Predicates add more filtering criteria.
Maximum number of predicates is 10. Parse will fail if the number of predicates exceeds this number.
Predicate type:

Index. Numeric, decimal digits

1-based index of the node in its parent. Allowed interval: [1..0xffffffff] (4 bytes unsigned integer excluding 0)

Indexes are counted per selector type: attribute indexes and child element indexes do not overlap

Keyword last() - Identifies the last node (by type: element, attribute, text) in parent
Comparison: VARIABLE<OPERATOR>VALUE

Operators:

* = Equality operator. VARIABLE is the same as VALUE

* != Inequality operator. VARIABLE is not the same as VALUE

VALUE: string literal

VARIABLEs. A VARIABLE defines the contents to be compared with VALUE

* @attr_name. Use an attribute named attr_name of the current node. Fails if there is no attribute with this name

* text(). Use current node XML text. Fails if there is no text

* Else. VARIABLE is handled as an XML tag. Check for an XML child element with the given tag. Use its XML text. Fails if there is no such child or there is no text in it

Variable presence

* @attr_name. Matches if current node has an attribute named attr_name

* text(). Matches if current node has a non-empty XML text

* Else. Value is handled as an XML tag. Matches if current node has a child element with the given tag

Functions (match if true is returned)

* matches(VARIABLE,regexp[,flags]), notMatches(VARIABLE,regexp[,flags]). Regular expression (NOT) match

Matches if current node has requested VARIABLE and function returns true

Parameters:

VARIABLE: Value to check. See VARIABLEs in comparison description

regexp: String constant describing the regular expression. Delimiter MUST be XML escaped

flags: String constant with regular expression flags: i(case insensitive) b(Basic POSIX regular expression). Delimiter MUST be XML escaped
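
The limits described above (index range, maximum number of predicates) can be checked before a path string is assembled. The following is a minimal sketch in plain JavaScript; indexPredicate and buildStep are hypothetical helpers, not part of the XPath API, and the parser remains the authority on what is accepted.

```javascript
// Hypothetical helpers, not part of the XPath API.
var MAX_PREDICATES = 10;       // maximum number of predicates stated above
var MAX_INDEX = 0xffffffff;    // largest allowed index

// Return an index predicate such as "[2]", or null if out of range
function indexPredicate(n) {
    if (!Number.isInteger(n) || n < 1 || n > MAX_INDEX)
        return null;
    return "[" + n + "]";
}

// Append predicates to a node selector.
// Returns null if there are too many predicates or one of them is not a string.
function buildStep(selector, predicates) {
    if (predicates.length > MAX_PREDICATES)
        return null;
    for (var p of predicates) {
        if (typeof p !== "string")
            return null;
    }
    return selector + predicates.join("");
}

var step = buildStep("book", [ "[@category='generic']", "[year='2000']" ]);
// step: book[@category='generic'][year='2000']
var second = buildStep("author", [ indexPredicate(2) ]);
// second: author[2]
```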

NOTE: For performance reasons it is recommended to use compiled XPath objects (already parsed).
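
One way to follow this recommendation is to parse each path string once and reuse the resulting object. The sketch below is an illustration only: makePathCache is a hypothetical helper, and a stand-in compile function is used so that the code runs outside the real environment. With the real class the compile function would be something like function(s) { return new XPath(s); }.

```javascript
// Hypothetical cache, not part of the XPath API.
// 'compile' is any function turning a path string into a parsed path object.
function makePathCache(compile) {
    var cache = {};
    return function(str) {
        // Parse only the first time a given string is requested
        if (!Object.prototype.hasOwnProperty.call(cache, str))
            cache[str] = compile(str);
        return cache[str];
    };
}

// Stand-in compiler that counts how many times a string is parsed
var parseCount = 0;
var getPath = makePathCache(function(s) {
    parseCount++;
    return { description: s };
});
getPath("book/author");
getPath("book/author");
// parseCount: 1 (the second call reuses the already parsed object)
```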

Constructor

new XPath(str[,flags])

Build an XPath from string description and parse flags.
Parameters:
str String description
flags Flags used by parser

Flags:

XPath.StrictParse Enable strict path parse.

Path parse will fail in some conditions (e.g. found spaces where not expected or duplicate index in predicate)

The following will fail if this flag is set:

book/author[1][1]

book/author [1]

XPath.IgnoreEmptyResult Ignore (do not check) empty result when parsing steps.

If this flag is not set, path parse will fail if a step would not select anything (e.g. the previous step selects an XML text: nothing can follow it)

The following will fail if this flag is not set:

book/text()/author - An xml text can't have a child

book/author[1][2] - An xml child can't be in first and second position

XPath.NoXmlNameCheck Do not check XML element tag or attribute name for valid XML characters.

Path parse will fail if this flag is not set and an invalid character is found in string

The following will fail if this flag is not set: book/&author

new XPath(strOrXPath)

Build an XPath from string description or XPath object.
Parameters:
strOrXPath String description or XPath object (copy held XPath description only)

Static Methods

escapeString(str[,quot[,literal=true]])

Escape a string to be used in an XPath expression.
This function should be used when building an XPath from pieces.
Parameters:
str String to escape
quot Optional string quoting (enclosing) character. Allowed: ' or ". Default: "
literal True if the string is going to be used as a literal (e.g. in a comparison), false for an XML string match (the string will be XML escaped)
Return: Escaped string

var str = "\"Literal\"<XML>";
var literal = XPath.escapeString(str);
var xml = XPath.escapeString(str,undefined,false);
// literal: """Literal""<XML>"
// xml: "&quot;Literal&quot;&lt;XML&gt;"
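
The XML escaped form used for function parameters can be illustrated in plain JavaScript. This is a sketch under an assumption: it replaces the five predefined XML entities, while the exact set of characters handled by XPath.escapeString is not stated here. The helper name xmlEscapeParam is hypothetical.

```javascript
// Hypothetical helper, not part of the XPath API.
// Assumption: the five predefined XML entities are the ones replaced.
function xmlEscapeParam(value, quot) {
    if (quot !== "'" && quot !== '"')
        quot = '"';
    var escaped = String(value)
        .replace(/&/g, "&amp;")    // must be first: the others add '&'
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&apos;");
    return quot + escaped + quot;
}

var param = xmlEscapeParam("\"Literal\"<XML>");
// param: "&quot;Literal&quot;&lt;XML&gt;"

// Building a matches() predicate from a regular expression holding a delimiter
var item = "author[matches(text()," + xmlEscapeParam("^O'Brien$", "'") + ")]";
// item: author[matches(text(),'^O&apos;Brien$')]
```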

Methods

valid()

Check if path is valid.
Return true if path is valid, false if not (parse failed).

absolute()

Check if path is absolute.
Return true if path is an absolute one, false if not.

getPath()

Retrieve the path string description.
Return string if path is valid, null if not.

getItems([escape])

Retrieve the path items (steps).
Parameters:
escape Boolean. True to escape strings, false to return unescaped strings. Default: true
Return array of strings if path is valid, null if path is not valid.

 var x = new XPath("book[@attr='''Literal']/author[matches(text(),'<XML>')]");
 var escaped = x.getItems(); // ["book[@attr='''Literal']","author[matches(text(),'<XML>')]"]
 var unescaped = x.getItems(false); // ["book[@attr=''Literal']","author[matches(text(),'<XML>')]"]

getError()

Retrieve an object describing the path parse error.
Return object if path is not valid, undefined if path is valid.

Properties:
status Integer. Internal failure code
errorItem Integer. Index of failed path item
error String. Error description. May not be present

describeError()

Retrieve a string describing the path parse error.
Return string if path is not valid, undefined if path is valid.
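
The methods above can be combined to report a parse failure. The following is a minimal sketch: pathStatus is a hypothetical helper, and stand-in objects with invented error values are used so that the code runs outside the real environment. It relies only on valid(), getPath(), getError() and describeError() as documented above.

```javascript
// Hypothetical helper. 'path' is expected to provide the valid(), getPath(),
// getError() and describeError() methods documented above.
function pathStatus(path) {
    if (path.valid())
        return "OK: " + path.getPath();
    var msg = "Invalid path";
    var err = path.getError();
    if (err && err.errorItem !== undefined)
        msg += " (item " + err.errorItem + ")";
    var text = path.describeError();
    if (text)
        msg += ": " + text;
    return msg;
}

// Stand-in objects; the error values are invented for illustration only
var good = {
    valid: function() { return true; },
    getPath: function() { return "book/author"; }
};
var bad = {
    valid: function() { return false; },
    getError: function() { return { status: 1, errorItem: 1 }; },
    describeError: function() { return "sample error text"; }
};
var okMsg = pathStatus(good);  // "OK: book/author"
var errMsg = pathStatus(bad);  // "Invalid path (item 1): sample error text"
```

With the real class the call would look like pathStatus(new XPath("book/author [1]",XPath.StrictParse)), a path that is expected to fail in strict mode.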

Static Properties

FindXml, FindText, FindAttr, FindAny

Flags to be used in XML.getAnyByPath() function.

StrictParse, IgnoreEmptyResult, NoXmlNameCheck

Parser flags to be used when building an XPath from string

Examples

In code we assume a common init:

var xml = new XML("bookstore");
// Fill children ...
var path = new XPath(sample_path);

*
/*/*

Match all children of root element.
XML with XPath function:

var arr = xml.getChildrenByPath(path);

XML function(s):

var arr = xml.getChildren();

/bookstore/*

Match all children of root element if root element tag is bookstore.
XML with XPath function:

var arr = xml.getChildrenByPath(path);

XML function(s):

var arr = null;
if ("bookstore" == xml.getTag())
    arr = xml.getChildren();

book

Match all children having the tag book.
XML with XPath function:

var arr = xml.getChildrenByPath(path);

XML function(s):

var arr = xml.getChildren("book");

*[1]

Match first child element.
XML with XPath function:

var child = xml.getChildByPath(path);

XML function(s):

var child = xml.getChild();

*[2]

Match second child element.
XML with XPath function:

var child = xml.getChildByPath(path);

XML function(s):

var child = null;
var arr = xml.getChildren();
if (arr)
    child = arr[1];

*[last()]

Match last child element.
XML with XPath function:

var child = xml.getChildByPath(path);

XML function(s):

var child = null;
var arr = xml.getChildren();
if (arr.length)
    child = arr[arr.length - 1];

book[2][last()]

Match second child element with book tag only if it is the last one.
XML with XPath function:

var child = xml.getChildByPath(path);

XML function(s):

var child = null;
// The index applies to children with book tag only
var arr = xml.getChildren("book");
if (2 == arr.length)
    child = arr[1];

book[author]

Match all book children having an author child with non empty text.
XML with XPath function:

var arr = xml.getChildrenByPath(path);

XML function(s):

var arr = [];
var children = xml.getChildren("book");
if (children.length) {
    for (var ch of children) {
        // Look for author children of the current book, not of the root
        var authors = ch.getChildren("author");
        if (authors.length) {
            for (var author of authors) {
                if (author.getText()) {
                    arr.push(ch);
                    break;
                }
            }
        }
    }
}

book/author

Match all author children of all book children.
XML with XPath function:

var arr = xml.getChildrenByPath(path);

XML function(s):

var arr = [];
var children = xml.getChildren("book");
if (children.length) {
    for (var ch of children) {
        // Collect the author children of the current book, not of the root
        var authors = ch.getChildren("author");
        if (authors.length)
            arr = arr.concat(authors);
    }
}

book[@category='generic'][author='Some Name'][year='2000']

Match all book children having a category=generic attribute, an author child with the specified text value and a year child with the specified text value.
XML with XPath function:

var arr = xml.getChildrenByPath(path);

XML function(s):

var arr = [];
var children = xml.getChildren("book");
if (children.length) {
    for (var ch of children) {
        if ("generic" != ch.getAttribute("category"))
            continue;
        // author and year are both children of book: check each independently
        var authorOk = false;
        var authors = ch.getChildren("author");
        if (authors.length) {
            for (var author of authors) {
                if ("Some Name" == author.getText()) {
                    authorOk = true;
                    break;
                }
            }
        }
        if (!authorOk)
            continue;
        var years = ch.getChildren("year");
        if (years.length) {
            for (var year of years) {
                if ("2000" == year.getText()) {
                    arr.push(ch);
                    break;
                }
            }
        }
    }
}

book[@category]

Match all book children having a category attribute.
XML with XPath function:

var arr = xml.getChildrenByPath(path);

XML function(s):

var arr = [];
var children = xml.getChildren("book");
if (children.length) {
    for (var ch of children) {
        // No break here: every matching book is collected
        if (null !== ch.getAttribute("category"))
            arr.push(ch);
    }
}

book[@category='web']

Match all book children having a category attribute with web value.
XML with XPath function:

var arr = xml.getChildrenByPath(path);

XML function(s):

var arr = [];
var children = xml.getChildren("book");
if (children.length) {
    for (var ch of children) {
        // No break here: every matching book is collected
        if ("web" == ch.getAttribute("category"))
            arr.push(ch);
    }
}

book[matches(@category,'^WeB$','i')]

Match all book children having a category attribute with web value (case insensitive).
XML with XPath function:

var arr = xml.getChildrenByPath(path);

XML function(s):

var arr = [];
var children = xml.getChildren("book");
if (children.length) {
    var rex = /^WeB$/i;
    for (var ch of children) {
        // No break here: every matching book is collected
        if (rex.test(ch.getAttribute("category")))
            arr.push(ch);
    }
}

book/child::text()
book/*/text()

Match the text of all children of all book children.
XML with XPath function:

var arr = [];
xml.getAnyByPath(path,arr,XPath.FindText);

XML function(s):

var arr = [];
var children = xml.getChildren("book");
if (children.length) {
    for (var book of children) {
        // Take the children of each book, not of the root
        var chs = book.getChildren();
        if (chs.length) {
            for (var ch of chs)
                arr.push(ch.getText());
        }
    }
}

book/author/text()

Match the text of all author children of all book children.
XML with XPath function:

var arr = [];
xml.getAnyByPath(path,arr,XPath.FindText);

XML function(s):

var arr = [];
var children = xml.getChildren("book");
if (children.length) {
    for (var book of children) {
        // Take the author children of each book, not of the root
        var authors = book.getChildren("author");
        if (authors.length) {
            for (var author of authors)
                arr.push(author.getText());
        }
    }
}

References

https://www.w3.org/TR/xpath-30/
https://www.w3schools.com/xml/xpath_intro.asp
https://en.wikipedia.org/wiki/XPath
