Thursday, 29 August 2013

XPath select all but not self::strong and self::strong/following-sibling::text()

I have the following example HTML to parse.
<div>
<strong>Title:</strong>
Sub Editor at NEWS ABC
<strong>Name:</strong>
John
<strong>Where:</strong>
Everywhere
<strong>When:</strong>
Anytime
<strong>Everything can go down there..</strong>
Lorem Ipsum blah blah blah....
</div>
I want to extract the whole div, except for the Title, Where and When
headings together with the values that follow them.
I have tested the following XPaths so far.
a) Without following sibling (1: doesn't work. 2: works)
1. //div/node()[not(strong[contains(text(), "Title")])]
2. //div/node()[not(self::strong and contains(text(), "Title"))]
b) With following sibling (1: doesn't work. 2: doesn't work)
1. //div/node()[not(strong[contains(text(), "Title")]) and
not(strong[contains(text(), "Title")]/following-sibling::text())]
2. //div/node()[not(self::strong and contains(text(), "Title") and
following-sibling::text())]
How can I achieve what I am after?
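For illustration, the goal can be sketched outside XPath. This is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree and assuming the snippet parses as well-formed XML. ElementTree's XPath subset has no node() test and no axes such as self:: or following-sibling::, so the sketch walks the strong children directly and relies on ElementTree storing the text after an element in its .tail attribute. The names UNWANTED and keep_wanted are made up for this example.

```python
import xml.etree.ElementTree as ET

HTML = """<div>
<strong>Title:</strong>
Sub Editor at NEWS ABC
<strong>Name:</strong>
John
<strong>Where:</strong>
Everywhere
<strong>When:</strong>
Anytime
<strong>Everything can go down there..</strong>
Lorem Ipsum blah blah blah....
</div>"""

# Headings to drop together with the text that follows them (hypothetical name).
UNWANTED = ("Title", "Where", "When")


def keep_wanted(div):
    """Return (heading, value) pairs for every <strong> that is not unwanted."""
    kept = []
    for strong in div.findall("strong"):
        heading = (strong.text or "").strip()
        if any(word in heading for word in UNWANTED):
            # Skipping the element also skips its trailing text (.tail).
            continue
        # The text node that follows an element lives in its .tail attribute.
        kept.append((heading, (strong.tail or "").strip()))
    return kept


div = ET.fromstring(HTML)
for heading, value in keep_wanted(div):
    print(heading, value)
```

In an engine with full XPath 1.0 support, an untested candidate for the Title case is //div/node()[not(self::strong[contains(., "Title")]) and not(self::text()[preceding-sibling::*[1][self::strong][contains(., "Title")]])]. It tests each text node's nearest preceding element sibling rather than testing the strong for a following text node. The Where and When cases would be added with "or" inside both contains() predicates.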
