elements within elements within elements within ........
- From Chris h-j, FingerPost
Apologies if this has been covered before - I missed it if it was - but I
have two small problems with the idea of infinite recursion and nesting of
elements and then fixing them if they are badly formed.
I have just written a wonderful NewsML file with 100,000 nested elements
and - unsurprisingly - it blows the parsers I have (ours and a couple of
shareware) quite nicely.
The concept of nesting is a neat idea - the concept of allowing infinite
nesting is, I feel, questionable.
I am not saying it is impossible to program an infinite number of levels -
just that it seems unnecessarily complicated.
Where similar issues have arisen - for example with PostScript RIPs - there
is usually a set of internal parameters which allow for limits to be changed
for specific jobs.
Should this be a documentation issue - a small note to programmers
suggesting a certain maximum level ?
Or should we be more truculent and set a hard maximum of say 20 levels ?
If so, what should that number be for the first cut everyone is doing now ?
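To make the "limit as a tunable parameter" idea concrete, here is a rough sketch of a depth check with an adjustable ceiling, in the spirit of the RIP parameters mentioned above. It uses Python's standard expat binding; the names check_depth, max_depth and DepthLimitExceeded are made up for illustration, and the default of 20 is only the figure floated above, not an agreed value.

```python
import xml.parsers.expat


class DepthLimitExceeded(Exception):
    """Raised when an element is nested deeper than the configured limit."""


def check_depth(xml_text, max_depth=20):
    """Return the deepest nesting level found, or raise DepthLimitExceeded.

    max_depth is a per-job parameter: raise it for jobs known to need more.
    """
    depth = 0
    deepest = 0
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        nonlocal depth, deepest
        depth += 1
        if depth > max_depth:
            # Stop as soon as the limit is crossed, rather than reading on.
            raise DepthLimitExceeded(
                f"<{name}> at level {depth} exceeds limit of {max_depth}"
            )
        deepest = max(deepest, depth)

    def end(name):
        nonlocal depth
        depth -= 1

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return deepest
```

Called as check_depth("<a><b><c/></b></a>") this returns 3; a file nested beyond the limit is rejected at the first element that crosses it, so the cost of the check does not grow with the depth of the offending file.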
The least we can do is add 'comment-attached' or 'programmer-guidelines' or
whichwhat for hints on what good NewsML practice should be which could
The issue is more complicated when the problem of handling badly-formed XML
comes into play.
(FingerPost theory no 2378 states that only non-programmers say that XML
will always pass thru a parser so there will never be a case of badly-formed
XML.)
We are seeing quite a lot of badly-formed XML at the moment - new programs
with bugs, programmers not interpreting things correctly.
My experience is that rejecting files because of badly-formed XML is a sure
way of losing a customer. Somehow you have to be able to fix it or convert
In most cases the problem is trivial - you do not care very much as one can
normally fudge it easily without breaking the document or its data.
But with recursive elements, if we miss an end tag, the whole structure
falls to bits. So the question here is whether there is something we can add
which can help in this - adding Duids to everything ? forcing the level
number within a duid ?
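As a rough illustration of what a level number inside the Duid could buy us, the sketch below scans tags leniently (a strict parser would simply give up) and uses the declared level to work out which elements lost their end tag. The convention Duid="L<level>.<serial>" is purely an assumption for this example, as are the names find_unclosed, TAG and DUID. The regular expression is a simplification and will be fooled by a '>' inside an attribute value.

```python
import re

# Matches start, end and empty-element tags; skips comments and PIs.
TAG = re.compile(r'<(/?)([A-Za-z_][\w.-]*)([^<>]*?)(/?)>')
# Assumed convention: Duid="L<level>.<serial>", e.g. Duid="L2.b"
DUID = re.compile(r'Duid="(L(\d+)\.[^"]*)"')


def find_unclosed(text):
    """Return the Duids of elements whose end tag appears to be missing."""
    stack = []      # (name, duid) for each element currently open
    unclosed = []
    for m in TAG.finditer(text):
        closing, name, attrs, empty = m.groups()
        if closing:
            # Only unwind if this end tag matches something that is open.
            if any(n == name for n, _ in stack):
                while stack[-1][0] != name:
                    unclosed.append(stack.pop()[1])
                stack.pop()
            continue
        found = DUID.search(attrs)
        duid = found.group(1) if found else None
        if found:
            level = int(found.group(2))
            # A new element at level N can have only N-1 open ancestors;
            # anything deeper that is still open never got its end tag.
            while len(stack) >= level:
                unclosed.append(stack.pop()[1])
        if not empty:
            stack.append((name, duid))
    # Whatever is still open at end of input was never closed.
    unclosed.extend(duid for _, duid in reversed(stack))
    return unclosed
```

The point of the level number is the recursive case: when every element has the same name, an end tag alone cannot tell you which start tag it belongs to, but the declared level can. It only points at the element that went wrong; deciding where the missing end tag should go is still a judgment call.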
Any ideas anyone ?
PS with my 100,000 nested elements, might I claim to have created the 2nd
NewsML virus ?
I leave to Jo the prize for the 1st when he suggested at Geneva that an
Action-Delete with Daniel's concept of Filename-as-a-path might start
wiping out great chunks of the client's database.