
Weird XML

  • jlundigard
    Message 1 of 1, Jun 1, 2010
      Hey Josh,

      I'm noticing some malformed XML in some of the bill text files. For example, on line 10750 of 111/h/h4173_rfs-eas.xml (this is a cmp of the financial reform bill), I'm seeing this:

      <em><inserted sequence="2628">(C) the term ‘trust bank’ means an institution described in section 2(c)(2)(D) of the Bank Holding Company Act of 1956 (</inserted><usc-reference title="12" section="1841" paragraph="c_2_D"><inserted sequence="2629"><usc-reference title="12" section="1841" paragraph="c_2_D">12 U.S.C. 1841(c)(2)(D</inserted>)</usc-reference></usc-reference>).</em>

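      The last "inserted" element overlaps the "usc-reference" elements: it opens inside the outer "usc-reference" but closes before the inner "usc-reference" that was opened within it. The result is not well-formed XML, so our parser breaks.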
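
For reference, here is a minimal Python sketch of how this kind of overlap could be detected before handing a file to a strict parser. It is only an illustration: `first_overlap`, `TAG_RE`, and `SAMPLE` are hypothetical names, `SAMPLE` is an abbreviated version of the line quoted above, and the regular expression assumes that attribute values never contain `<` or `>`.

```python
import re
import xml.etree.ElementTree as ET

# Matches opening, closing, and self-closing tags.
# Assumption: attribute values never contain '<' or '>'.
TAG_RE = re.compile(r"<(/?)([A-Za-z_][\w.\-]*)[^<>]*?(/?)>")

def first_overlap(fragment):
    """Return (open_tag, close_tag) for the first mismatched close tag, else None."""
    stack = []
    for match in TAG_RE.finditer(fragment):
        closing, name, self_closing = match.groups()
        if self_closing:
            continue  # <tag/> neither opens nor closes anything
        if not closing:
            stack.append(name)
        elif not stack:
            return (None, name)  # close tag with nothing open
        else:
            expected = stack.pop()
            if expected != name:
                return (expected, name)
    return None

# Abbreviated form of the problem line (text content shortened).
SAMPLE = (
    '<em><inserted sequence="2628">(C) text (</inserted>'
    '<usc-reference title="12"><inserted sequence="2629">'
    '<usc-reference title="12">12 U.S.C. 1841(c)(2)(D</inserted>)'
    '</usc-reference></usc-reference>).</em>'
)

if __name__ == "__main__":
    print(first_overlap(SAMPLE))  # ('usc-reference', 'inserted')
    try:
        ET.fromstring(SAMPLE)
    except ET.ParseError as err:
        print("strict parser rejects it:", err)
```

The tuple reads as "the parser expected `</usc-reference>` but found `</inserted>`", which is the same overlap described above.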

      Should I plan a work-around, or do you think this is something you can address?

      Thanks,
      Andy Ross
      OpenCongress