delete all but first occurrence of a pattern

  • Nikolaos A. Patsopoulos
    Message 1 of 3, Mar 31, 2007
      Hi all,

      I've got XML files that each have 500 entries marked as <article> ...
      </article>. In each entry there are <Author></Author> fields. Some articles
      have no author fields and some have more than one. What I want is for every
      article to have exactly one such field: e.g. if an article has 3 <Author>...
      </Author> fields, the last two must be dropped; if an article has none, an
      empty <Author> </Author> must be added. How can I do this for all 500
      articles that are included in the same XML file? (I use VI, not gvim.)

      Here follows some sample entries:

      <article>
      <AuthorList CompleteYN="Y">
      <Author ValidYN="Y">
      <LastName>Heinisch</LastName>
      <ForeName>Roberto H</ForeName>
      <Initials>RH</Initials>
      </Author>
      <Author ValidYN="Y">
      <LastName>Zanetti</LastName>
      <ForeName>Carlos R</ForeName>
      <Initials>CR</Initials>
      </Author>
      <Author ValidYN="Y">
      <LastName>Comin</LastName>
      <ForeName>Fabiano</ForeName>
      <Initials>F</Initials>
      </Author>
      <Author ValidYN="Y">
      <LastName>Fernandes</LastName>
      <ForeName>Juliano L</ForeName>
      <Initials>JL</Initials>
      </Author>
      <Author ValidYN="Y">
      <LastName>Ramires</LastName>
      <ForeName>José A</ForeName>
      <Initials>JA</Initials>
      </Author>
      <Author ValidYN="Y">
      <LastName>Serrano</LastName>
      <ForeName>Carlos V</ForeName>
      <Initials>CV</Initials>
      <Suffix>Jr</Suffix>
      </Author>
      </AuthorList>
      </article>

      <article>

      </article>

      <article>
      <AuthorList CompleteYN="Y">
      <Author ValidYN="Y">
      <LastName>Saint-Remy</LastName>
      <ForeName>Annie</ForeName>
      <Initials>A</Initials>
      </Author>
      </AuthorList>
      </article>



      The above sample should be turned into this:

      <article>
      <AuthorList CompleteYN="Y">
      <Author ValidYN="Y">
      <LastName>Heinisch</LastName>
      <ForeName>Roberto H</ForeName>
      <Initials>RH</Initials>
      </Author>
      </AuthorList>
      </article>

      <article>
      <AuthorList CompleteYN="Y">
      <Author ValidYN="Y">

      </Author>

      </AuthorList>
      </article>

      <article>
      <AuthorList CompleteYN="Y">
      <Author ValidYN="Y">
      <LastName>Saint-Remy</LastName>
      <ForeName>Annie</ForeName>
      <Initials>A</Initials>
      </Author>
      </AuthorList>
      </article>


      Thanks in advance,

      Nikos
    • Tobia
      Message 2 of 3, Mar 31, 2007
        I don't think Vim's regular expressions are the best tool for this job.
        I mean, XML manipulation is much easier done in XSLT:

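        <?xml version="1.0"?>
        <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <!-- Identity template: copy the root tag and everything else unchanged -->
        <xsl:template match="@*|node()">
        <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
        </xsl:template>
        <!-- Each article gets one AuthorList holding only its first Author
             (an empty Author if the article had none) -->
        <xsl:template match="article">
        <xsl:copy>
        <xsl:copy-of select="@*"/>
        <AuthorList CompleteYN="Y">
        <Author ValidYN="Y">
        <xsl:copy-of select="AuthorList/Author[1]/*"/>
        </Author>
        </AuthorList>
        <!-- Keep every other child of the article as it was -->
        <xsl:copy-of select="node()[not(self::AuthorList)]"/>
        </xsl:copy>
        </xsl:template>
        </xsl:stylesheet>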

        This does what you want in your example, assuming the source is a proper
        XML document (among other things there must be a "root tag" encompassing
        all the articles.) Invoke with "xsltproc fix-authors.xsl articles.xml"
        or with any other XSLT tool.
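
For readers without an XSLT processor at hand, the same transformation can be sketched in Python with the standard library's xml.etree.ElementTree. This is a swapped-in technique, not part of the stylesheet above, and it rests on the same assumption: the articles sit inside a single root tag. The root name <articles>, the shortened sample, and the function names keep_first_author and fix_authors are made up for illustration.

```python
import xml.etree.ElementTree as ET


def keep_first_author(article):
    """Rewrite one <article> so it holds exactly one <Author> in one <AuthorList>."""
    old_list = article.find("AuthorList")
    first = old_list.find("Author") if old_list is not None else None

    # Build the replacement list with the same attributes the stylesheet uses.
    new_list = ET.Element("AuthorList", {"CompleteYN": "Y"})
    new_author = ET.SubElement(new_list, "Author", {"ValidYN": "Y"})
    if first is not None:
        # Move the children (LastName, ForeName, ...) of the first author.
        new_author.extend(list(first))

    # Drop every existing AuthorList, then put the new one first.
    for stale in article.findall("AuthorList"):
        article.remove(stale)
    article.insert(0, new_list)


def fix_authors(xml_text):
    """Return the document text with every article reduced to its first author."""
    root = ET.fromstring(xml_text)
    # Materialise the list first, since the tree is modified while looping.
    for article in list(root.iter("article")):
        keep_first_author(article)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, shortened sample wrapped in an assumed <articles> root tag.
sample = """<articles>
<article><AuthorList CompleteYN="Y">
<Author ValidYN="Y"><LastName>Heinisch</LastName></Author>
<Author ValidYN="Y"><LastName>Zanetti</LastName></Author>
</AuthorList></article>
<article></article>
</articles>"""

fixed = fix_authors(sample)
print(fixed)
```

As in the stylesheet, the new AuthorList is placed before any other children of the article, and an article with no authors ends up with an empty Author element.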


        To get back on-topic, I find these scripts make working with XSLT a bit
        less painful:

        xslhelper.vim http://www.vim.org/scripts/script.php?script_id=1364
        closetag.vim http://www.vim.org/scripts/script.php?script_id=13

        This, on the other hand, is on my list of "things to check", but I still
        haven't got around to checking it out:

        xml.vim http://www.vim.org/scripts/script.php?script_id=1397


        Tobia
      • Nikolaos A. Patsopoulos
        Message 3 of 3, Apr 2, 2007
          Tobia wrote:
          > I don't think Vim's regular expressions are the best tool for this job.
          > I mean, XML manipulation is much easier done in XSLT:
          >
          > [...]

          I'm currently looking for something that will work with VI (I use vim
          script files). As a last resort I'll try your suggestion, though.

          Thank you for your reply,

          Nikos