rendering PCDATA in xml documents

  • Eric Chastan
    Message 1 of 18 , Jan 26, 2004
      Hello,

      I wonder whether it would be possible to write an extension for nxml
      that renders code embedded in XML. I am thinking of JavaScript, PHP,
      or JSP embedded in HTML, but also of any other PCDATA in XML that
      needs special indentation or special font rendering.

      Do you think it would be possible to have a hook that lets other
      parsers do the job instead of xmltok for certain tags?

      And if we had such hooks, how would they interact with xmltok?

      Does anyone have ideas, advice, or time to help with this?

      Eric.
    • Josh Sled
      Message 2 of 18 , Feb 20, 2004
        On Mon, Jan 26, 2004 at 11:00:45AM +0100, Eric Chastan wrote:

        | I wonder if it could be possible to write some extension around nXml in
        | order to render code embeded in xml . Of course I think about javascript
        | , php or jsp embeded in htlm but also all other PCDATA in xlm that
        | needs special indentation or special font rendering.

        It's not the job of nxml to handle such things.

        Take a look at mmm-mode [multi-mode-mode --
        http://mmm-mode.sourceforge.net/], which lets you define expressions
        that tell emacs to switch out of one mode [nxml-mode] and into another
        [php-mode]...

        ...jsled

        --
        http://www.asynchronous.org - `a=jsled; b=asynchronous.org; echo ${a}@${b}`
      • Eric Chastan
        Message 3 of 18 , Feb 23, 2004

          Hello All,

          Josh, I don't understand why you say that it is not the job of an XML editor to handle PCDATA. I think it is important for such a tool to render all text with a friendly layout, because many XML files are quite obscure to read and a good layout helps a lot.
          It is true that my mail spoke only about "code", but in fact it could be any kind of text.
          I think that nxml is really better than psgml, and with the ability to render the full text nicely it could be even better.

          You mentioned mmm-mode; did you really try it? mmm-mode is a good tool but it has a lot of drawbacks. One of them is that mmm-mode uses overlays intensively, and this leads to problems when there are a lot of embedded sections.

          Further thought about this extension leads me to something like this:
          - each time nxml finds the opening tag of an element, the parser looks in a list to see whether this element is associated with a function to call.
          - if so, nxml lets this function parse the element. The function parses the element up to the end of the closing tag.
          There are a lot of open questions, such as how to handle attributes.

          I don't have enough time at the moment to work on it, and such a job can't be done without the help and agreement of James Clark himself.

          James, if you are listening, what do you think about this discussion?


          Eric.


        • James Clark
          Message 4 of 18 , Jul 25, 2004
            On Mon, 2004-02-23 at 17:08, Eric Chastan wrote:
            > Hello All,
            >
            > Josh, I don't understand why you said that it's not the job of an xml
            > editor to handle PCDATA. I think that it is important for such tool to
            > render all text with a friendly layout because a lot of xml files a
            > quiet obscure to read and a good layout helps a lot.
            > It is true that in my mail I spoke only about "code" but it in fact it
            > could be every kind of text.
            > I think that nxlm is really better than psgml and with the ability to
            > render the full text in a pretty way it could be even better.

            If you've got an XML file containing JavaScript, it seems very
            reasonable to want to be able to use the facilities of javascript mode
            to edit the embedded JavaScript, and similarly for other kinds of PCDATA
            which have a specialized mode.

            > You spoke about mmm-mode did you really try it? mmm-mode is a good
            > tool but it has a lot of drawbacks, one of them is that mmm-mode used
            > overlays intensively and this leads to problem when there is are a lot
            > of embedded sections.

            I haven't yet tried mmm-mode.

            I can see a couple of problems with making a general purpose mode work
            for XML:

            a) I want to be able to specify which mode is used for PCDATA at the XML
            level rather than in terms of regexes in the buffer. For example, I want
            to be able to specify that the content of an element with a specific
            namespace URI and local name should use a particular mode.

            b) I don't want to be forced to use CDATA sections. I want Emacs to
            understand that the JavaScript code isn't simply a substring of the XML
            buffer, but rather a substring of the buffer after substitution of
            character/entity references. This seems not so easy. Perhaps you could
            have a separate, temporary buffer for each PCDATA fragment, which would
            use the appropriate mode for that fragment. You would arrange that you could
            edit either the XML or the temporary buffer and the temporary buffer
            would always be equal to the result of replacing character/entity
            references in the corresponding fragment of XML. Then you would have a
            command in XML mode to switch to the temporary buffer, and I guess a
            minor mode in the temporary buffer to maintain synchronization with the
            XML and to provide a command to switch back to the XML. Does mmm-mode
            deal with the escaping issue?
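
Points (a) and (b) can be illustrated outside Emacs. The following is a minimal sketch in Python, not nxml-mode code: the table MODE_FOR_ELEMENT, the helper names, and the mode names in it are hypothetical placeholders. It selects a mode by namespace URI and local name rather than by regex, and shows that the text handed to the sub-mode is the content after character/entity references have been replaced.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: (namespace URI, local name) -> name of an editing mode.
MODE_FOR_ELEMENT = {
    ("http://www.w3.org/1999/xhtml", "script"): "javascript-mode",
    ("http://www.w3.org/1999/xhtml", "style"): "css-mode",
}


def split_qname(tag):
    """Split ElementTree's '{uri}local' tag form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def embedded_fragments(xml_text):
    """Yield (mode, decoded_text) for each element with a registered mode."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        mode = MODE_FOR_ELEMENT.get(split_qname(elem.tag))
        if mode is not None:
            # elem.text is already decoded: &lt; and &amp; appear as < and &.
            yield mode, elem.text or ""


DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head>'
    "<script>if (a &lt; b &amp;&amp; c) { go(); }</script>"
    "</head></html>"
)
FRAGMENTS = list(embedded_fragments(DOC))
```

The sketch only reads the document. Keeping an edited fragment synchronized with the XML source, which is the hard part described in (b), is not addressed here.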

            What kind of UI would people like to see for dealing with embedded
            PCDATA which has its own Emacs major mode?

            James
            --
            To send me mail, replace auth-only by public in the from address.
          • drkm
            Message 5 of 18 , Jul 26, 2004
              James Clark <jjc@...> writes:

              > Does mmm-mode
              > deal with the escaping issue?

              I don't think it does.

              > What kind of UI would people like to see for dealing with embedded
              > PCDATA which has its own Emacs major mode?

              What does UI mean?

              --drkm, looking for an internship: http://www.fgeorges.org/ipl/stage.html
            • James Clark
              Message 6 of 18 , Jul 26, 2004
                On Mon, 2004-07-26 at 20:44, drkm wrote:

                > > What kind of UI would people like to see for dealing with embedded
                > > PCDATA which has its own Emacs major mode?
                >
                > What does mean UI ?

                User interface.

                James
              • Peter Heslin
                Message 7 of 18 , Jul 28, 2004
                  On 2004-07-26, James Clark <jjc@...> wrote:
                  > If you've got an XML file containing JavaScript, it seems like a very
                  > reasonable to want to be able to use the facilities of javascript mode
                  > to edit the embedded JavaScript, and similarly for other kinds of PCDATA
                  > which have a specialized mode.

                  For what it's worth, I've written a package called nxml-script.el to
                  help with this. It uses narrowing rather than mmm-mode, and it's
                  nothing fancy, but it works for me.

                  You can find it here:
                  http://www.dur.ac.uk/p.j.heslin/emacs/download/nxml-script.el

                  > I haven't yet tried mmm-mode.

                  I found it to be very brittle. I tried to get mmm-mode working with
                  nxml and failed, which led to my writing nxml-script.el.

                  The current Emacs etc/TODO file says this:

                  ** Implement a clean way to use different major modes for
                  different parts of a buffer. This could be useful in editing
                  Bison input files, for instance, or other kinds of text
                  where one language is embedded in another language.

                  This implies to me that the Emacs maintainers do not regard the
                  current implementation of mmm-mode as "clean" and would like to
                  provide something better.

                  I would be wary of having nxml-mode depend on a third-party package
                  that is notoriously fiddly, and that is implicitly deprecated.

                  >
                  > I can see a couple of problems with making a general purpose mode work
                  > for XML:
                  >
                  > a) I want to be able to specify which mode is used for PCDATA at the XML
                  > level rather than in terms of regexes in the buffer. For example, I want
                  > to be able to specify that the content of an element with a specific
                  > namespace URI and local name should use a particular mode.
                  >
                  > b) I don't want to be forced to use CDATA sections. I want Emacs to
                  > understand that the JavaScript code isn't simply a substring of the XML
                  > buffer, but rather a substring of the buffer after substitution of
                  > character/entity references. This seems not so easy. Perhaps you could
                  > have a separate, temporary buffer for each PCDATA fragment, which would
                  > use the appropriate mode for that fragment. You would arrange you could
                  > edit either the XML or the temporary buffer and the temporary buffer
                  > would always be equal to the result of replacing character/entity
                  > references in the corresponding fragment of XML. Then you would have a
                  > command in XML mode to switch to the temporary buffer, and I guess a
                  > minor mode in the temporary buffer to maintain synchronization with the
                  > XML and to provide a command to switch back to the XML. Does mmm-mode
                  > deal with the escaping issue?

                  I very much doubt mmm-mode deals with escaping. Here's an idea
                  suggested by the implementation of nxml-script.el. You have a
                  function that narrows the buffer to the content of the element,
                  unescapes it, and switches to the relevant major mode. Then another
                  function escapes the narrowed text, widens to the whole buffer, and
                  switches back to nxml-mode.

                  It's not ideal, but better than temporary buffers, I think -- no
                  synchronization issues.
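
The narrow, unescape, edit, escape, widen cycle described above can be sketched on a plain string. This is an illustration in Python under stated assumptions, not the code of nxml-script.el: begin_edit and end_edit are hypothetical names, and "narrowing" is modeled as a pair of string offsets. The standard-library unescape used here handles only &amp;, &lt; and &gt; by default, so numeric character references and other entities would need extra work.

```python
from xml.sax.saxutils import escape, unescape


def begin_edit(document, start, end):
    """'Narrow' to document[start:end] and return its unescaped text."""
    return unescape(document[start:end])


def end_edit(document, start, end, edited_text):
    """Escape the edited text and splice it back into the document ('widen')."""
    return document[:start] + escape(edited_text) + document[end:]


XML = "<script>if (a &lt; b) run();</script>"
START = XML.index(">") + 1      # just after the opening tag
END = XML.index("</script>")    # just before the closing tag

code = begin_edit(XML, START, END)            # what the sub-mode would see
code = code.replace("run()", "run(a && b)")   # an edit made in the sub-mode
RESULT = end_edit(XML, START, END, code)
```

Because escaping happens only at the two switch points, there is nothing to keep synchronized while editing. One caveat of this sketch: escape rewrites every ">" as "&gt;", so untouched text may not come back byte-for-byte identical.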

                  >
                  > What kind of UI would people like to see for dealing with embedded
                  > PCDATA which has its own Emacs major mode?

                  It may be that, since support for this sort of multiple major-mode
                  functionality is marginal in Emacs, the implementation will be
                  constrained by what is possible to achieve cleanly.

                  Peter
                • david.pawson@rnib.org.uk
                  Message 8 of 18 , Jul 28, 2004
                    -----Original Message-----
                    From: James Clark

                    What kind of UI would people like to see for dealing with
                    embedded PCDATA which has its own Emacs major mode?

                    How about a 'smart' way to switch to an already installed mode?
                    It would leave nxml-mode completely, run the other mode as needed for
                    the embedded content, then have some way to 'return' to nxml-mode.

                    Worst case
                    M-x jscript-mode
                    ....
                    M-x nxml-mode

                    Is that really too hard?

                    regards DaveP

                  • drkm
                    Message 9 of 18 , Jul 28, 2004
                      Peter Heslin <usenet@...> writes:

                      > On 2004-07-26, James Clark <jjc@...> wrote:

                      >> I haven't yet tried mmm-mode.

                      > I found it to be very brittle. I tried to get mmm-mode working with
                      > nxml and failed, which led to my writing nxml-script.el.

                      > The current Emacs etc/TODO file says this:

                      > ** Implement a clean way to use different major modes for
                      > different parts of a buffer. This could be useful in editing
                      > Bison input files, for instance, or other kinds of text
                      > where one language is embedded in another language.

                      > This implies to me that the Emacs maintainers do not regard the
                      > current implementation of mmm-mode as "clean" and would like to
                      > provide something better.

                      I tried MMM Mode a little, and it doesn't seem so bad.

                      I can't see a clean way to use different major modes without some
                      support 1/ in Emacs Lisp and 2/ in modes in general:

                      1/ I suppose supporting multiple major modes in the same buffer
                      requires a new kind of variable, like the buffer-local ones
                      currently used to implement modes.

                      2/ The narrowing/widening mechanism requires code that doesn't use
                      it to take care of a few things: don't use (goto-char 0), but
                      (goto-char (point-min)). In the same way, I think having multiple
                      major modes in the same buffer requires modifications to some
                      existing code, and to how modes are defined.

                      I don't think MMM Mode is so bad; it does what it can.

                      --drkm
                    • drkm
                      Message 10 of 18 , Jul 28, 2004
                        James Clark <jjc@...> writes:

                        > On Mon, 2004-07-26 at 20:44, drkm wrote:

                        >> > What kind of UI would people like to see for dealing with embedded
                        >> > PCDATA which has its own Emacs major mode?

                        >> What does mean UI ?

                        > User interface.

                        I have never used Peter's nxml-script package, and I have tried MMM
                        Mode only a little. I think there are two main approaches,
                        corresponding respectively to nxml-script (going by Peter's
                        description here) and MMM Mode.

                        The first one, corresponding to nxml-script if I didn't
                        misunderstand Peter, is to switch explicitly between the two modes,
                        and possibly narrow to the submode region or copy it to a temporary
                        buffer.

                        The second one, corresponding to MMM Mode, is to do everything in
                        place. The different submode regions have different font locking,
                        indentation, syntax tables, keymaps, etc. I think this is the most
                        intuitive. You have nothing to do: depending on your position, you
                        edit code in one mode or the other, and you always see the code
                        highlighted correctly.

                        But as I said in another post, I think this is difficult (if not
                        impossible) to implement rigorously without support in Emacs Lisp and
                        in other modes. Still, I think MMM Mode proves it is reasonably
                        feasible.

                        The advantage of the other approach (as in nxml-script) is that
                        the switch points are privileged points where we can do some
                        computation (such as encoding/decoding). I think this is the simplest
                        to implement.

                        Either way, we have to rely on file names and file-local variables
                        to activate support for specific submodes.

                        I suppose we can do some work on nxml-script to enhance it, and in
                        parallel define some MMM classes (MMM classes define when to activate
                        some submode support in a buffer, the delimiter strings, what to
                        do, etc.).

                        So we will have two ways. One requires switching between modes,
                        but is probably more robust. The other is more intuitive and usable,
                        but IMHO not as robust. The user will normally use the second way,
                        but can fall back on the first if he has some trouble.

                        Peter, can you check that I didn't say anything wrong about
                        nxml-script? And maybe clarify some points.

                        --drkm
                      • Peter Heslin
                        Message 11 of 18 , Jul 28, 2004
                          On 2004-07-28, drkm <darkman_spam@...> wrote:
                          > So we will have two ways. One requiring switching between modes,
                          > but probably more robust. The other more intuitive and usable, but
                          > IMHO not so robust as can be the other way. The user will use
                          > normally the second way, but can use the first one if he have some
                          > trouble.
                          >
                          > Peter, can you verify I didn't say errors about nxml-script ? And
                          > maybe precise some points.

                          What you said is correct, and I agree with your assessment entirely.
                          It would be better if we had an mmm-mode style implementation. The
                          only problem is that this functionality is not currently supported in
                          the official Emacs distribution.

                          The only real advantage of narrowing/widening the buffer and switching
                          major-modes is that it can be done pretty easily and robustly (I am
                          supposing). I agree that the UI is not as nice. That's why I said
                          that the UI may depend on what it is possible to implement cleanly.

                          Peter
                        • Vincent Lefevre
                          Message 12 of 18 , Jul 28, 2004
                            On 2004-07-28 19:38:46 +0200, drkm wrote:
                            > The first one, corresponding to nxml-script if I didn't
                            > misunderstand Peter, is to switch explicitely between the two modes.
                            > And eventually warrow to the submode region, or yank it to a temporary
                            > buffer.

                            The advantage is that some form of decoding can be performed before
                            yanking it to a temporary buffer, making the text more readable. And
                            after editing, reencoding can be performed before the text is put
                            back to the original buffer. This would be a bit like the way po
                            files are edited.

                            --
                            Vincent Lefèvre <vincent@...> - Web: <http://www.vinc17.org/>
                            100% validated (X)HTML - Acorn / RISC OS / ARM, free software, YP17,
                            Championnat International des Jeux Mathématiques et Logiques, etc.
                            Work: CR INRIA - computer arithmetic / SPACES project at LORIA
                          • Peter Heslin
                            Message 13 of 18 , Jul 28, 2004
                              On 2004-07-28, Vincent Lefevre <vincent@...> wrote:
                              > The advantage is that some form of decoding can be performed before
                              > yanking it to a temporary buffer, making the text more readable. And
                              > after editing, reencoding can be performed before the text is put
                              > back to the original buffer. This would be a bit like po files are
                              > edited.

                              Yes, and presumably the same kind of escaping/un-escaping could be
                              done when widening/narrowing, if you wanted to implement it that way.

                              Peter
                            • drkm
                              Message 14 of 18 , Jul 28, 2004
                                Vincent Lefevre <vincent@...> writes:

                                > On 2004-07-28 19:38:46 +0200, drkm wrote:

                                >> The first one, corresponding to nxml-script if I didn't
                                >> misunderstand Peter, is to switch explicitely between the two modes.
                                >> And eventually warrow to the submode region, or yank it to a temporary
                                >> buffer.

                                > The advantage is that some form of decoding can be performed before
                                > yanking it to a temporary buffer, making the text more readable. And
                                > after editing, reencoding can be performed before the text is put
                                > back to the original buffer.

                                Yes. That's what I meant when I wrote that the switch points are
                                privileged places to perform some tasks. More generally, I think
                                it's also easier to set up the context of the submode precisely,
                                in a clean way.

                                > This would be a bit like po files are
                                > edited.

                                PO files. Mmm ... It's related to gettext, isn't it? I don't
                                know how they are edited. I tried opening a "test.po" file, but the
                                mode was text-mode. I didn't find any po-mode or gette* functions.
                                What do you mean when you speak about PO files?

                                --drkm
                              • Vincent Lefevre
                                Message 15 of 18 , Jul 28, 2004
                                  On 2004-07-28 23:22:53 +0200, drkm wrote:
                                  > PO files. Mmm ... It's related to gettext, it isn't ? I don't
                                  > know how they are edited. I tried open a "test.po" file, but the mode
                                  > was text-mode. I didn't find any po-mode or gette* functions. What
                                  > do you mean, when you speak about PO files ?

                                  There's a po mode in Debian, provided by the gettext-el package.
                                  When a po file is edited, it is in fact marked as read-only, and
                                  the user can make a change / new translation by typing [Return]:
                                  this opens a new Emacs window below the main one in fundamental
                                  mode. When there is a double quote (") in a message, it must be
                                  escaped with a backslash, as shown in the main window. But in
                                  the temporary buffer, the message appears decoded: the double
                                  quote isn't escaped. Ditto for tab characters (encoded as \t in
                                  the po file). When the user has finished editing the message,
                                  he types C-c C-c to return to the main window, and Emacs encodes
                                  the double quote and tab characters as expected.

                                  Something similar could be done for scripts embedded in XML, where
                                  some characters must be encoded / escaped.
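
The decode-on-open, encode-on-close step described above can be sketched as a pair of functions. This is an illustration in Python, not po-mode's actual code: decode_po and encode_po are hypothetical names, and only the escapes \", \t, \n and \\ are handled, whereas real PO files allow more C-style escapes.

```python
import re

# Escape sequences covered by this sketch (stored form <-> edited form).
_DECODE = {'\\"': '"', "\\t": "\t", "\\n": "\n", "\\\\": "\\"}
_ENCODE = {'"': '\\"', "\t": "\\t", "\n": "\\n", "\\": "\\\\"}


def decode_po(raw):
    """Turn the escaped form stored in the file into the text shown for editing."""
    return re.sub(r'\\["tn\\]', lambda m: _DECODE[m.group(0)], raw)


def encode_po(text):
    """Turn edited text back into the escaped form stored in the file."""
    return re.sub(r'["\t\n\\]', lambda m: _ENCODE[m.group(0)], text)


STORED = 'Say \\"hello\\"\\tnow'   # as it appears in the main window
SHOWN = decode_po(STORED)          # as it appears in the temporary buffer
```

For scripts embedded in XML, the same shape would apply with character/entity references in place of backslash escapes.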

                                • drkm
                                  Message 16 of 18 , Jul 28, 2004
                                    Peter Heslin <usenet@...> writes:

                                    > On 2004-07-28, drkm <darkman_spam@...> wrote:

                                    >> So we will have two ways. One requiring switching between modes,
                                    >> but probably more robust. The other more intuitive and usable, but
                                    >> IMHO not so robust as can be the other way. The user will use
                                    >> normally the second way, but can use the first one if he have some
                                    >> trouble.

                                    >> Peter, can you verify I didn't say errors about nxml-script ? And
                                    >> maybe precise some points.

                                    > What you said is correct, and I agree with your assessment entirely.
                                    > It would be better if we had an mmm-mode style implementation. The
                                    > only problem is that this functionality is not currently supported in
                                    > the official Emacs distribution.

                                    Yes, I think it is a gap in Emacs. But as I said, I think adding it
                                    to Emacs would be a non-trivial task, and would require modifications
                                    in the emacs/lisp directory ... I don't read the Emacs devel ML, so I
                                    don't know whether someone is working on this.

                                    > The only real advantage of narrowing/widening the buffer and switching
                                    > major-modes is that it can be done pretty easily and robustly (I am
                                    > supposing).

                                    I think so too.

                                    > I agree that the UI is not as nice.

                                    Well, it's not a lot of work to switch to submodes. And with an
                                    alternate binding to a function key, for example, it could be very
                                    simple to use.

                                    > That's why I said
                                    > that the UI may depend on what it is possible to implement cleanly.

                                    I think that's why we have to do two things: a clean way, like
                                    nxml-script, and writing MMM classes (because MMM Mode provides enough
                                    functionality to be usable, I think).

                                    --drkm, looking for an internship: http://www.fgeorges.org/ipl/stage.html
                                  • drkm
                                    Message 17 of 18 , Jul 28, 2004
                                      Peter Heslin <usenet@...> writes:

                                      > On 2004-07-28, Vincent Lefevre <vincent@...> wrote:

                                      >> The advantage is that some form of decoding can be performed before
                                      >> yanking it to a temporary buffer, making the text more readable. And
                                      >> after editing, reencoding can be performed before the text is put
                                      >> back to the original buffer. This would be a bit like the way PO
                                      >> files are edited.

                                      > Yes, and presumably the same kind of escaping/un-escaping could be
                                      > done when widening/narrowing, if you wanted to implement it that way.

                                      Yes. Using a temporary buffer and narrowing/widening are fundamentally
                                      equivalent. I mean both require a trigger (an interactive function).
                                      This trigger can use a temporary buffer, narrowing, decoding, etc.
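
                                      As a rough illustration of the decode / edit / re-encode round trip
                                      discussed here, a minimal sketch in Python rather than Emacs Lisp.
                                      The function name edit_embedded and the sample document are
                                      hypothetical, and the edit callback only stands in for the user
                                      editing the temporary (or narrowed) buffer:

```python
from xml.sax.saxutils import escape, unescape


def edit_embedded(document, start, end, edit):
    """Decode document[start:end], let `edit` change it, re-encode, splice back.

    `edit` is a hypothetical stand-in for the interactive editing step.
    """
    decoded = unescape(document[start:end])  # "&lt;" becomes "<", etc.
    edited = edit(decoded)                   # user edits the readable text
    return document[:start] + escape(edited) + document[end:]


# Sample document (an assumption for illustration only).
doc = "<script>if (a &lt; b) run();</script>"
start = doc.index(">") + 1   # just after the opening tag
end = doc.rindex("</")       # just before the closing tag

result = edit_embedded(
    doc, start, end,
    lambda text: text.replace("a < b", "a < b && c"),
)
print(result)
```

                                      The person editing only ever sees the decoded script text; the
                                      escaping is applied again when the text is put back, so the
                                      surrounding XML stays well-formed.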

                                      --drkm, looking for an internship: http://www.fgeorges.org/ipl/stage.html
                                    • drkm
                                      Message 18 of 18 , Jul 28, 2004
                                        Vincent Lefevre <vincent@...> writes:

                                        [about editing PO files]

                                        OK. That's what I was thinking of.

                                        > Something similar could be done for scripts embedded in XML, where
                                        > some characters must be encoded / escaped.

                                        BTW, besides providing an easy way to do {en,de}coding, this way (as
                                        opposed to the MMM Mode way) makes it clear when to {en,de}code. With
                                        the MMM Mode way, it's not so clear whether to do it or not: you
                                        always see the entire XML document, so it may be confusing to decode
                                        "&lt;" and co., IMHO.

                                        --drkm, looking for an internship: http://www.fgeorges.org/ipl/stage.html