Merging html files

  • John Shotsky
    Message 1 of 7 , Aug 3, 2014

      I have a list of files that I want to merge into a single file, in the order in which I want to merge them. They are a group of html files, and may be Unicode, so I do not want to open them or save them until later.

      I would like to set a variable %Index% that is equal to the number of items in my list.

      I would like to set the file names as an array of %FileList%, and set up a loop that processes through my list and appends each file to a default, empty file (raw.html).

      I am not quite sure how to go from my list, which was created by my tools, to the single merged file.

       

      A test suite of files would be two html files named File1.html and File2.html, with the words <One> and <Two> in them.

      The list to work from would be a plain .txt file with the following contents:

      File1.html

      File2.html

      The working path is known to the system, but for testing it can be hardcoded to, say, C:\work. All three files would be in this location to start, and at the end there will be 4 files, the new one of which contains the contents of the File1 and File2 files.

      I'm probably overlooking something ridiculously easy, but I can't seem to figure out how to set the count, and make the list parsable. Do I need to add some delimiters to separate them, or ??? There is already a CR at the end of each line.

      The command I want to use to merge them is:

      ^!AppendToFile "^%RawFile%" ^$GetFileText(^%FileList^%Index%%)$

      And finally, once the merge is complete, I want to use my list again as a source to delete the individual html files, as if they were never there, and the merged file is the replacement for them.

      Any ideas out there?

      Thanks!
      John Shotsky
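For readers who want to see the counting and indexing question outside NoteTab, here is a minimal Python sketch. It assumes the list is a plain text file with one name per line; the function name `read_file_list` and the UTF-8 encoding are illustrative assumptions, not part of the original post. The line break at the end of each line is the only delimiter needed, and the count is simply the length of the resulting list.

```python
from pathlib import Path


def read_file_list(list_path):
    """Return the non-empty lines of the list file, in their original order.

    Assumption: the list is plain text, one file name per line.
    """
    text = Path(list_path).read_text(encoding="utf-8")
    # splitlines() handles CR, LF and CRLF line endings alike
    return [line.strip() for line in text.splitlines() if line.strip()]
```

With the two-line test list from the post, `len(read_file_list(...))` would play the role of the count, and `names[0]`, `names[1]` the role of the indexed array elements (Python counts from 0, where the clip array counts from 1).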


       

    • Axel Berger
      Message 2 of 7 , Aug 3, 2014
        "'John Shotsky' jshotsky@... [ntb-clips]" wrote:
        > I would like to set the file names as an array of %FileList%, and set up
        > a loop that processes through my list and appends each file to a default,
        > empty file (raw.html).

        The easy way is to make a Batch file with one line:

        copy File1.html+File2.html raw.html

        and run that. But what do you mean by html file? If it's just bits of code,
        you're fine. A true HTML file must have a header, so obviously you have to
        cut off the headers of all but the first of them. You also must not have
        incompatible structures like multiple identical IDs.

        This too could be automated but clashes with your other wish not to open
        and edit them before merging.

        > and may be Unicode

        Whatever it is, they had better all be coded the same or the merged one
        will be a mess.

        Axel
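The one-line batch approach can also be sketched in Python as a byte-for-byte concatenation, similar in spirit to running `copy` with its `/b` (binary) switch. Nothing is decoded, so no characters are lost, but the caveats above still apply: the headers are not stripped, and a byte order mark at the start of a later file would end up in the middle of the result. The function name `concat_bytes` and the default target name are assumptions for illustration.

```python
from pathlib import Path


def concat_bytes(folder, names, target_name="raw.html"):
    """Write the raw bytes of each listed file to the target, in list order."""
    folder = Path(folder)
    target = folder / target_name
    with open(target, "wb") as out:
        for name in names:
            # No decoding: the bytes are copied exactly as they are on disk
            out.write((folder / name).read_bytes())
    return target
```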
      • John Shotsky
        Message 3 of 7 , Aug 3, 2014

          They will all be similar, and eventually all the html will be removed anyway. All I am after is the actual text, inside a very, very basic html5 structure. I know I have to remove all the individual html headers, etc., but what I do is make a single default header and delete them all, then add the new one at the top and finish the bottom. Voila! A single html5 file with the contents of the other files. After the merge is complete, my Unicode clips are run to get all characters into the ANSI range. Character codes are used for the upper characters, so no characters will be lost in NoteTab. I already have all that code working; it is just getting this list to merge that is puzzling me now. I did figure out how to get the count, and I think I can just store the text contents of the list file in a 'FileList' variable to get the names. Working on that now.

           

          Regards,
          John
          RecipeTools Web Site: http://recipetools.gotdns.com/recipetools/
          John's Mags Yahoo Group:  http://groups.yahoo.com/group/johnsmags/
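The idea of replacing upper characters with character codes can be illustrated in Python with the standard `xmlcharrefreplace` error handler. This is only a sketch of the general technique, not John's clips: it assumes decimal HTML character references are acceptable, and it targets plain ASCII rather than the full ANSI range, so characters between 128 and 255 are converted as well. The function name is hypothetical.

```python
def to_ascii_with_char_refs(text):
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")
```

For example, a one-eighth fraction character (U+215B) would come out as `&#8539;`, which a browser renders as the original character.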

           


        • Axel Berger
          Message 4 of 7 , Aug 3, 2014
            "'John Shotsky' jshotsky@... [ntb-clips]" wrote:
            > I already have all that code working,

            Sounds good.

            > it is just getting this list to merge that is puzzling me now.

            Ah well, I gave you the easy way to do that.

            Axel
          • flo.gehrke
            Message 5 of 7 , Aug 3, 2014
              > I would like to set the file names as an array of %FileList%,
              > and set up a loop that processes through my list and appends each
              > file to a default, empty file (raw.html).
              > I am not quite sure how to go from my list, which was created by my
              > tools, to the single merged file.

              I just consider the merging and disregard those HTML-issues. For this, a rough draft could be this:

              Start from an open list like...

              File1.html
              File2.html
              File3.txt

              The list must not contain any empty line.

              The following clip will prompt you to enter the folder which contains the source files and also serves as the target folder of the merged file.

              I wouldn't use ^!AppendToFile but would insert all source files into a new document that will be saved as the merged file.

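              ; Ask for the folder that holds the source files and receives the merged file
              ^!Set %Path%=^?{(T=D)Enter directory}
              ; Split the open list document into an array, one file name per line
              ^!SetListDelimiter ^%NL%
              ^!SetArray %FileList%=^$GetText$
              ^!Set %Nr%=1

              :Loop_1
              ; Read each listed file and collect its text in %RawFile%
              ^!Set %Contents%=^$GetFileText(^%Path%^%FileList^%Nr%%)$
              ^!Append %RawFile%=^%Contents%^%NL%
              ^!Inc %Nr%
              ; %FileList0% holds the number of items in the array
              ^!If ^%Nr%<=^%FileList0% Loop_1
              ; Put the collected text into a new document and save it
              ^!Toolbar New Document
              ^!InsertText ^%RawFile%
              ^!Save as ^%Path%RawFile.txt
              ^!Set %Nr%=1

              :Loop_2
              ;  Delete source files
              ^!RecycleFile ^%Path%^%FileList^%Nr%%
              ^!Inc %Nr%
              ^!If ^%Nr%<=^%FileList0% Loop_2
              ^!ClearVariables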

              > I would like to set a variable %Index% that is equal to
              > the number of items in my list

              ^!Set %Index%=^%FileList0%

              Regards,
              Flo
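The second loop of the clip, which removes the source files once the merge is done, might look like this in Python. It is a sketch under stated assumptions: the name `delete_sources` and the default merged file name are made up for illustration, and the check that the merged file exists and is not empty is an added safeguard that the clip does not have. Unlike `^!RecycleFile`, which presumably moves files to the Recycle Bin, `unlink()` deletes them permanently.

```python
from pathlib import Path


def delete_sources(folder, names, merged_name="raw.html"):
    """Delete the listed source files, but only if the merged file is in place."""
    folder = Path(folder)
    merged = folder / merged_name
    if not merged.is_file() or merged.stat().st_size == 0:
        raise FileNotFoundError(f"merged file missing or empty: {merged}")
    removed = []
    for name in names:
        source = folder / name
        if source == merged:
            continue  # never delete the merge result itself
        if source.is_file():
            source.unlink()  # permanent delete, not a move to the Recycle Bin
            removed.append(name)
    return removed
```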
            • John Shotsky
              Message 6 of 7 , Aug 3, 2014

                That would work for text files, but not for Unicode files. (GetText is ignorant of Unicode.) The Unicode has to be converted to ANSI before the file is ever saved by NoteTab. That is why I want to append them all together without opening or saving any files. NoteTab will convert Unicode characters, like the single-character fractions 1/8 and 1/3, to question marks, and save the question marks: a dead end. A wizard is not needed, because the program builds its set of files to merge automatically. No user intervention.

                Think of it this way: We are unzipping a zip file that has all the files of interest, as well as a list of what is in the zip file. I want to use that list of files to merge all the files that were extracted from the zip file. So I have a list, I have a count, I have an index that is to be incremented for each file appended, I have all the files, but I can't figure out how to make it process only the one file that is represented by the filelist+index. (The third file in the list, say.)

                I do this kind of thing already using wizards, where the user selects which files they want to merge, but I can't make this work the same way for some reason. It gives errors when opening one of the indexed files because it appears to be trying to open all the files in the list at once. Or something. In any case, the index doesn't seem to tell it which file to open, but says 'not found', as if it is looking for a file named '1'.

                 

                Regards,
                John

                 


              • John Shotsky
                Message 7 of 7 , Aug 3, 2014

                  I have solved this problem. For anyone curious about the solution, here it is:

                  I make two copies of the file list.

                  No index, no count, etc.

                  Jump to top of filelist and find a string, which is a file name. If not found, done.

                  Set %File% to 'GetSelection'.

                  Delete selection. There is now one less filename in the list.

                  Open ^%File% as Unicode

                  Perform all Unicode conversions

                  Save as ANSI

                  Close it.

                  Loop back for next file until the list is empty.

                  At this point all files are ANSI, so any method will work to merge them. I'm just repeating the above routine with the second list, appending normally. All the NoteTab functions work on ANSI, so it doesn't really matter which method I use here.

                  Tested on about 50 files, and all worked as desired.

                  Thanks for the suggestions!

                  Regards,
                  John
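The first pass described above (take a name off the list, open the file as Unicode, convert, save, repeat until the list is empty) can be sketched in Python as follows. This is not John's clip code. It assumes Unicode files start with a UTF-8 or UTF-16 byte order mark and that files without one are Windows-1252, and it writes plain ASCII with decimal character references. The function names are hypothetical.

```python
import codecs
from pathlib import Path


def decode_with_bom(data):
    """Decode bytes using the BOM if present; otherwise assume Windows-1252."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16")  # the utf-16 codec consumes the BOM
    return data.decode("cp1252", errors="replace")


def convert_all(folder, names):
    """Rewrite each listed file in place as ASCII with character references."""
    folder = Path(folder)
    pending = list(names)  # working copy; the caller's list stays intact
    while pending:
        name = pending.pop(0)  # take the next name off the top of the list
        path = folder / name
        text = decode_with_bom(path.read_bytes())
        path.write_bytes(text.encode("ascii", errors="xmlcharrefreplace"))
```

Working on a copy of the list plays the same role as keeping a second file list: the original names remain available for the later merge pass.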

                   


                   
