How to extract metadata from files (pdf, jpg, docx) using fiwalk?

  • cool_akp123
    Message 1 of 6 , Jul 12, 2011
      Please let me know how to extract metadata from any file using the command-line tool fiwalk. I have installed the tool on Ubuntu and have run the fiwalk command with the options it provides, but the next thing I want to do is extract metadata from the files. Please guide me.
    • Simson Garfinkel
      Message 2 of 6 , Jul 12, 2011
        You need to run fiwalk with the "-c" option to specify a config file. The config file specifies extraction programs that implement the DGI protocol. An example config file is provided with the distribution.
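
        A minimal sketch of how such an invocation might be scripted, in Python. Only the "-c" option comes from the message above; the file names `ficonfig.txt` and `image.dd` are hypothetical, and passing the disk image as the last positional argument is an assumption about the fiwalk command line.

```python
import shutil
import subprocess


def fiwalk_args(config_path, image_path):
    """Build the argument list for running fiwalk with a plugin config file.

    Assumption: the disk image is given as the final positional argument.
    """
    return ["fiwalk", "-c", config_path, image_path]


def run_fiwalk(config_path, image_path):
    """Run fiwalk and return its text output, or None if fiwalk is not installed."""
    if shutil.which("fiwalk") is None:
        return None
    result = subprocess.run(
        fiwalk_args(config_path, image_path),
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(fiwalk_args("ficonfig.txt", "image.dd")))
```

        Building the argument list separately from running it makes it easy to print and check the exact command before executing anything against an image.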


        On Jul 12, 2011, at 5:55 AM, cool_akp123 wrote:

        > plz let me know the process of extracting metadata from any file using command line tool fiwalk. I have installed this tool in ubuntu. I have used fiwalk command with the options provided in it. but the next thing i want to do is to extract metadata from the files. Plz guide me.
        >
        >



        [Non-text portions of this message have been removed]
      • cool_akp123
        Message 3 of 6 , Jul 12, 2011
          Thank you, sir, for your valuable reply. It helped me get the metadata of docx and odt documents, but I am getting an error while trying to extract metadata through libextractor or jpeg_extract, which uses Java.
          This is the error I get:
          filename: clerk-rec-Challan-17jul10.pdf
          partition: 1
          id: 5
          name_type: r
          filesize: 44059
          alloc: 1
          used: 1
          inode: 3
          meta_type: 1
          mode: 0
          nlink: 1
          uid: 0
          gid: 0
          crtime: 1281923120
          crtime_txt: 2010-08-16 01:45:20
          MD5: 680da7cd1907a60807894a13ecbcc249
          SHA1: 824d8ccc32470c28701958e39b65de64f96dc00e
          # plugin_process
          java.io.IOException: Cannot run program "extract": java.io.IOException: error=2, No such file or directory
          at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
          at Libextract_plugin.process(Libextract_plugin.java:38)
          at Libextract_plugin.main(Libextract_plugin.java:61)
          Caused by: java.io.IOException: java.io.IOException: error=2, No such file or directory
          at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
          at java.lang.ProcessImpl.start(ProcessImpl.java:65)
          at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
          ... 2 more
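
          The trace above shows Java failing to launch a program named "extract" with error=2 (No such file or directory), which suggests that no executable of that name is on the PATH seen by the plugin. A minimal sketch of a check, in Python; that "extract" refers to the libextractor command-line tool is an assumption based on the plugin name, not something stated in the thread.

```python
import shutil


def check_program(name):
    """Report whether an executable called `name` can be found on PATH."""
    path = shutil.which(name)
    if path is None:
        return f"{name}: not found on PATH"
    return f"{name}: found at {path}"


if __name__ == "__main__":
    # "extract" is the program the Java plugin tried to start;
    # "java" is checked as well because the plugin itself needs it.
    for program in ("extract", "java"):
        print(check_program(program))
```

          If "extract" is reported as not found, installing the tool that provides it, or adding its directory to PATH, would be the first thing to try before rerunning fiwalk.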
        • The Dog's Bollix
          Message 4 of 6 , Jul 17, 2011
            Hi,

            Is there a way to have bulk_extractor only search for emails, as opposed to all of the other data it searches for?

            As it is time-consuming to run the program, it would be nice to speed it up by limiting what it's processing.

            Any way to do this?

            TIA,

            Tony.


          • Lehr, John
            Message 5 of 6 , Jul 17, 2011
              You just need to add the "-E" parameter followed by the scanner you want to run. This turns off all scanners except the one you pass after "-E", as in:

              bulk_extractor -E email -o out_directory image.dd

              ---------------------------------
              John Lehr
              Evidence Technician
              San Luis Obispo Police Department
              ________________________________________
              From: linux_forensics@yahoogroups.com [linux_forensics@yahoogroups.com] On Behalf Of The Dog's Bollix [isxpro@...]
              Sent: Sunday, July 17, 2011 10:07 AM
              To: linux_forensics@yahoogroups.com
              Subject: [linux_forensics] Limiting bulk_extractor output

              Hi,

              Is there a way to have bulk_extractor only search for emails, as opposed to all of the other data it searches for?

              As it is time-consuming to run the program, it would be nice to speed it up by limiting what it's processing.

              Any way to do this?

              TIA,

              Tony.

            • Simson Garfinkel
              Message 6 of 6 , Jul 17, 2011
                The one danger of running -E email is that you will not extract email addresses from compressed data. Therefore I recommend:

                -E email -e zip -e gzip -e hiber -e pdf

                Gosh, I should have a way for simply enabling all of the recursive scanners. Hm...

                Simson
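
                The recommendation above can be combined with the earlier example command. A minimal sketch in Python that assembles the command line; the flags and scanner names come from the messages in this thread, while the function name and the `out_directory` / `image.dd` paths are illustrative.

```python
import shlex


def bulk_extractor_args(only, also, out_dir, image):
    """Build a bulk_extractor argument list.

    -E <scanner> turns off every scanner except <scanner>;
    each -e <scanner> then re-enables one additional scanner.
    """
    args = ["bulk_extractor", "-E", only]
    for scanner in also:
        args += ["-e", scanner]
    args += ["-o", out_dir, image]
    return args


if __name__ == "__main__":
    cmd = bulk_extractor_args(
        only="email",
        also=["zip", "gzip", "hiber", "pdf"],
        out_dir="out_directory",
        image="image.dd",
    )
    # Print the command rather than running it, so it can be reviewed first.
    print(shlex.join(cmd))
```

                Keeping the extra scanners in a list makes it simple to add or drop one without retyping the whole command.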


                On Jul 17, 2011, at 4:00 PM, Lehr, John wrote:

                > You just need to add the '-E' parameter along with the scanner you want to run. This turns off all scanners except the scanner you pass after the "-E" parameter, as in:
                >
                > bulk_extractor -E email -o out_directory image.dd
                >
                > ---------------------------------
                > John Lehr
                > Evidence Technician
                > San Luis Obispo Police Department