Re: [govtrack] XML versions of Congressional Record debates

  • Joshua Tauberer / GovTrack.us
    Message 1 of 11 , Feb 20, 2006
      alphasigbryan wrote:
      > I was looking at the raw XML data available on the GovTrack.us site,
      > and I couldn't find XML versions of the debates from the
      > Congressional Record. Are these available? Where would they be
      > located?

      Hi, Bryan. Yep, they're here:

      http://www.govtrack.us/data/us/109/cr/

      > I've been trying to build a parser for that data for some time, and
      > it has proved to be rather difficult.

      Yeah, that was for sure the oddest text to parse, and I find bugs with
      the parser pretty often.

      --
      - Joshua Tauberer

      http://taubz.for.net

      ** Nothing Unreal Exists **
    • John Slevin
      Message 2 of 11 , Feb 20, 2006
        Call your US Representative's office and ask them; they'll at least be able to tell you who can answer your question.

        alphasigbryan <helmkam1@...> wrote:
        Hello,

        I was looking at the raw XML data available on the GovTrack.us site,
        and I couldn't find XML versions of the debates from the Congressional
        Record.

        Are these available? Where would they be located?

        I've been trying to build a parser for that data for some time, and it
        has proved to be rather difficult.

        Thanks,

        -Bryan









        Yahoo! Groups Links

        <*> To visit your group on the web, go to:
        http://groups.yahoo.com/group/govtrack/

        <*> To unsubscribe from this group, send an email to:
        govtrack-unsubscribe@yahoogroups.com

        <*> Your use of Yahoo! Groups is subject to:
        http://docs.yahoo.com/info/terms/






        In Liberty,

        John P Slevin
      • Bryan Helmkamp
        Message 3 of 11 , Feb 20, 2006
          Actually, upon doing some more searching, I found what I was looking for in the "cr" directories. It appears that the newest data is a couple of weeks old, though. How often does GovTrack update? Do the updates appear in the /data/ directory immediately?

          -Bryan


          --
          http://www.MyCongress.org/ -- coming soon



        • Joshua Tauberer / GovTrack.us
          Message 4 of 11 , Feb 20, 2006
            Bryan Helmkamp wrote:
            > Actually, upon doing some more searching, I found what I was looking for
            > in the "cr" directories. It appears that the newest data is a couple
            > weeks old though. How often does GovTrack update? Are the updates
            > found in the /data/ directory immediately?

            Uh, yeah, apparently I forgot to comment something out after making some
            changes and so GovTrack hasn't been fetching them for a few weeks. Doh.

            They're downloading now (and appear immediately in that directory;
            updates are daily, when I don't mess things up).

            Thanks for pointing this out!

            --
            - Joshua Tauberer

            http://taubz.for.net

            "Unfortunately, we're having this discussion. It's too bad,
            because guess who listens to the discussion: the enemy."
          • Bryan Helmkamp
            Message 5 of 11 , Feb 20, 2006
              Joshua,

              Have you ever thought about opening the parser source code for collaboration?

              -Bryan

            • Joshua Tauberer / GovTrack.us
              Message 6 of 11 , Feb 20, 2006
                Bryan Helmkamp wrote:
                > Have you ever thought about opening the parser source code for collaboration?

                Hey, Bryan.

                I've thought about it, and I'm not immediately opposed to it. But, it
                would take some effort to tidy things up, and to set up a svn
                repository, before I could do that. I'm also not eager to give the
                commercial services any freebies. And lastly, no one has expressed a
                real interest in contributing before.

                If you're really serious about it, I'll put that on my list of things to do.

                Is there anything in particular you'd be interested in doing/improving
                with the parsers?

                --
                - Joshua Tauberer

                http://taubz.for.net

                "Unfortunately, we're having this discussion. It's too bad,
                because guess who listens to the discussion: the enemy."
              • Bryan Helmkamp
                Message 7 of 11 , Feb 20, 2006
                  Joshua,

                  One big thing I'd like to do is get the parser to output into a SQL
                  database, as opposed to just XML.
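
As a rough illustration of the XML-to-SQL idea, here is a minimal sketch that loads one record file into SQLite. It is not GovTrack's parser: the element names (`record`, `speaking`, `narrative`), the attributes, the speaker IDs, and the `cr_items` table are all assumptions invented for the example, and the real files under /data/us/109/cr/ may be laid out quite differently.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input: the element and attribute names below are invented
# for illustration and are not GovTrack's actual schema.
SAMPLE_XML = """
<record date="2006-02-01" chamber="h">
  <speaking speaker="400001">Mr. Speaker, I rise in support of the bill.</speaking>
  <narrative>The Clerk read the title of the bill.</narrative>
  <speaking speaker="400002">I yield back the balance of my time.</speaking>
</record>
"""


def load_record(xml_text, conn):
    """Insert each child element of a record as one row; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cr_items ("
        " date TEXT, chamber TEXT, position INTEGER,"
        " kind TEXT, speaker_id TEXT, body TEXT)"
    )
    root = ET.fromstring(xml_text)
    rows = [
        (root.get("date"), root.get("chamber"), position,
         item.tag, item.get("speaker"), (item.text or "").strip())
        for position, item in enumerate(root)
    ]
    conn.executemany("INSERT INTO cr_items VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Usage: load the sample into an in-memory database and query the speeches.
conn = sqlite3.connect(":memory:")
inserted = load_record(SAMPLE_XML, conn)
speeches = conn.execute(
    "SELECT speaker_id, body FROM cr_items WHERE kind = 'speaking' ORDER BY position"
).fetchall()
```

Storing the element tag in a `kind` column keeps narrative passages alongside speeches, so a front end could hide or show them with a simple filter rather than dropping them at parse time.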

                  Besides that, I'd like to take a look at the possibility of making
                  some incremental improvements... for example, perhaps it would be
                  possible to keep track of who is the chair at any given time.

                  Another thing I noticed is sometimes narrative actions in the CR
                  source don't get included in your XML. I'd like to see about getting
                  those all in there.

                  If you're game for it, don't worry about tidying up the source code.
                  I won't hold any GOTO statements against you. :) I'd just rather dig
                  in asap.

                  Let me know.

                  -Bryan


                  --
                  http://www.MyCongress.org/ -- coming soon
                • Joshua Tauberer
                  Message 8 of 11 , Feb 21, 2006
                    --- In govtrack@yahoogroups.com, "Bryan Helmkamp" <helmkam1@...> wrote:
                    > One big thing I'd like to do is get the parser to output into a SQL
                    > database, as opposed to just XML.

                    Ok...

                    > Besides that, I'd like to take a look at the possibility of making
                    > some incremental improvements... for example, perhaps it would be
                    > possible to keep track of who is the chair at any given time.

                    I haven't even noticed that information in the record.

                    > Another thing I noticed is sometimes narrative actions in the CR
                    > source don't get included in your XML. I'd like to see about getting
                    > those all in there.

                    Right. They often go on for pages with the text of legislation,
                    amendments, and roll calls that quickly clutter up the main purpose of
                    the files. I'm sure I could just set a flag to keep them in, although
                    I wouldn't want to do that for GovTrack.

                    > If you're game for it, don't worry about tidying up the source code.
                    > I won't hold any GOTO statements against you. :) I'd just rather dig
                    > in asap.

                    Heh, well, it's a bit more than that. The person-name-to-id system is
                    tied to a database which you wouldn't have access to, for instance.
                    The roll call votes stuff (which you may not be interested in now) is
                    tied to various data files and programs to generate the maps.

                    (For some reason I didn't get your message, although Yahoo says it
                    sent it, so I'm replying via Yahoo. Strange.)

                    - Josh
                  • Bryan Helmkamp
                    Message 9 of 11 , Mar 7, 2006
                      Hi, Josh.

                      On 2/21/06, Joshua Tauberer <tauberer@...> wrote:
                      > Right. They often go on for pages with the text of legislation,
                      > amendments, and roll calls that quickly clutter up the main purpose of
                      > the files. I'm sure I could just set a flag to keep them in, although
                      > I wouldn't want to do that for GovTrack.

                      In those cases, what I'd like to do is display it like GovTrack does,
                      but add a link to view the hidden content if you wish.

                      > Heh, well, it's a bit more than that. The person-name-to-id system is
                      > tied to a database which you wouldn't have access to, for instance.
                      > The roll call votes stuff (which you may not be interested in now) is
                      > tied to various data files and programs to generate the maps.

                      If I had to do this from scratch, I'd have to write a name-to-id
                      matching system anyway, and I'm not interested in the roll call votes
                      just yet.
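
For a sense of what a from-scratch name-to-id matcher might involve, here is a minimal sketch. It assumes speaker labels look like "Mr. SMITH of Texas" and that a roster keyed by surname and state is available; the roster entries, the IDs, and the `match_speaker` helper are hypothetical and have nothing to do with GovTrack's own people database.

```python
import re

# Hypothetical roster: these names and IDs are made up for illustration.
ROSTER = {
    ("smith", "texas"): "P0001",
    ("smith", "new jersey"): "P0002",
    ("jones", None): "P0003",
}

# Courtesy titles that may precede the surname in a speaker label.
TITLE = re.compile(r"^(mr|mrs|ms|miss|dr)\.?\s+", re.IGNORECASE)


def match_speaker(label):
    """Return the roster ID for a speaker label, or None if there is no unique match."""
    text = TITLE.sub("", label.strip())
    # Split "SMITH of Texas" into surname and state; state is "" if absent.
    name, _, state = text.partition(" of ")
    name = name.strip().lower()
    state = state.strip().lower() or None
    if (name, state) in ROSTER:
        return ROSTER[(name, state)]
    # No state given: accept the surname only if exactly one member has it.
    if state is None:
        candidates = [pid for (n, _), pid in ROSTER.items() if n == name]
        if len(candidates) == 1:
            return candidates[0]
    return None
```

Returning None for an ambiguous surname, rather than guessing, leaves those cases visible so they can be resolved by hand or by extra context such as the chamber and date.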

                      Basically, if you could just dump the relevant portion of source on
                      me, I've got a lot of free time together and I could get it going. I
                      think that would be ideal for both of us, short term.

                      What do you think?

                      -Bryan


                      --
                      http://www.MyCongress.org/ -- coming soon
                    • Joshua Tauberer / GovTrack.us
                      Message 10 of 11 , Mar 7, 2006
                        Bryan Helmkamp wrote:
                        > Basically, if you could just dump the relevant portion of source on
                        > me, I've got a lot of free time together and I could get it going. I
                        > think that would be ideal for both of us, short term.

                        But maybe a very short term. The next time someone wants to use the
                        sources, I don't want to go through the pain of re-explaining how it
                        comes together, how to set up the people db, etc. Plus there's no way
                        for us to keep our versions in sync as changes are made (including
                        changes to the database of people, for instance).

                        I guess the thing is that opening up the scripts is a low priority for
                        me (sorry), especially if it's just a short-term solution. I would much
                        rather enhance my scripts so that all of the original information makes
                        it into files downloadable in the data directory, and then you can just
                        use that.

                        --
                        - Joshua Tauberer

                        http://taubz.for.net

                        "Unfortunately, we're having this discussion. It's too bad,
                        because guess who listens to the discussion: the enemy."