Re: [govtrack] XML versions of Congressional Record debates

  • Bryan Helmkamp
    Message 1 of 11, Feb 20, 2006
      Joshua,

      Have you ever thought about opening the parser source code for collaboration?

      -Bryan

      On 2/20/06, Joshua Tauberer / GovTrack.us <tauberer@...> wrote:
      > Bryan Helmkamp wrote:
      > > Actually, upon doing some more searching, I found what I was looking for
      > > in the "cr" directories. It appears that the newest data is a couple
      > > weeks old though. How often does GovTrack update? Are the updates
      > > found in the /data/ directory immediately?
      >
      > Uh, yeah, apparently I forgot to comment something out after making some
      > changes and so GovTrack hasn't been fetching them for a few weeks. Doh.
      >
      > They're downloading now (and appear immediately in that directory;
      > updates are daily, when I don't mess things up).
      >
      > Thanks for pointing this out!
    • Joshua Tauberer / GovTrack.us
      Message 2 of 11, Feb 20, 2006
        Bryan Helmkamp wrote:
        > Have you ever thought about opening the parser source code for collaboration?

        Hey, Bryan.

        I've thought about it, and I'm not immediately opposed to it. But, it
        would take some effort to tidy things up, and to set up a svn
        repository, before I could do that. I'm also not eager to give the
        commercial services any freebies. And lastly, no one has expressed a
        real interest in contributing before.

        If you're really serious about it, I'll put that on my list of things to do.

        Is there anything in particular you'd be interested in doing/improving
        with the parsers?

        --
        - Joshua Tauberer

        http://taubz.for.net

        "Unfortunately, we're having this discussion. It's too bad,
        because guess who listens to the discussion: the enemy."
      • Bryan Helmkamp
        Message 3 of 11, Feb 20, 2006
          Joshua,

          One big thing I'd like to do is get the parser to output into a SQL
          database, as opposed to just XML.

          Besides that, I'd like to take a look at the possibility of making
          some incremental improvements... for example, perhaps it would be
          possible to keep track of who is the chair at any given time.

          Another thing I noticed is sometimes narrative actions in the CR
          source don't get included in your XML. I'd like to see about getting
          those all in there.

          If you're game for it, don't worry about tidying up the source code.
          I won't hold any GOTO statements against you. :) I'd just rather dig
          in asap.

          Let me know.

          -Bryan

          On 2/20/06, Joshua Tauberer / GovTrack.us <tauberer@...> wrote:
          > Hey, Bryan.
          >
          > I've thought about it, and I'm not immediately opposed to it. But, it
          > would take some effort to tidy things up, and to set up a svn
          > repository, before I could do that. I'm also not eager to give the
          > commercial services any freebies. And lastly, no one has expressed a
          > real interest in contributing before.
          >
          > If you're really serious about it, I'll put that on my list of things to do.
          >
          > Is there anything in particular you'd be interested in doing/improving
          > with the parsers?
          >
          > --
          > - Joshua Tauberer
          >
          > http://taubz.for.net
          >
          > "Unfortunately, we're having this discussion. It's too bad,
          > because guess who listens to the discussion: the enemy."


          --
          http://www.MyCongress.org/ -- coming soon
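As a rough illustration of the XML-to-SQL idea raised in this message, a minimal sketch might look like the following. It uses Python's standard `xml.etree.ElementTree` and `sqlite3` modules. The element and attribute names (`record`, `speaking`, `speaker`, `topic`, `paragraph`), the sample data, and the `speech` table layout are all assumptions made for illustration; the actual files in the "cr" directories may be structured differently.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are assumptions,
# not the confirmed layout of GovTrack's Congressional Record XML.
SAMPLE_XML = """
<record date="2006-02-16">
  <speaking speaker="400001" topic="Budget">
    <paragraph>Mr. President, I rise today.</paragraph>
    <paragraph>I yield the floor.</paragraph>
  </speaking>
  <speaking speaker="400002" topic="Budget">
    <paragraph>I thank the Senator.</paragraph>
  </speaking>
</record>
"""


def load_record(conn, xml_text):
    """Insert one row per <speaking> element and return the number of rows."""
    root = ET.fromstring(xml_text)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS speech ("
        "id INTEGER PRIMARY KEY, date TEXT, speaker TEXT, topic TEXT, body TEXT)"
    )
    rows = []
    for sp in root.iter("speaking"):
        # Join the paragraphs of one speech into a single text body.
        body = "\n".join((p.text or "").strip() for p in sp.iter("paragraph"))
        rows.append((root.get("date"), sp.get("speaker"), sp.get("topic"), body))
    # Parameterized inserts avoid quoting problems in the speech text.
    conn.executemany(
        "INSERT INTO speech (date, speaker, topic, body) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
count = load_record(conn, SAMPLE_XML)
```

Working from the published XML rather than from the parser itself keeps a loader like this independent of the parser's internals.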
        • Joshua Tauberer
          Message 4 of 11, Feb 21, 2006
            --- In govtrack@yahoogroups.com, "Bryan Helmkamp" <helmkam1@...> wrote:
            > One big thing I'd like to do is get the parser to output into a SQL
            > database, as opposed to just XML.

            Ok...

            > Besides that, I'd like to take a look at the possibility of making
            > some incremental improvements... for example, perhaps it would be
            > possible to keep track of who is the chair at any given time.

            I haven't even noticed that information in the record.

            > Another thing I noticed is sometimes narrative actions in the CR
            > source don't get included in your XML. I'd like to see about getting
            > those all in there.

            Right. They often go on for pages with the text of legislation,
            amendments, and roll calls that quickly clutter up the main purpose of
            the files. I'm sure I could just set a flag to keep them in, although
            I wouldn't want to do that for GovTrack.

            > If you're game for it, don't worry about tidying up the source code.
            > I won't hold any GOTO statements against you. :) I'd just rather dig
            > in asap.

            Heh, well, it's a bit more than that. The person-name-to-id system is
            tied to a database which you wouldn't have access to, for instance.
            The roll call votes stuff (which you may not be interested in now) is
            tied to various data files and programs to generate the maps.

            (For some reason I didn't get your message, although Yahoo says it
            sent it, so I'm replying via Yahoo. Strange.)

            - Josh
          • Bryan Helmkamp
            Message 5 of 11, Mar 7 12:07 PM
              Hi, Josh.

              On 2/21/06, Joshua Tauberer <tauberer@...> wrote:
              > Right. They often go on for pages with the text of legislation,
              > amendments, and roll calls that quickly clutter up the main purpose of
              > the files. I'm sure I could just set a flag to keep them in, although
              > I wouldn't want to do that for GovTrack.

              In those cases, what I'd like to do is display it like GovTrack does,
              but add a link to view the hidden content if you wish.

              > Heh, well, it's a bit more than that. The person-name-to-id system is
              > tied to a database which you wouldn't have access to, for instance.
              > The roll call votes stuff (which you may not be interested in now) is
              > tied to various data files and programs to generate the maps.

              If I had to do this from scratch, I'd have to write a name-to-id
              matching system anyway, and I'm not interested in the roll call votes
              just yet.

              Basically, if you could just dump the relevant portion of source on
              me, I've got a lot of free time together and I could get it going. I
              think that would be ideal for both of us, short term.

              What do you think?

              -Bryan


              --
              http://www.MyCongress.org/ -- coming soon
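For readers wondering what a from-scratch name-to-id matcher could involve, here is a minimal sketch. It assumes speaker labels of the form "Mr. SMITH of Texas" and a small in-memory lookup table. The ids, the `PEOPLE` and `STATES` tables, and the `name_to_id` function are hypothetical and are not related to GovTrack's actual people database.

```python
import re

# Hypothetical lookup table: (lowercased surname, state or None) -> person id.
# A state is only stored where the surname alone would be ambiguous.
PEOPLE = {
    ("smith", "TX"): 1001,
    ("smith", "NJ"): 1002,
    ("jones", None): 1003,
}

# Hypothetical mapping from state names as written in labels to postal codes.
STATES = {"texas": "TX", "new jersey": "NJ"}

# Leading courtesy titles to strip before matching.
TITLE = re.compile(r"^(mr|mrs|ms|miss|dr)\.?\s+", re.IGNORECASE)


def name_to_id(label):
    """Return the id for a speaker label, or None if it cannot be resolved."""
    text = TITLE.sub("", label.strip())
    # "SMITH of Texas" splits into ("SMITH", " of ", "Texas");
    # a label without " of " leaves the state part empty.
    name, _, state = text.partition(" of ")
    key_state = STATES.get(state.strip().lower()) if state else None
    return PEOPLE.get((name.strip().lower(), key_state))


print(name_to_id("Mr. SMITH of Texas"))
print(name_to_id("Ms. JONES"))
```

In this sketch an ambiguous label such as "Mr. SMITH" with no state returns `None` rather than guessing, which leaves such cases to be resolved by hand or by surrounding context.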
            • Joshua Tauberer / GovTrack.us
              Message 6 of 11, Mar 7 4:41 PM
                Bryan Helmkamp wrote:
                > Basically, if you could just dump the relevant portion of source on
                > me, I've got a lot of free time together and I could get it going. I
                > think that would be ideal for both of us, short term.

                But maybe a very short term. The next time someone wants to use the
                sources, I don't want to go through the pain of re-explaining how it
                comes together, how to set up the people db, etc. Plus there's no way
                for us to keep our versions in sync as changes are made (including
                changes to the database of people, for instance).

                I guess the thing is that opening up the scripts is a low priority for
                me (sorry), especially if it's just a short-term solution. I would much
                rather enhance my scripts so that all of the original information makes
                it into files downloadable in the data directory, and then you can just
                use that.

                --
                - Joshua Tauberer

                http://taubz.for.net

                "Unfortunately, we're having this discussion. It's too bad,
                because guess who listens to the discussion: the enemy."