Yeah, the best thing I could think of for the title parser would be to
look at the "Latest Title" element, and if that title appears nowhere
in the list of titles, assume it's the latest short or official title
for the bill. Not ideal.
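
Roughly what I have in mind, as a sketch rather than anything taken from the real GovTrack parser: the function name, the helper, and the normalization rules below are all hypothetical, and it assumes the scraper already has the "Latest Title" string and the titles scraped from the "Titles" page as plain Python strings.

```python
import re


def _norm(title):
    # Hypothetical normalization: collapse whitespace, drop a trailing
    # period, and ignore case, so trivial differences between the two
    # THOMAS pages are not mistaken for a brand-new title.
    return re.sub(r"\s+", " ", title).strip().rstrip(".").lower()


def merge_latest_title(latest_title, listed_titles):
    """Return the titles list, with latest_title appended if it is missing.

    Assumption: a "Latest Title" that appears nowhere on the "Titles"
    page is the newest short or official title for the bill.
    """
    known = {_norm(t) for t in listed_titles}
    if latest_title and _norm(latest_title) not in known:
        return list(listed_titles) + [latest_title.strip()]
    return list(listed_titles)


# Illustrative data only; "Some Earlier Title" is a placeholder.
titles = merge_latest_title("Middle Class Tax Relief Act", ["Some Earlier Title"])
```

This can't tell whether the missing title is a short title or an official one, so whatever consumes the result would still have to guess at that.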
On Fri, Dec 17, 2010 at 6:38 PM, Josh Tauberer <tauberer@...> wrote:
> I'm not sure what was going on with that bill. The revised title "Middle
> Class..." ought to have been listed on the Titles page, which is where
> GovTrack picks up the data. I'm not sure what to do in that case.
> And thanks for pointing out the missing parse of that action line. I'm
> re-parsing all bill data, back to 1990 or whatever it is.
> - Josh Tauberer
> - CivicImpulse / GovTrack.us
> http://razor.occams.info | www.govtrack.us | civicimpulse.com
> "Members of both sides are reminded not to use guests of the
> House as props."
> On 12/13/2010 05:21 PM, Eric Mill wrote:
>> Aaaaand, another breakage in the same bill, not sure if it's related to
>> THOMAS' changes or not. Hope my repeated emails aren't too aggravating. :)
>> There's an <action> element that should be a <vote>:
>> <action datetime="2010-12-02T15:55:00-05:00"><text>On motion that the
>> House agree with an amendment to the Senate amendment Agreed to by the
>> Yeas and Nays: 234 - 188 (Roll no. 604).</text><reference label="text as
>> House agreed to Senate amendment with amendment" ref="CR
>> -- Eric
>> On Mon, Dec 13, 2010 at 5:14 PM, Eric Mill <eric@...
>> <mailto:eric@...>> wrote:
>> Yep, I can already see a breakage -
>> THOMAS lists the "Latest Title" (which just changed wording from
>> "Title") of HR 4853 (the tax-cut compromise bill) as the Middle
>> Class Tax Relief Act, but GovTrack doesn't have that title. On the
>> "Titles" page, THOMAS still only mentions the Middle Class Tax
>> Relief Act title in the context of the "Latest Title".
>> -- Eric
>> On Mon, Dec 13, 2010 at 4:37 PM, Eric Mill
>> <eric@... <mailto:eric@...>> wrote:
>> This may pose a risk to your scrapers. Though the Dublin Core
>> elements may simplify getting at least those elements.
>> Also, I'm not entirely sure, but I think the link to FDSys
>> instead of GPO will mean more accurate bill text archiving?
>> -- Eric
>> ---------- Forwarded message ----------
>> THOMAS: The Last Update of the Year
>> December 13th, 2010 by Andrew Weber
>> This has been an exciting year working on THOMAS
>> <http://thomas.loc.gov/>. We started enhancing the site
>> in January <http://thomas.gov/home/whatsnew.html> on its
>> fifteenth anniversary
>> <http://www.loc.gov/today/pr/1995/95-002.html>, expanded some of
>> the improvements in June
>> <http://thomas.gov/home/whatsnew062010.html>, and revamped it
>> more during the summer recess
>> There’s been some nice press
>> Our goal at the start of the year was to roll out four sets of
>> enhancements. With only a few weeks to spare, we’ve done it.
>> In this fourth and final update of the year, THOMAS links to GPO
>> Access <http://www.gpoaccess.gov/> were converted to FDsys
>> <http://www.gpo.gov/fdsys/search/home.action>, search was
>> enhanced, more detail was added to the Bill Summary & Status
>> display, and additional metadata was added. The complete list
>> of enhancements
>> <http://thomas.gov/home/whatsnew120710.html> includes:
>> *FDsys Links*
>> The Government Printing Office (GPO) is phasing out GPO
>> Access. All GPO Access links in THOMAS have been changed so
>> that they correctly link to the new FDsys website. For
>> example, THOMAS previously linked to GPO for the PDF of
>> bills. Now it links to FDsys.
>> *Search Enhancement*
>> The Bill Summary & Status Search results used to be sorted
>> alphabetically so that, e.g., HConRes appeared before HR.
>> Users requested that the most important bill types appear
>> first. Now, the new order will get users to what they want
>> faster and will help THOMAS users find the most important
>> bills from a search.
>> *Latest Action*
>> The Bill Summary & Status page will now display the Latest
>> Action (where it differs from Latest Major Action). The
>> Latest Major Action in the Bill Summary & Status display
>> does not always show actions considered important to our
>> data providers.
>> *Latest Title*
>> The title on the Bill Summary & Status display is now listed
>> as Latest Title. A bill can have many titles and the Bill
>> Summary & Status component page “Titles” displays all titles
>> of a bill. The title displayed on the Bill Summary & Status
>> was labeled “Title.” It was either the most recent official
>> title (if there was no short title) or the most recent short
>> title. Changing it to say “Latest Title,” is now more
>> *Cosponsors’ List*
>> Cosponsors are now listed in a single column. Previously,
>> users had to read across two columns, which was confusing.
>> Now, the new order improves usability and avoids the
>> difficulty that was previously experienced in reading
>> cosponsor names.
>> *Metadata Enhancement*
>> Dublin Core meta tags were added to the top level Bill
>> Summary & Status display page. In addition to the previous
>> metadata for bills, there are Dublin Core tags for title,
>> creator (the sponsor in THOMAS), date, identifier (the
>> THOMAS handle), and type (bill type).
>> Daniel Schuman
>> Director | Advisory Committee on Transparency
>> Policy Counsel | The Sunlight Foundation
>> o: 202-742-1520 x 273 | c: 202-713-5795 | @danielschuman