This is one of those cases where a deep screen-scrape of THOMAS goes
wrong because their interface isn't clear. GovTrack had been using
THOMAS's All Information page to get the sponsor, cosponsors, and actions
(the actions are what give the status). This page has a long-standing bug:
it gets truncated if it is too long:
Actually, I use this URL, which isn't linked from the site and gives the
same content minus the summary, so the truncation kicks in later. I get
the summary with a separate fetch of the CRS Summary page:
I reported this bug to them probably four years ago and never heard back
(except maybe for an initial acknowledgment).
Since the last few action lines are cut off, the scraper didn't get a
chance to see the bill was enacted.
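To make the failure mode concrete, here is a minimal sketch (not GovTrack's
actual scraper) of deriving a status from a chronological list of action
lines. The function name, the matching phrases, and the status strings other
than PASSED:BILL are assumptions made up for illustration, as are the sample
action lines.

```python
def derive_status(actions):
    """Return a coarse status from action lines in chronological order.

    Hypothetical rules for illustration only; the real scraper's
    matching logic and status names may differ.
    """
    status = "INTRODUCED"
    for text in actions:
        lowered = text.lower()
        if "became public law" in lowered:
            status = "ENACTED"
        elif "passed" in lowered:
            status = "PASSED:BILL"
    return status


# Illustrative action lines, not copied from any real bill page.
full_page = [
    "Introduced in House.",
    "Passed House.",
    "Passed Senate.",
    "Signed by President.",
    "Became Public Law.",
]

# A truncated page simply loses its final lines.
truncated_page = full_page[:3]

print(derive_status(full_page))
print(derive_status(truncated_page))
```

With the last lines missing, nothing tells the scraper the bill went any
further than passage, so the status falls back to the last thing it did see.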
Anyway, I've now switched to using the All Congressional Action page for
sponsor and actions, which doesn't have this problem (or if it does, the
output is a bit shorter, so at least in this case it doesn't get cut off):
Plus a second fetch to the page specifically for cosponsors.
That brings the total number of page fetches per bill to 8, and since
THOMAS requires no more than one access per second, this is why it takes
so long to update info.
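For a sense of what that limit costs, here is a rough sketch of a
one-request-per-second throttle plus the arithmetic for a full scrape. The
class and function names are made up for illustration, and the clock and
sleep functions are injectable only so the logic can be exercised without
actually waiting.

```python
import time


class Throttle:
    """Space out calls so they start at least min_interval seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # start time of the previous call, if any

    def wait(self):
        """Block until min_interval has elapsed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def min_scrape_seconds(num_bills, fetches_per_bill=8, min_interval=1.0):
    """Approximate minimum wall-clock time for a scrape, ignoring latency."""
    return num_bills * fetches_per_bill * min_interval
```

Calling wait() before every page fetch keeps the scraper under the limit.
At 8 fetches per bill, min_scrape_seconds(n) works out to about 8 seconds
per bill before any network latency is counted.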
I'll re-run the scrape on all bills this session (and later on earlier
sessions) to make sure this doesn't cause problems on other bills.
- Josh Tauberer
- CivicImpulse / GovTrack.us
| www.govtrack.us | civicimpulse.com
"Members of both sides are reminded not to use guests of the
House as props."
On 04/05/2010 04:23 PM, Eric Mill wrote:
> There's some kind of issue here - the bill status has now backtracked
> to just PASSED:BILL, and the <signed> element is missing entirely.
> And the GovTrack page for the bill now has the Signed By President
> checkbox as unchecked:
> -- Eric
> On Wed, Mar 31, 2010 at 3:48 PM, Eric Mill<eric@...> wrote:
>> Is there a delay between when a bill is signed into law, and gets a
>> <signed> item in its <actions> history, and when it gets an <enacted>
>> item? I ask because the reconciliation bill, HR 4872, has <signed>
>> but not <enacted> in its <actions> history.
>> The reconciliation bill:
>> Versus the senate health care bill:
>> -- Eric