Mapping a vote to a specific text version of a bill #313

Open
TTrapper opened this issue Sep 18, 2024 · 3 comments

Comments

@TTrapper

TTrapper commented Sep 18, 2024

Once I've downloaded votes and bill texts, is there a way to map one to the other? I'm currently doing this manually by extracting the bill_id from the vote data, then searching through the downloaded bills. It works, but I'm still not sure if/how I can map a vote to a particular text_version (e.g. ih, pcs, eh, etc.). Thanks!

@JoshData
Member

I don't think there is a direct way to map a vote to a text version. There's no text version in the XML vote data from the House or Senate. Often the text of what is being voted on hasn't been published yet. In some cases that's because the vote outcome is what causes the new text to be published. And there's no vote ID in the govinfo text MODS metadata. A vote ID there would be helpful, but a lot of votes don't have IDs at all (anything that isn't a roll call vote).

The govinfo bill text MODS metadata has a "Last Action Date Listed" field (which I guess is the originInfo/dateIssued element) which might correspond to the date of a congressional action, but I recall that I haven't found it reliable. I think one reason is that fast-moving bills can have multiple significant actions on the same date.

For GovTrack, I have a map from bill text versions to bill status codes that are emitted by this project. The bill status codes are determined by parsing the govinfo BILLSTATUS XML data's action list. That usually works well. https://github.com/govtrack/govtrack.us-web/blob/main/bill/billtext.py
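
A rough sketch of what such a lookup could look like. The status strings follow the style of the codes this project emits (for example PASS_OVER:HOUSE), but the specific pairings in VERSION_TO_STATUS are illustrative assumptions and are not copied from billtext.py. The helper names status_for_version and versions_for_status are hypothetical.

```python
# Illustrative only: these pairings are assumptions, not GovTrack's actual table.
VERSION_TO_STATUS = {
    "ih": "INTRODUCED",
    "is": "INTRODUCED",
    "rh": "REPORTED",
    "rs": "REPORTED",
    "eh": "PASS_OVER:HOUSE",
    "es": "PASS_OVER:SENATE",
    "enr": "PASSED:BILL",
}


def status_for_version(version_code):
    """Return the assumed status code for a text version code, or None if unmapped."""
    return VERSION_TO_STATUS.get(version_code.lower())


def versions_for_status(status):
    """Return, sorted, every text version code assumed to correspond to a status."""
    return sorted(code for code, s in VERSION_TO_STATUS.items() if s == status)
```

For a passage vote, the reverse lookup is probably the more useful direction: the status the vote produces points at the text version published as a result of it.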

@TTrapper
Author

TTrapper commented Sep 19, 2024

Ok, this is very interesting, and a bit surprising, since the vote data is difficult to interpret if we can't say exactly what was being voted on. I understand this is an upstream issue and not a failing of this codebase.

What do you think of an approach like the following, where I parse the BILLSTATUS XML and try to find the text version whose date is on or before the date of the vote?

edit: an important caveat is that I am only interested in votes that have to do with the passage of a bill, so if this method has flaws in other scenarios that's fine.

from datetime import timezone

# Assumption: parse_date is dateutil's parser; the original snippet did not show its imports.
from dateutil.parser import parse as parse_date


def get_voted_text_version(vote_datetime, bill_status_xml):
    # vote_datetime must be timezone-aware so it can be compared with the UTC version dates
    latest_textversion_type = None
    latest_textversion_date = None

    # Find the latest text version dated on or before the vote
    text_versions = bill_status_xml.find('.//textVersions')
    if text_versions is not None:
        for version_item in text_versions.findall('./item'):
            date_text = version_item.findtext('date')
            if not date_text:
                continue  # guard against a missing or empty date element
            version_date = parse_date(date_text).astimezone(timezone.utc)

            # Keep the most recent version that is not after the vote
            if version_date <= vote_datetime:
                if latest_textversion_date is None or version_date > latest_textversion_date:
                    latest_textversion_date = version_date
                    latest_textversion_type = version_item.find('type').text

    # Map the found text version type to the corresponding code
    type_code_map = {
        'engrossed in house': 'eh',
        'introduced in house': 'ih',
        'received in senate': 'rds',
        'referred in senate': 'rfs',
        'placed on calendar senate': 'pcs',
        'engrossed in senate': 'es',
        'engrossed amendment senate': 'eas',
        'reported in house': 'rh',
    }

    return type_code_map.get(latest_textversion_type.lower()) if latest_textversion_type else None
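
One way to see where this breaks is a variant that returns every version sharing the latest qualifying date instead of picking one, so same-day ties become visible. This is a minimal stdlib-only sketch: SAMPLE_XML is a made-up fragment shaped like a BILLSTATUS textVersions list, and parse_utc and candidate_versions are hypothetical names.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Made-up fragment shaped like the textVersions list in a BILLSTATUS file.
SAMPLE_XML = """
<bill>
  <textVersions>
    <item><type>Introduced in House</type><date>2023-01-09T05:00:00Z</date></item>
    <item><type>Reported in House</type><date>2023-03-01T05:00:00Z</date></item>
    <item><type>Engrossed in House</type><date>2023-03-01T05:00:00Z</date></item>
    <item><type>Enrolled Bill</type><date/></item>
  </textVersions>
</bill>
"""


def parse_utc(text):
    """Parse an ISO 8601 timestamp such as 2023-01-09T05:00:00Z into an aware UTC datetime."""
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00")).astimezone(timezone.utc)


def candidate_versions(vote_datetime, root):
    """Return the type names of all versions sharing the latest date on or before the vote."""
    dated = []
    for item in root.findall(".//textVersions/item"):
        date_text = item.findtext("date")
        type_text = item.findtext("type")
        if not date_text or not type_text:
            continue  # skip versions with no usable date or type
        when = parse_utc(date_text)
        if when <= vote_datetime:
            dated.append((when, type_text))
    if not dated:
        return []
    latest = max(when for when, _ in dated)
    return [name for when, name in dated if when == latest]


root = ET.fromstring(SAMPLE_XML)
vote_time = datetime(2023, 3, 2, 18, 30, tzinfo=timezone.utc)
print(candidate_versions(vote_time, root))
```

With the sample above, both the reported and the engrossed versions come back, because they share a date. A result with more than one entry is a signal that the date alone can't decide which text the vote was on.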

@JoshData
Member

I think you'd have to try it on a number of bills to see if it works well enough for you. There will definitely be edge cases where it fails. There always are!
