Introduce Inferred Labels for Stacked Bar Charts #145

Merged

Conversation

iantei
Contributor

@iantei iantei commented Aug 16, 2024

A. Introduce processing functionalities in scaffolding.py for inferred labels.

  1. Introduce load_viz_notebook_inferred_data(), filter_inferred_trip() and expand_inferredlabels() for processing, filtering and expanding inferred labels.
  2. map_trip_data() to extract the mapping functionality.

…and incorporate inferred label for Distribution of modes.
…Metrics 2. Update quality_text, fig, ax, text_results and introduce new plot_and_text_stacked_bar_chart() for all Stacked Bar Charts to represent inferred labels bar in generic_metrics notebook
…dd query for mode_of_interest for inferred labels 3. Update fig, ax, text_results, plot_and_text_stacked_bar_chart() for all Stacked Bar Charts.
…_ 2. Update plot_and_text_stacked_bar_chart() for Distribution of modes in commute trips
… stacked_bar_quality_text and stacked_bar_quality_text_inferred with plot_and_text_stacked_bar_chart() 3. Adjust plot_title to plot_title_no_quality
@iantei
Contributor Author

iantei commented Aug 20, 2024

Test Scenario:

Program: cortezebikes

Both generic_metrics and mode_specific_metrics notebook ran successfully, and the results look good.

Notebook execution for generic_metrics and mode_specific_metrics:

(emission) root@c5aa29285331:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/update_mappings.py mapping_dictionaries.ipynb
(emission) root@c5aa29285331:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py generic_metrics.ipynb default
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/cortezebikes.nrel-op.json
Successfully downloaded config with version 1 for Cortez 55+ eBike Program and data collection URL https://cortezebikes-openpath.nrel.gov/api/
label_options is unavailable for the dynamic_config in cortezebikes
Running at 2024-08-19T23:52:08.740014+00:00 with args Namespace(plot_notebook='generic_metrics.ipynb', program='default', date=None) for range (<Arrow [2023-06-01T00:00:00+00:00]>, <Arrow [2024-08-01T00:00:00+00:00]>)
Running at 2024-08-19T23:52:08.783100+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:52:21.884262+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:52:30.599255+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:52:39.148847+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:52:47.216309+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:52:55.290742+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:53:03.383849+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:53:11.496653+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:53:19.128968+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=1), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:53:26.599253+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=2), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:53:34.157006+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=3), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:53:41.196433+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=4), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:53:48.375874+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=5), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:53:55.613434+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:54:02.772060+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-08-19T23:54:09.801942+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
(emission) root@c5aa29285331:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py mode_specific_metrics.ipynb default
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/cortezebikes.nrel-op.json
Successfully downloaded config with version 1 for Cortez 55+ eBike Program and data collection URL https://cortezebikes-openpath.nrel.gov/api/
label_options is unavailable for the dynamic_config in cortezebikes
Running at 2024-08-19T23:55:37.597376+00:00 with args Namespace(plot_notebook='mode_specific_metrics.ipynb', program='default', date=None) for range (<Arrow [2023-06-01T00:00:00+00:00]>, <Arrow [2024-08-01T00:00:00+00:00]>)
Running at 2024-08-19T23:55:37.638438+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:55:45.093898+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:55:50.707597+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:55:56.760308+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:56:02.884374+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:56:09.161714+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:56:14.766870+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:56:19.879560+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:56:24.874631+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=1), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:56:30.514902+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=2), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:56:36.168544+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=3), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:56:41.120830+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=4), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:56:46.193490+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=5), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:56:51.192843+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:56:56.136979+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-08-19T23:57:01.152459+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
(emission) root@c5aa29285331:/usr/src/app/saved-notebooks#

Results:

| Charts Type | Charts |
| --- | --- |
| All Stacked Bar Charts | All Charts |
| Number of Trips with Table | Number of Trips |

@Abby-Wheelis
Member

The charts look great! I'll look at the code next, but I do have one quick question - could a given trip be in the sensed, inferred, and labeled charts? Some of these look like labeled & inferred add to about 100%, but that is probably coincidence

Member

@Abby-Wheelis Abby-Wheelis left a comment

Other than this variable name, which might need to be changed, LGTM!


return expanded_ct, file_suffix, quality_text, debug_df

def map_trip_data(df, study_type, dynamic_labels, dic_re, dic_pur):
Member

Is there a more specific variable name that could be used in place of df?

Contributor Author

I have updated the variable name from df to expanded_trip_df, since the parameter is a placeholder for both expanded labeled trips and inferred trips.

@iantei
Contributor Author

iantei commented Aug 21, 2024

> The charts look great! I'll look at the code next, but I do have one quick question - could a given trip be in the sensed, inferred, and labeled charts? Some of these look like labeled & inferred add to about 100%, but that is probably coincidence

A trip that was detected as a certain mode can be labeled by the user with either the same mode or a different one.
I am not sure why labeled & inferred add up to about 100%.

@iantei iantei marked this pull request as ready for review September 3, 2024 22:02
Member

@Abby-Wheelis Abby-Wheelis left a comment

After the more specific variable name, looks good to me!

@shankari
Contributor

shankari commented Sep 12, 2024

I can take a look at the code, but some high level comments just by looking at the charts first:

  • we should have the inferred mode in between the labeled and sensed. In general, we expect that the number of trips considered will be labeled < inferred < sensed, with sensed at 100%. We display the bars in that order for clarity
  • I really question the numbers generated for the inferred modes. Per the charts above, labeled e-bike trips were around 33%, and sensed bicycling trips were around 5%. But 75% of inferred trips were apparently e-bike trips. That seems very unusual and I would like to see a deeper explanation for it before reviewing further.

I also took a brief look at the code, and I think that the current inferred_labels is only inferred labels. So if a trip had both inferred and user labels, we would use only the inferred labels even if they were wrong. That seems wrong and is certainly going to be confusing to users. The inferred labels bar should actually be "labeled and inferred" and we should only use the inferred labels for trips where they exist, but there is no user label

@iantei
Contributor Author

iantei commented Sep 17, 2024

> we should have the inferred mode in between the labeled and sensed. In general, we expect that the number of trips considered will be labeled < inferred < sensed, with sensed at 100%. We display the bars in that order for clarity

Rearranged the bars to showcase from top to bottom: labeled - inferred - sensed.

> I also took a brief look at the code, and I think that the current inferred_labels is only inferred labels. So if a trip had both inferred and user labels, we would use only the inferred labels even if they were wrong. That seems wrong and is certainly going to be confusing to users. The inferred labels bar should actually be "labeled and inferred" and we should only use the inferred labels for trips where they exist, but there is no user label

I have updated the filter to select trips that have either user_input or inferred_labels. I then iterate through the filtered df: if a trip has user_input, choose it; otherwise, fall back to its inferred_labels. This way, the inferred bar contains user labels plus inferred labels, which makes it "labeled and inferred".

@iantei
Contributor Author

iantei commented Sep 17, 2024

I need some clarification.

- Select only the trips that have inferred_labels: | Approach A
OR
- Select the trips that have either inferred_labels or user_input: | Approach B

   - If the trip has user_input:
        - Use the labels from user_input over the labels from inferred_labels
   - If the trip does not have user_input:
        - Use the labels from inferred_labels

I have proceeded to implement with Approach B.
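
For reference, a minimal plain-Python sketch of Approach B. It is not the code in scaffolding.py. The helper name `pick_labels`, the sample trips, and the assumed shape of `inferred_labels` (a list of candidates, each with a `labels` dict and a probability `p`) are illustrative assumptions, as is the choice to fall back to the most probable candidate.

```python
from typing import Optional


def pick_labels(trip: dict) -> Optional[dict]:
    """Return the labels to plot for one trip under Approach B (hypothetical helper)."""
    user_input = trip.get("user_input") or {}
    if user_input:
        # A user label always wins, even if an inference also exists.
        return user_input
    inferred = trip.get("inferred_labels") or []
    if inferred:
        # Assumption: fall back to the most probable inferred candidate.
        return max(inferred, key=lambda c: c.get("p", 0.0)).get("labels", {})
    # Neither user nor inferred labels: the trip is left out of this bar.
    return None


# Made-up sample trips, only to exercise the three cases.
trips = [
    {"_id": 1, "user_input": {"mode_confirm": "e-bike"},
     "inferred_labels": [{"labels": {"mode_confirm": "drove_alone"}, "p": 0.9}]},
    {"_id": 2, "user_input": {},
     "inferred_labels": [{"labels": {"mode_confirm": "e-bike"}, "p": 0.7}]},
    {"_id": 3, "user_input": {"mode_confirm": "walk"}, "inferred_labels": []},
    {"_id": 4, "user_input": {}, "inferred_labels": []},
]

labeled_and_inferred = {}
for trip in trips:
    labels = pick_labels(trip)
    if labels is not None:
        labeled_and_inferred[trip["_id"]] = labels

print(labeled_and_inferred)
```

Trip 1 keeps the user's e-bike label even though the inference says drove_alone, trip 2 falls back to the inference, and trip 4 is dropped because it has neither.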

I think it would also be a good idea to apply a threshold on the 'p' value when showing inferred_labels. There could be cases where a trip has no user_input and the 'p' value of its inferred_labels is low. In that case, we would be presenting low-confidence inferred_labels data to the audience.

@iantei
Contributor Author

iantei commented Sep 17, 2024

> I really question the numbers generated for the inferred modes. Per the charts above, labeled e-bike trips were around 33%, and sensed bicycling trips were around 5%. But 75% of inferred trips were apparently e-bike trips. That seems very unusual and I would like to see a deeper explanation for it before reviewing further.

This is the updated chart with Approach B.
image

I took the following approach to understand the data better:

I listed the _id of every trip that had mode_confirm set to e-bike, then looked up each of these _id values in sensed_df to see which primary_mode it was mapped to.

# Labeled trips: collect the _id of every trip the user confirmed as e-bike.
# The `...` stands for the remaining return values and the call arguments, elided here.
expanded_ct, ... = scaffolding.load_viz_notebook_data()
label_df = expanded_ct.copy()
label_ebike_id = label_df[label_df['mode_confirm'] == 'e-bike']._id

# Sensed trips for the same program
expanded_ct_sensed, ... = scaffolding.load_viz_notebook_sensor_inference_data()
sensed_df = expanded_ct_sensed.copy()

in_vehicle_counter = 0
bicycling_counter = 0
unknown_counter = 0
other_counter = 0
for item in label_ebike_id:
    # Sensed primary_mode of the trip with this _id
    primary_mode = sensed_df[sensed_df['_id'] == item]['primary_mode'].iloc[0]
    if primary_mode == 'IN_VEHICLE':
        in_vehicle_counter += 1
    elif primary_mode == 'BICYCLING':
        bicycling_counter += 1
    elif primary_mode == 'UNKNOWN':
        unknown_counter += 1
    else:
        other_counter += 1

print("\n In Vehicle counter", in_vehicle_counter)
print("\n Bicycling counter", bicycling_counter)
print("\n Unknown counter", unknown_counter)
print("\n Others", other_counter)

In Vehicle counter 472

 Bicycling counter 403

 Unknown counter 135

 Others 113
 

It seems that labeled e-bike trips are being sensed not just as BICYCLING, but also as IN_VEHICLE and other modes. I will need to investigate further.
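
The same cross-check can also be written without the explicit loop. This is only a sketch on made-up data: the two small DataFrames stand in for label_df and sensed_df, and only the column names `_id`, `mode_confirm` and `primary_mode` come from the snippet above.

```python
import pandas as pd

# Made-up stand-ins for label_df and sensed_df.
label_df = pd.DataFrame({
    "_id": [1, 2, 3, 4, 5],
    "mode_confirm": ["e-bike", "e-bike", "walk", "e-bike", "e-bike"],
})
sensed_df = pd.DataFrame({
    "_id": [1, 2, 3, 4, 5],
    "primary_mode": ["IN_VEHICLE", "BICYCLING", "WALKING", "UNKNOWN", "BICYCLING"],
})

# _id of every trip the user labeled as e-bike
ebike_ids = label_df.loc[label_df["mode_confirm"] == "e-bike", "_id"]

# Sensed rows for those trips, then a count per sensed primary_mode
sensed_for_ebike = sensed_df[sensed_df["_id"].isin(ebike_ids)]
counts = sensed_for_ebike["primary_mode"].value_counts()
print(counts.to_dict())
```

One behavioral difference from the loop: an _id that is missing from sensed_df is silently skipped here, whereas `.iloc[0]` on an empty selection would raise an IndexError.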

@Abby-Wheelis
Member

I think one reason that the e-bike trips could be sensed as IN_VEHICLE, especially if that is uncommon in other programs, might be that the Cortez program was for individuals 55+, and from the photo on the program website it seems that many of the bikes are e-trikes.

@shankari
Contributor

shankari commented Sep 17, 2024

@iantei @Abby-Wheelis correct, I fully anticipate that sensed_mode is off, since I did not focus on e-bikes while making the original algorithms. I was questioning the inferred mode numbers (e.g. 75% of trips, at a proportion that is way off from labeled and sensed).

> I really question the numbers generated for the inferred modes. Per the charts above, labeled e-bike trips were around 33%, and sensed bicycling trips were around 5%. But 75% of inferred trips were apparently e-bike trips. That seems very unusual and I would like to see a deeper explanation for it before reviewing further.

@iantei
Contributor Author

iantei commented Sep 18, 2024

The chart below shows the result after filtering, for each trip, the labels from inferred_labels to keep only those whose 'p' value is greater than the confidence_threshold.

image

Referenced approach from inferFinalLabels(trip: CompositeTrip, userInputForTrip?: UserInputMap) in multilabel/confirmHelper.ts
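
A simplified sketch of the thresholding step, for illustration only. It does not reproduce inferFinalLabels() from confirmHelper.ts. The helper name `confident_labels`, the sample candidates, and the threshold values are hypothetical, and the `labels`/`p` candidate shape is assumed.

```python
from typing import Optional


def confident_labels(inferred_labels: list, confidence_threshold: float) -> Optional[dict]:
    """Keep the most probable candidate, but only if its p exceeds the threshold."""
    candidates = [c for c in inferred_labels if c.get("p", 0.0) > confidence_threshold]
    if not candidates:
        # No inference is confident enough: do not show an inferred label.
        return None
    return max(candidates, key=lambda c: c["p"])["labels"]


# Made-up candidates for a single trip.
inferred = [
    {"labels": {"mode_confirm": "e-bike"}, "p": 0.6},
    {"labels": {"mode_confirm": "walk"}, "p": 0.3},
]

print(confident_labels(inferred, confidence_threshold=0.5))
print(confident_labels(inferred, confidence_threshold=0.7))
```

With a threshold of 0.5 the e-bike candidate survives; with 0.7 nothing does, so the trip would contribute no inferred label to the bar.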

@iantei
Contributor Author

iantei commented Sep 18, 2024

Testing Scenario:

Dataset - Cortezebikes
Executed the generate_plots.py scripts:

Details of the script execution



(emission) root@10452a8ca805:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/update_mappings.py mapping_dictionaries.ipynb
(emission) root@10452a8ca805:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py generic_metrics.ipynb default
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/cortezebikes.nrel-op.json
Successfully downloaded config with version 1 for Cortez 55+ eBike Program and data collection URL https://cortezebikes-openpath.nrel.gov/api/
label_options is unavailable for the dynamic_config in cortezebikes
Running at 2024-09-18T17:45:51.672767+00:00 with args Namespace(plot_notebook='generic_metrics.ipynb', program='default', date=None) for range (<Arrow [2023-06-01T00:00:00+00:00]>, <Arrow [2024-09-01T00:00:00+00:00]>)
Running at 2024-09-18T17:45:51.714859+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:46:08.116991+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:46:18.791023+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:46:30.587025+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:46:41.741453+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:46:52.601164+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:47:03.967915+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:47:15.438245+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:47:25.590965+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=1), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:47:35.290194+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=2), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:47:44.704726+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=3), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:47:52.946141+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=4), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:48:01.306479+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=5), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:48:09.504071+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:48:18.127332+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:48:26.842375+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-18T17:48:35.259400+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]


(emission) root@10452a8ca805:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py mode_specific_metrics.ipynb default
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/cortezebikes.nrel-op.json
Successfully downloaded config with version 1 for Cortez 55+ eBike Program and data collection URL https://cortezebikes-openpath.nrel.gov/api/
label_options is unavailable for the dynamic_config in cortezebikes
Running at 2024-09-18T17:49:30.059998+00:00 with args Namespace(plot_notebook='mode_specific_metrics.ipynb', program='default', date=None) for range (<Arrow [2023-06-01T00:00:00+00:00]>, <Arrow [2024-09-01T00:00:00+00:00]>)
Running at 2024-09-18T17:49:30.102145+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:49:41.214641+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:49:48.389321+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:49:56.207088+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:50:03.443844+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:50:10.960276+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:50:18.104567+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:50:24.333696+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:50:30.562742+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=1), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:50:37.424396+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=2), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:50:44.050781+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=3), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:50:50.139393+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=4), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:50:56.290490+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=5), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:51:02.576195+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:51:08.616871+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:51:14.938528+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-18T17:51:21.018257+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
(emission) root@10452a8ca805:/usr/src/app/saved-notebooks#

Result:

image

@iantei
Contributor Author

iantei commented Sep 18, 2024

After merging with main:

image

The charts look identical, with one difference: in Total trip length covered by mode, "Other" [3032] is now split into "Other" [1306] + "Airplane" [1726] for Labeled,
and "Other" [4925] is split into "Other" [3199] + "Airplane" [1726] for Labeled and Inferred.

Similarly, there are no Airplane trips in the land version of Total trip length covered by mode.

image

@Abby-Wheelis
Member

Abby-Wheelis commented Sep 20, 2024

On some of the bars next to sensed mode it says "(100% of all trips)" and on others it says "(100%)". Can you standardize that, please? Other than that slight issue, the charts look great. I'll move on to reviewing the code itself.

Member

@Abby-Wheelis Abby-Wheelis left a comment

Just a few notes/questions here

if (max_entry['p'] > row.confidence_threshold):
    max_labels_list.append(max_entry['labels'])
else:
    max_labels_list.append({})
Member

So if there are no user labels and the confidence is too low there won't be any labels? Do these entries get filtered out later?

Contributor Author

Whenever there are no user labels and the 'p' value for the inferred_labels does not exceed the confidence_threshold, we do not consider the inferred labels for mode_confirm, purpose_confirm and replaced_mode, so an empty dictionary is appended.
Later, when the dataframe is created, these entries become NaN for mode_confirm, purpose_confirm and replaced_mode, and that dataframe is subsequently added to the original dataframe.
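
A minimal sketch of the selection described here, to show how an empty dictionary ends up as NaN. The trip data, the 0.55 threshold, and the helper name `select_labels` are made up for illustration; only the `p` / `labels` / `confidence_threshold` / `user_input` names come from the snippet under review.

```python
import pandas as pd

# Hypothetical trips: each inferred entry is {"labels": {...}, "p": probability}.
trips = pd.DataFrame({
    "user_input": [{"mode_confirm": "walk"}, {}, {}],
    "inferred_labels": [
        [],
        [{"labels": {"mode_confirm": "e-bike"}, "p": 0.9},
         {"labels": {"mode_confirm": "bike"}, "p": 0.1}],
        [{"labels": {"mode_confirm": "drove_alone"}, "p": 0.4}],
    ],
    "confidence_threshold": [0.55, 0.55, 0.55],  # assumed value
})

def select_labels(row):
    """Prefer user labels; otherwise take the most probable inferred labels
    if they clear the threshold; otherwise return an empty dict."""
    if row.user_input:
        return row.user_input
    if not row.inferred_labels:
        return {}
    max_entry = max(row.inferred_labels, key=lambda entry: entry["p"])
    if max_entry["p"] > row.confidence_threshold:
        return max_entry["labels"]
    return {}

labels_list = [select_labels(row) for row in trips.itertuples(index=False)]
# Empty dicts have no keys, so their cells are filled with NaN.
labels_df = pd.DataFrame(labels_list, index=trips.index)
print(labels_df["mode_confirm"].value_counts(dropna=False))
```

The third trip has no user label and its best inference is below the threshold, so its mode_confirm is NaN after the DataFrame is built.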

When I checked

(expanded_ct_inferred['mode_confirm'].value_counts(dropna=False), expanded_ct_inferred['Mode_confirm'].value_counts(dropna=False))

I observed the below:

mode_confirm:


e-bike               1868
 drove_alone          1101
 shared_ride           768
 NaN                   655
 walk                  467
 atv                    42
 e_car_shared_ride      15
 e_car_drove_alone      13
 not_a_trip             12
 bike                    8
 side_by side            6
 free_shuttle            3
 lawn_mower              2
                         2
 moco_transit            2
 air                     2
 atv_ride                1
 Name: mode_confirm, dtype: int64

Mode_confirm:

E-bike                  1868
 Gas Car, drove alone    1101
 Gas Car, with others     768
 Other                    710
 Walk                     467
 E-car, with others        15
 E-car, drove alone        13
 Not a Trip                12
 Regular Bike               8
 Free Shuttle               3
 Airplane                   2
 Name: Mode_confirm, dtype: int64)

The values processed as NaN were eventually converted to Other, which is wrong.

Do these entries get filtered out later?

I need to filter them out.

Contributor Author

I considered filtering on NaN values, but there could be existing NaN values that do not come from this computation. I decided instead to set the value uncertain for the keys mode_confirm, purpose_confirm and replaced_mode in the dictionary, and append that dictionary to the list.
After combining with the original dataframe, the rows that have mode_confirm, purpose_confirm and replaced_mode set to uncertain are filtered out.
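
A rough sketch of this sentinel approach, under assumptions: the literal string "uncertain", the sample trips, and the use of pd.concat to combine the frames are illustrative and may not match the actual code in scaffolding.py.

```python
import pandas as pd

LABEL_KEYS = ["mode_confirm", "purpose_confirm", "replaced_mode"]
# Assumed sentinel: every label key is set to the string "uncertain".
UNCERTAIN = {key: "uncertain" for key in LABEL_KEYS}

trips = pd.DataFrame({"distance": [1.0, 2.0, 3.0]})  # hypothetical trips
labels_list = [
    {"mode_confirm": "e-bike", "purpose_confirm": "work", "replaced_mode": "drove_alone"},
    dict(UNCERTAIN),           # no user label and low-confidence inference
    {"mode_confirm": "walk"},  # partial label: missing keys become NaN
]
labels_df = pd.DataFrame(labels_list, index=trips.index)
expanded = pd.concat([trips, labels_df], axis=1)

# Drop only the rows marked with the sentinel; pre-existing NaN values are kept.
is_uncertain = (expanded[LABEL_KEYS] == "uncertain").all(axis=1)
filtered = expanded[~is_uncertain]
print(filtered)
```

Because NaN never compares equal to the sentinel, the partially labeled trip survives the filter while the low-confidence trip is removed.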

Contributor

I am fine with this for now, but I would expect to revisit and discuss as part of/after the footprint calculations. If the user labels are not present, they are mapped to NaN and we drop them (dropna). Why do we want to handle this differently? (i.e. why don't we just dropna again?) Given that we already indicate that we only represent a subset of the trips, why do we also need to label some of those as uncertain?

else:
    max_labels_list.append(row.user_input)

inferred_only_labels = pd.DataFrame(max_labels_list, index=inferred_ct.index)
Member

If this is really inferred / user labels now, I think it might be best to rename it

Contributor Author

Renamed it to labeled_inferred_labels

…append the labels_list with dict - uncertain for all labels. Later filter it out from the dataframe.
…emove reset_index on expanded_inferred_ct dataframe.
…d update the variable names to add prefix of labeled. We display both labeled and inferred labels altogether for inferred bars in stacked bar charts.
…await from notebook. Update the map_trip_data() to be async, and call to it as await in scaffolding.py
@iantei
Contributor Author

iantei commented Sep 21, 2024

Testing scenario:

Dataset used - cortezebikes

Executed notebooks -

  • generic_metrics
  • mode_specific_metrics
Details - Execution of generic/mode_specific_metrics notebook

(emission) root@59aa16cf1f50:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/update_mappings.py mapping_dictionaries.ipynb
(emission) root@59aa16cf1f50:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py generic_metrics.ipynb default
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/cortezebikes.nrel-op.json
Successfully downloaded config with version 1 for Cortez 55+ eBike Program and data collection URL https://cortezebikes-openpath.nrel.gov/api/
label_options is unavailable for the dynamic_config in cortezebikes
Running at 2024-09-21T15:52:36.595701+00:00 with args Namespace(plot_notebook='generic_metrics.ipynb', program='default', date=None) for range (<Arrow [2023-06-01T00:00:00+00:00]>, <Arrow [2024-09-01T00:00:00+00:00]>)
Running at 2024-09-21T15:52:36.632798+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:53:40.155024+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:54:09.492127+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:54:26.296376+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:54:41.331673+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:54:57.628128+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:55:13.618390+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:55:30.041076+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:55:43.296585+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=1), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:55:55.867052+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=2), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:56:07.909210+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=3), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:56:17.319844+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=4), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:56:27.187683+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=5), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:56:36.548876+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:56:49.497647+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]



Running at 2024-09-21T15:57:44.017013+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]
Running at 2024-09-21T15:58:00.285728+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True), Parameter('sensed_algo_prefix', str, value='cleaned')]




(emission) root@59aa16cf1f50:/usr/src/app/saved-notebooks# PYTHONPATH=.. python bin/generate_plots.py mode_specific_metrics.ipynb default
/usr/src/app/saved-notebooks/bin/generate_plots.py:30: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if r.status_code is not 200:
About to download config from https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/cortezebikes.nrel-op.json
Successfully downloaded config with version 1 for Cortez 55+ eBike Program and data collection URL https://cortezebikes-openpath.nrel.gov/api/
label_options is unavailable for the dynamic_config in cortezebikes
Running at 2024-09-21T15:58:50.494373+00:00 with args Namespace(plot_notebook='mode_specific_metrics.ipynb', program='default', date=None) for range (<Arrow [2023-06-01T00:00:00+00:00]>, <Arrow [2024-09-01T00:00:00+00:00]>)
Running at 2024-09-21T15:58:50.593594+00:00 with params [Parameter('year', int), Parameter('month', int), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T15:59:14.890521+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T15:59:26.086538+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T15:59:36.275608+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T15:59:45.627643+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T15:59:54.641914+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=10), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T16:00:03.630425+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=11), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T16:00:11.953019+00:00 with params [Parameter('year', int, value=2023), Parameter('month', int, value=12), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T16:00:19.812692+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=1), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T16:00:28.188906+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=2), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T16:00:36.502092+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=3), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T16:00:44.022756+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=4), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T16:00:52.916952+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=5), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T16:01:00.935062+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=6), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T16:01:08.922318+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=7), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T16:01:16.835463+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=8), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]
Running at 2024-09-21T16:01:24.251592+00:00 with params [Parameter('year', int, value=2024), Parameter('month', int, value=9), Parameter('program', str, value='default'), Parameter('study_type', str, value='program'), Parameter('mode_of_interest', str, value='e-bike'), Parameter('include_test_users', bool, value=False), Parameter('dynamic_labels', dict, value={}), Parameter('use_imperial', bool, value=True)]

Result:
image

image

Contributor

@shankari shankari left a comment

Merging this now, barring the future cleanup items I have listed here.

Comment on lines +203 to +207
"\n",
"inferred_match = re.match(r'Based on ([0-9]+) confirmed trips from ([0-9]+) (users|testers and participants)\\nof ([0-9]+) total trips from ([0-9]+) (users|testers and participants) (\\(([0-9.]+|nan)%\\))', quality_text_inferred)\n",
"stacked_bar_quality_text_inferred = f\"{inferred_match.group(1)} trips {inferred_match.group(7)}\\n from {inferred_match.group(2)} {inferred_match.group(3)}\"\n",
"\n",
"stacked_bar_quality_text_labeled, stacked_bar_quality_text_sensed, stacked_bar_quality_text_inferred"
Contributor

I hope we fix/clean this up soon. Even if we wanted to generate parametrized text, it is a lot clearer to have a structure with the parameters instead of parsing out from existing text using regular expressions. And now that we have converted over to bar charts, we don't need the backwards compat of the old Based on... text.
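
One possible shape for such a structure, as a sketch only: the class name `QualityInfo`, its fields, and the sample numbers are hypothetical, and the output format simply mirrors the f-string in the snippet above.

```python
from dataclasses import dataclass

@dataclass
class QualityInfo:
    """Hypothetical container for the numbers that the regex currently
    parses back out of the 'Based on ...' text."""
    confirmed_trips: int
    total_trips: int
    user_count: int
    user_label: str = "users"  # or "testers and participants"

    @property
    def percent(self) -> float:
        # Fall back to nan when there are no trips at all.
        if self.total_trips == 0:
            return float("nan")
        return 100 * self.confirmed_trips / self.total_trips

    def stacked_bar_text(self) -> str:
        return (f"{self.confirmed_trips} trips ({self.percent:.1f}%)\n"
                f" from {self.user_count} {self.user_label}")

info = QualityInfo(confirmed_trips=50, total_trips=200, user_count=7)
text = info.stacked_bar_text()
print(text)
```

With the parameters held in one place, each chart can format its own text directly and no regular expression is needed.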

if len(labeled_inferred_ct) == 0:
    return labeled_inferred_ct

def _select_max_label(row):
Contributor

I am merging this for now, but note that there is a similar function in the server when we handle metrics to include trips with confidence > 90% as "confirmed". We do not and should not need to reinvent code.

https://github.com/e-mission/e-mission-server/blob/52adee205f686d87e167bd4b1d166098938870c6/emission/analysis/result/metrics/time_grouping.py#L134
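
For orientation, a minimal sketch of the kind of logic under discussion: pick the highest-probability inferred label and accept it only above a confidence threshold. This is neither the server's implementation nor the PR's `_select_max_label`. The entry shape (a `labels` dict plus a probability `p`), the function name `select_confident_label`, and the strict `>` comparison against 0.9 are all assumptions for illustration.

```python
from typing import Optional

CONFIDENCE_THRESHOLD = 0.9  # assumed from the "confidence > 90%" remark


def select_confident_label(inferred_labels: list,
                           threshold: float = CONFIDENCE_THRESHOLD) -> Optional[dict]:
    """Return the labels of the most probable entry, or None if not confident enough."""
    if not inferred_labels:
        return None
    # Pick the entry with the highest probability.
    best = max(inferred_labels, key=lambda entry: entry["p"])
    # Accept it only when it clears the threshold.
    return best["labels"] if best["p"] > threshold else None


# Hypothetical inferred-label entries for a single trip.
example = [
    {"labels": {"mode_confirm": "bike", "purpose_confirm": "work"}, "p": 0.95},
    {"labels": {"mode_confirm": "walk", "purpose_confirm": "work"}, "p": 0.05},
]
print(select_confident_label(example))
```

Keeping a single shared helper of this shape, wherever it lives, would avoid the dashboard and the server drifting apart on what counts as "confirmed".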

Comment on lines +172 to +175
"Trips_with_at_least_one_label": len(labeled_ct),
"Trips_with_mode_confirm_label": trip_label_count("Mode_confirm", expanded_ct),
"Trips_with_trip_purpose_label": trip_label_count("Trip_purpose", expanded_ct)
},
Contributor

Again, I'm merging this for now, but this should be switched to mode_confirm or mode_confirm_with_other, right? We should not be working with the display values in any internal code.

Contributor Author

Yes, this would need to switch to the mode_confirm_w_other variable.


async def map_trip_data(expanded_trip_df, study_type, dynamic_labels, dic_re, dic_pur):
Contributor

I understand the reason for refactoring and pulling this out, but wonder why you chose to rename the parameter. It has led to a lot of changed lines that are essentially just the variable rename. Not asking for a change, just an explanation.

Contributor Author

We call map_trip_data(expanded_trip_df, ...) from two functions:
load_viz_notebook_inferred_data(expanded_it, ...) and load_viz_notebook_data(expanded_ct, ...).
Therefore, I wanted a parameter name that suits both call sites.

Labels
None yet
Projects
Status: Tasks completed

3 participants