Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#8193 strip commas from email response #8343

Merged
merged 9 commits into from
Jan 25, 2022
Merged

Conversation

sekmiller
Copy link
Contributor

@sekmiller sekmiller commented Jan 12, 2022

What this PR does / why we need it: fixes broken csv output from guestbook responses download

Which issue(s) this PR closes:

Closes #8193 Commas in email addresses breaks guestbook csv download

Special notes for your reviewer: Instead of removing commas from the email response, which had been the approach for all other responses, now all responses are enclosed in double quotes allowing the commas to remain without breaking the csv display

Suggestions on how to test this:

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to approve this because I think it'll allow guestbooks to be downloaded.

However, in the example by @jggautier in the issue, he was entering juliangautier@g.harvard.edu,in

As of this code I expect it to become juliangautier@g.harvard.eduin

I guess this is better? In either case, the ,in doesn't make sense to me. I don't know why it's there.

In a future pull request, perhaps we should add some input validation so that juliangautier@g.harvard.edu,in isn't accepted as a valid email address.

@pdurbin pdurbin removed their assignment Jan 12, 2022
@jggautier
Copy link
Contributor

jggautier commented Jan 12, 2022

Thanks for checking @pdurbin.

I used juliangautier@g.harvard.edu,in as an example because that's similar to what someone actually put in the email field in the guestbook popup. But people have also entered multiple email addresses in one field, separated by commas, and in those cases this PR removes those commas too and the email addresses will be one long string.

The discussion about this in a sprint planning meeting a while back made me think that we thought it was okay that people entered multiple email addresses in that email field, which would mean we should expect commas in that field. So when Danny wrote in #8193 that commas would be escaped, I thought that meant that the commas would appear in the email columns in the guestbook CSV files.

Can we agree that we're okay with people entering multiple emails in that email field and that it should expect commas? Otherwise, I agree that it would be better for the field to accept only one valid email or that the text box should accept multiple entries without them needing to be comma-separated.

@pdurbin
Copy link
Member

pdurbin commented Jan 12, 2022

@jggautier ok, thanks. I kicked it back to @sekmiller then. It sounds like we want to escape rather than strip.

For a future issue or pull request, if we except multiple addresses separated by commas, the UI should probably indicate this.

@sekmiller sekmiller removed their assignment Jan 13, 2022
@sekmiller
Copy link
Contributor Author

I re-did the coding so that all responses are double quotes which allows for commas to be displayed without breaking the csv output.

@coveralls
Copy link

coveralls commented Jan 14, 2022

Coverage Status

Coverage remained the same at 18.909% when pulling 393fae6 on 8193-guestbook-csv-bug into afe7543 on develop.

Copy link
Contributor

@landreev landreev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this appears to work with most software packages; all the CSV parsers I tried do understand that commas inside double-quoted strings are not field separators. I even tried our own CSV ingest; not becuase anyone would want/need to ingest guestbooks as tabular data, but because we use a standard Apache CSV parser library that's used in many other opensource packages.
(I thought when we talked about this issue during a meeting that we wanted encode these commas; as a hex notation maybe - '%2C'? - but this is even better probably... So, approving)

@landreev
Copy link
Contributor

landreev commented Jan 20, 2022

(P.S. Please note that there appears to be a merge conflict)

# Conflicts:
#
src/main/java/edu/harvard/iq/dataverse/GuestbookResponseServiceBean.java
@sekmiller sekmiller assigned sekmiller and unassigned landreev Jan 21, 2022
@sekmiller
Copy link
Contributor Author

@landreev that merge conflict was caused by the introduction of a method for escaping csv strings for the text area guestbook response PR. I applied the method to other response strings because it makes the code cleaner. i'll put it back in review if you want to take a look.

@sekmiller sekmiller removed their assignment Jan 21, 2022
@landreev
Copy link
Contributor

@landreev that merge conflict was caused by the introduction of a method for escaping csv strings for the text area guestbook response PR. I applied the method to other response strings because it makes the code cleaner. i'll put it back in review if you want to take a look.

Sorry about my confusion during the standup. I missed the comment above.
So this escapeCsv method does the same thing, adds double quotes... but only when necessary. OK, yes, this does seem cleaner.

@landreev
Copy link
Contributor

Oh yeah, we're still having an outbreak of the github automation plague (the approval didn't move the PR into QA automatically). Moving by hand...

@kcondon kcondon assigned kcondon and unassigned landreev Jan 25, 2022
@kcondon kcondon merged commit 0ea1f04 into develop Jan 25, 2022
@kcondon kcondon deleted the 8193-guestbook-csv-bug branch January 25, 2022 18:34
@djbrooke djbrooke added this to the 5.10 milestone Jan 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Commas in email addresses break guestbook CSV files
7 participants