Commit

Lowercase names before removing accents (#31)
Names that begin with accented characters were being removed from the lists.
id3s3c committed May 27, 2021
1 parent 4b52b0c commit e2829f4
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions linkedin2username.py
@@ -596,6 +596,9 @@ def clean(raw_list):
     allowed_chars = re.compile('[^a-zA-Z -]')
     for name in raw_list:

+        # Lower-case everything to make it easier to de-duplicate.
+        name = name.lower()
+
         # Try to transform non-English characters below.
         name = remove_accents(name)

@@ -604,9 +607,6 @@ def clean(raw_list):
         # People like to feel special, I guess.
         name = allowed_chars.sub('', name)

-        # Lower-case everything to make it easier to de-duplicate.
-        name = name.lower()
-
         # The line below tries to consolidate white space between words
         # and get rid of leading/trailing spaces.
         name = re.sub(r'\s+', ' ', name).strip()
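
The effect of moving the `lower()` call can be reproduced with a minimal sketch. It assumes that `remove_accents` only recognizes lowercase accented characters; the real helper is not part of this diff, so `remove_accents_sketch`, its small mapping, and the two `clean_*` wrappers below are hypothetical stand-ins rather than the project's code.

```python
import re

# Hypothetical stand-in for remove_accents: it only maps LOWERCASE accented
# characters. This is an assumption, since the real helper is not in the diff.
LOWERCASE_ACCENTS = {"é": "e", "è": "e", "á": "a", "ö": "o", "ü": "u", "ñ": "n"}


def remove_accents_sketch(name):
    """Replace known lowercase accented characters, leave the rest untouched."""
    return "".join(LOWERCASE_ACCENTS.get(ch, ch) for ch in name)


# Same pattern as in the diff: anything outside a-z, A-Z, space, hyphen is dropped.
ALLOWED_CHARS = re.compile("[^a-zA-Z -]")


def clean_old_order(name):
    """Order before this commit: accents, strip, then lower-case."""
    name = remove_accents_sketch(name)
    name = ALLOWED_CHARS.sub("", name)
    name = name.lower()
    return re.sub(r"\s+", " ", name).strip()


def clean_new_order(name):
    """Order after this commit: lower-case first, then accents, then strip."""
    name = name.lower()
    name = remove_accents_sketch(name)
    name = ALLOWED_CHARS.sub("", name)
    return re.sub(r"\s+", " ", name).strip()


print(clean_old_order("Élodie Müller"))  # lodie muller
print(clean_new_order("Élodie Müller"))  # elodie muller
```

Under that assumption, the uppercase `É` passes through the accent step unchanged and is then deleted by `allowed_chars`, which truncates the name. Lower-casing first turns it into `é`, which the accent step can map to `e` before anything is stripped.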