Update preprocess.py to support Python 3#30

Closed
mattbierner wants to merge 1 commit into jcjohnson:master from mattbierner:preprocess-python3

@mattbierner

Convert print statements to print function calls.
Use `items` instead of `iteritems`.

Tested on Python 2.7 and Python 3.5
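
The two changes above can be sketched as follows. This is a minimal illustration, not the actual contents of preprocess.py: the `token_to_idx` dictionary and its values are hypothetical. It relies on two facts: `dict.items()` exists on both Python 2 and Python 3 (whereas `iteritems()` is Python 2 only), and `print(...)` with a single parenthesized argument behaves the same on both.

```python
# Hypothetical vocabulary, for illustration only (not taken from preprocess.py).
token_to_idx = {"a": 1, "b": 2}

# items() works on Python 2 and 3; iteritems() raises AttributeError on Python 3.
idx_to_token = {idx: token for token, idx in token_to_idx.items()}

# A single parenthesized argument prints identically on Python 2 and 3.
print("Total vocabulary size: %d" % len(token_to_idx))
```

On Python 2, `items()` builds a full list rather than a lazy iterator, which is usually acceptable for a character-level vocabulary of this size.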


Running preprocess.py under Python 3.3+ fixes #29

For the following text file:

Test 😀!

Here's the (incorrect) Python 2.7 JSON output:

{"idx_to_token": {"1": "T", "2": "e", "3": "s", "4": "t", "5": " ", "6": "\ud83d", "7": "\ude00", "8": "!", "9": "\n"}, "token_to_idx": {"!": 8, " ": 5, "e": 2, "\ude00": 7, "\n": 9, "s": 3, "T": 1, "\ud83d": 6, "t": 4}}

And the correct Python 3.3 JSON:

{"idx_to_token": {"1": "T", "2": "e", "3": "s", "4": "t", "5": " ", "6": "\ud83d\ude00", "7": "!", "8": "\n"}, "token_to_idx": {"t": 4, "s": 3, " ": 5, "\ud83d\ude00": 6, "!": 7, "e": 2, "\n": 8, "T": 1}}
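
To see where the difference comes from, here is a minimal sketch of a character-level vocabulary build run under Python 3.3+. The loop is an assumption about how tokens get assigned (first-seen order, indices starting at 1), chosen because it reproduces the index layout shown above; it is not copied from preprocess.py. Python 3.3+ iterates over a `str` by code point, so the emoji stays a single token, whereas a narrow Python 2.7 build would yield its two surrogate halves separately.

```python
import json

# Same content as the test file above; \U0001F600 is the grinning-face emoji.
text = "Test \U0001F600!\n"

# Assign each new character the next free index, starting at 1 (assumed scheme).
token_to_idx = {}
for char in text:
    if char not in token_to_idx:
        token_to_idx[char] = len(token_to_idx) + 1

idx_to_token = {idx: token for token, idx in token_to_idx.items()}

# json.dumps escapes non-ASCII by default, so the emoji is written as the
# escape pair \ud83d\ude00 but still forms one key.
print(json.dumps({"idx_to_token": idx_to_token, "token_to_idx": token_to_idx}))
```

The emoji appears as `"\ud83d\ude00"` in both outputs because of JSON's ASCII escaping; what changes is whether that pair is one token or two.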

@gokceneraslan

Let's please consider #12 to get rid of the Python dependency, rather than enhancing it.

@ChrisCummins
Collaborator

Python 3 support was merged in 37a31bb, using the six package for iterators and removing a broken dependency.

Successfully merging this pull request may close these issues.

preprocess.py does't handle unicode character sequences correctly