Limited Vocabulary Writing #108

Open
DHDPIC opened this issue Nov 21, 2014 · 13 comments

DHDPIC commented Nov 21, 2014

Please, sir, I want some more... words.

My first attempt at NaNoGenMo! I was working on this idea a short while ago and thought it might fit well with NaNoGenMo so giving it a go.

I was struck by some research presented to me on how many words children of certain ages can understand and say. Their understanding exceeds their ability to say; they can only say a few of the words they know. I am not exact on the data (and each child is different) but if I remember correctly a child of 18 months can understand between 50 - 200 words. As a child gets older this number rapidly grows.

So I thought it would be interesting to see what stories would look like when only a few words can be written or understood and the rest are left unintelligible. And this seems like something a computer program could work out pretty easily: find the most frequent words in a story, render only those, and blank out the rest.
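A minimal sketch of that idea, for illustration only. It is written in Python rather than Processing and is not the code from the linked repository. The function name `limit_vocabulary`, the word pattern, and the choice to mask each letter with an asterisk are assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters, optionally with inner apostrophes (e.g. "don't").
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def limit_vocabulary(text, vocab_size, mask="*"):
    """Keep the vocab_size most frequent words and mask every other word."""
    # Count words case-insensitively so "The" and "the" share one entry.
    counts = Counter(word.lower() for word in WORD_RE.findall(text))
    allowed = {word for word, _ in counts.most_common(vocab_size)}

    def redact(match):
        word = match.group(0)
        if word.lower() in allowed:
            return word
        # Replace each character so the masked word keeps its original length.
        return mask * len(word)

    # Punctuation and spacing are untouched because only word matches are replaced.
    return WORD_RE.sub(redact, text)


sample = "the cat and the dog and the cat saw the bird"
print(limit_vocabulary(sample, 3))
```

With a vocabulary of 3, only "the", "cat" and "and" survive in the sample sentence. Ties between equally frequent words are broken by whatever order `Counter.most_common` returns, so the exact cut-off word can vary for other inputs.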

I have used Charles Dickens's Oliver Twist as my source text, as I thought the line 'Please, sir, I want some more' was an appropriate hook into the concept, as well as the book being about a child. I found the text on the Project Gutenberg website, and simply stripped out the unnecessary opening and closing legal text.

I used Processing to ingest and process the text and output the new version as a raw text file. I have outputted different versions for vocabularies of 50, 200, and 1000 words. I will try to use InDesign to make a version that is easier on the eye!

I'm new to this so will work out how to upload the outputted text, and the source code soon.

Here is the source code, outputted raw text, and PDFs:
https://github.com/DHDPIC/Limited-Vocab-Writing

Any questions or advice please let me know!

Thanks,

David

ikarth commented Nov 21, 2014

Easiest way to upload a simple text file is as a Gist.

@MichaelPaulukonis

Processing, oh my!

DHDPIC commented Nov 21, 2014

Thanks ikarth.

MichaelPaulukonis, I hope that is a good 'oh my!'

DHDPIC commented Nov 21, 2014

This is what the raw text looks like:

'Oh, you must not **** about ***** ***.'

And I have an excerpt in 50 word vocab and 200 word vocab, linked below:
https://gist.github.com/DHDPIC/8b8312d36c1ea0818657
https://gist.github.com/DHDPIC/d5a0401910231c9ae9cf

@MichaelPaulukonis

When you first mentioned "limited vocabulary" I thought of Dr. Seuss and The Cat in the Hat.

Would it be possible to replace the redacted words with synonyms that are already allowed? Thus "saving" the text, but reducing the vocab?

Alternatively, could you use the unicode character black vertical rectangle? That would be pretty f█████g cool! If not, you can ██ ███ ████████!
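A small sketch of that swap, assuming the masked words are already marked with asterisks as in the raw output above. The glyph shown in this comment appears to be U+2588 FULL BLOCK, so that is what the example uses. The function name `asterisks_to_blocks` is made up for illustration.

```python
FULL_BLOCK = "\u2588"  # the solid block glyph used for the redaction bars


def asterisks_to_blocks(text):
    """Swap every asterisk for a full block, keeping the masked word lengths."""
    # Assumption: the source text has no legitimate asterisks of its own;
    # any that exist would be converted as well.
    return text.replace("*", FULL_BLOCK)


print(asterisks_to_blocks("'Oh, you must not **** about ***** ***.'"))
```

Because the replacement is one character for one character, each bar stays as long as the word it hides.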

DHDPIC commented Nov 21, 2014

Interestingly that is pretty much what I have done in InDesign using GREP styling to treat the * characters differently to the rest of the text. Quick grab below:
[Image: screenshot, 2014-11-21 at 17:23:19]

DHDPIC commented Nov 21, 2014

I also wanted to do something with learning and repetition: how many times does a word have to be encountered before it is learned and committed to memory? Unfortunately I couldn't find any good data on this, so I abandoned it. If anyone does have any data on learning time/incidence, then I'd love to know more!
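For anyone who wants to experiment with the repetition idea, here is a rough sketch in Python. The threshold `exposures_needed` is an arbitrary placeholder, not a figure taken from any research, and the function name `learn_by_repetition` is invented for the example.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters, optionally with inner apostrophes.
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def learn_by_repetition(text, exposures_needed=3, mask="*"):
    """Mask a word for its first exposures_needed encounters, then show it."""
    seen = Counter()

    def reveal(match):
        word = match.group(0)
        key = word.lower()
        seen[key] += 1
        # The word counts as "learned" once it has been met more than the threshold.
        if seen[key] > exposures_needed:
            return word
        return mask * len(word)

    # re.sub visits matches from left to right, so the counts follow reading order.
    return WORD_RE.sub(reveal, text)


print(learn_by_repetition("go go go go stop", exposures_needed=2))
```

Unlike a fixed vocabulary, this makes the text gradually more readable as it goes on, since common words are revealed early and rare ones may never appear.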

DHDPIC commented Nov 26, 2014

OK, just posted my code and the outputted text files to GitHub! Check it out:
https://github.com/DHDPIC/Limited-Vocab-Writing

DHDPIC commented Nov 26, 2014

And I have added some PDF versions of the text. Much nicer to look at!

DHDPIC commented Nov 26, 2014

Not sure how/if I add a completed or preview label...

hugovk commented Nov 26, 2014

PDFs look great!

Labels are added by repo owner @dariusk.

DHDPIC commented Nov 26, 2014

Thanks!

hugovk commented Nov 27, 2014

@DHDPIC Labelled!
