I plan to sound my programmatically-generated yawp above the repos of the world. #59

Open
MichaelPaulukonis opened this issue Nov 6, 2013 · 27 comments

Comments

@MichaelPaulukonis

Forked NaNoGenMo, but no code yet.

@MichaelPaulukonis
Author

Eh, I killed the fork, since no code-pulls would be merged.

But papa's got a brand-new repo with blatantly stolen code and some start of resource-notes.

@MichaelPaulukonis
Author

I'm commenting perhaps a bit excessively in some issues here, but I'm putting more notes into sub-folder README.md files in the repo.

I've worked with text generators, but mostly markovian and for small works where coherence doesn't matter so much.

Working on a work of such a length, and thinking about the implications (does it have to cohere? What is a novel? etc.) are good for me, and even if I don't generate anything I'm playing with a lot of code and thoughts.

@MichaelPaulukonis
Author

Yet I [fear|hope] we are contributing towards this:

[Image: textularity]

@dariusk
Owner

dariusk commented Nov 8, 2013

Those are great notes, Michael, thanks!

@MichaelPaulukonis
Author

I thought somebody, in some issue, talked about the idea of making a physical machine version of their generator; I can't find it again.

But I present to you the Eureka, a machine for generating Latin verses. cf Wikipedia entry.

The Wikipedia entries on generative art, subsection "literature" and Electronic literature could do with some additions and editing. Perhaps there is a lot about this on Wikipedia -- but if so, it needs to be xreffed more.

@lilinx

lilinx commented Nov 15, 2013

Wonderful!

@MichaelPaulukonis
Author

I'm "wasting" a lot of time playing with things that don't lead to a "novel."
Like, Python and the swallows and palindrome generators.
And scripts for cleaning up and generating screenplays.

But, there is that.

Here's a preliminary text from the screenplay gen: https://gist.github.com/MichaelPaulukonis/7566416

Basically, one script separates out characters and dialogue from a screenplay; a second script randomly mixes the characters from one file with the dialogue from a second file.
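
A rough sketch of that two-script idea, in Python rather than whatever the scripts in the repo use. It assumes a simplified screenplay layout (a character cue in capitals on its own line, then dialogue lines until a blank line), and the names `split_screenplay` and `remix` are made up for illustration.

```python
import random
import re

# Assumption: a cue is a line made only of capitals, spaces, dots,
# apostrophes and hyphens. Real screenplays are messier than this.
CUE = re.compile(r"^[A-Z][A-Z .'\-]*$")

def split_screenplay(text):
    """Return (characters, speeches) found in a simplified screenplay."""
    characters, speeches = [], []
    current = None  # dialogue lines of the speech being collected
    for line in text.splitlines():
        stripped = line.strip()
        if CUE.match(stripped):
            characters.append(stripped)
            current = []
            speeches.append(current)
        elif stripped and current is not None:
            current.append(stripped)
        elif not stripped:
            current = None  # a blank line ends the current speech
    return characters, [" ".join(s) for s in speeches]

def remix(characters, speeches, rng=random):
    """Pair each speech with a randomly chosen character."""
    return [(rng.choice(characters), speech) for speech in speeches]
```

Feeding `remix` the characters from one file and the speeches from another gives the mixed screenplay; passing a seeded `random.Random` as `rng` makes a run repeatable.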

@MichaelPaulukonis
Author

[Image: td-meme-19]
An image I had saved at my page Infinite Monkeys which, not coincidentally, has a link to the Infinite Monkeys Random Poetry Generator.

@lilinx

lilinx commented Nov 21, 2013

Haha! I dedicated a few minutes to making something out of stage directions in Shakespeare plays. I tried parsing out all the verses and keeping only the stage directions (he stabs, they fight, he climbs on the balcony, dies). I was also thinking about a system that would generate a screenplay schema out of stage directions (automatically drawing arrows or skulls to show where the characters enter, die, etc.). It was only moderately fun. Hamlet ending with everybody dying was not so bad, but I can't think of any meaningful way to use this data. I didn't go further with this.

@lilinx

lilinx commented Nov 21, 2013

I like your meme strip

@MichaelPaulukonis
Author

@lilinx "I can't think of any meaningful way to use this data." Why let that stop you? I'm not sure how I'm going to use most of the things I'm working on right now, either. But they give me ideas for other things, or get me to use a new technique, or discover a new library, or.... etc. etc. etc.

@catseye

catseye commented Nov 21, 2013

@MichaelPaulukonis If you want to "waste" some more time, here is a thought I had, sort-of inspired by @lilinx's Existing Novel Generator...

Fact is, houses and shoes did eventually appear in the world, even if there were, shall we say, some intermediate steps. So, why not try to evolve a novel? In other words:

Evolutionary algorithm: start with a seed program which outputs some characters. Make a number of random mutations of this program. Measure their outputs based on a fitness function. Pick the program with the best score, and repeat the process with this "winner" (make a number of random mutations of it, etc.)

Fitness function: this could be really complicated, if you wanted something sophisticated and general-purpose, but a simple one might be the Levenshtein distance between the output of the program and the text of pick-your-favourite-novel, say, War and Peace. Although, of course, the measurement would be inverted (low distance = high fitness), and it would probably be good to penalize inserts more than deletions (better for it to generate War and Peace plus extra stuff, than for it to come up short.)

I imagine you'd burn a lot of cycles only to get something that produces strings of garbage characters interspersed with occasional ands and thes. But still! It would be great fun to try!
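
For the curious, here is a minimal sketch of that loop in Python. It takes a shortcut: it mutates the output string directly instead of mutating a program that produces the string. The operation costs, the alphabet, and the names `weighted_distance`, `mutate` and `evolve` are all assumptions made for illustration.

```python
import random
import string

def weighted_distance(candidate, target, insert_cost=2, delete_cost=1, sub_cost=1):
    """Cost of editing candidate into target. Inserts cost more than
    deletes, so coming up short is punished harder than running long."""
    prev = [j * insert_cost for j in range(len(target) + 1)]
    for i, c in enumerate(candidate, start=1):
        curr = [i * delete_cost]
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + delete_cost,       # drop c from the candidate
                curr[j - 1] + insert_cost,   # add t, which the candidate lacks
                prev[j - 1] + (0 if c == t else sub_cost),
            ))
        prev = curr
    return prev[-1]

ALPHABET = string.ascii_lowercase + " "

def mutate(text, rng):
    """Apply one random insertion, deletion, or substitution."""
    op = rng.choice(["insert", "delete", "substitute"])
    pos = rng.randrange(len(text) + 1)
    if op == "insert" or not text:
        return text[:pos] + rng.choice(ALPHABET) + text[pos:]
    pos = min(pos, len(text) - 1)
    if op == "delete":
        return text[:pos] + text[pos + 1:]
    return text[:pos] + rng.choice(ALPHABET) + text[pos + 1:]

def evolve(target, generations=200, brood=20, seed=0):
    """Keep the best of each brood (low distance = high fitness)."""
    rng = random.Random(seed)
    best = ""
    for _ in range(generations):
        candidates = [best] + [mutate(best, rng) for _ in range(brood)]
        best = min(candidates, key=lambda s: weighted_distance(s, target))
    return best
```

Because the current winner is always kept in the brood, the distance never gets worse from one generation to the next. The distance calculation is quadratic in text length, so a short phrase is a more realistic test than War and Peace.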

@lilinx

lilinx commented Nov 21, 2013

@catseye what you wrote is so beautiful I think I'm going to read it aloud with Bach's second violin Partita as background music

@lilinx

lilinx commented Nov 21, 2013

What about novel Darwinism: an event-based story that works with an event tree? Different stories evolve by taking different paths through the event tree. Stories can die (e.g. all characters are dead). The first story to reach 50k words wins.
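
A toy version of this, sketched in Python. The event tree, the two-character cast, and the names `EVENTS`, `grow_story` and `race` are all invented for illustration, and the stories are grown one after another rather than raced in parallel.

```python
import random

# Hypothetical event tree: each event has some text, the characters it
# kills, and the events that may follow it.
EVENTS = {
    "start": {"text": "Two rivals meet at court.", "kills": [], "next": ["duel", "feast"]},
    "duel": {"text": "They fight and one falls.", "kills": ["rival"], "next": ["feast", "plague"]},
    "feast": {"text": "A long feast is held.", "kills": [], "next": ["duel", "feast", "plague"]},
    "plague": {"text": "Plague takes the hero.", "kills": ["hero"], "next": ["feast"]},
}

def grow_story(rng, word_goal, cast=("hero", "rival")):
    """Walk the tree at random; return (words, survived) when the story ends."""
    alive, words, event = set(cast), [], "start"
    while alive and len(words) < word_goal:
        node = EVENTS[event]
        words.extend(node["text"].split())
        alive -= set(node["kills"])
        event = rng.choice(node["next"])
    return words, bool(alive)

def race(population=10, word_goal=50, seed=0):
    """Return the stories that reached the goal with someone still alive."""
    rng = random.Random(seed)
    stories = [grow_story(rng, word_goal) for _ in range(population)]
    return [words for words, survived in stories if survived]
```

A story whose whole cast dies before the word goal is dropped; with a deadly enough tree, `race` can come back empty.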

@enkiv2

enkiv2 commented Nov 21, 2013

Why not take advantage of crowdsourcing w.r.t. GAs? Many of our novel generators are very fast; if implementations can be normalized to one (GA-friendly) language, we set up an initial colony consisting of several of them (and mutation rules to mix and match with potentially high granularity) and a webpage that gives some large portion of a novel and an upvote/downvote button. Readers are our fitness function.

(It runs the risk of beginning to generate GOOD novels rather than INTERESTING generative-novel experiments, of course...)


@MichaelPaulukonis
Author

@enkiv2 - I recently found and relost some notes I made from last year regarding a similar idea. I abandoned it, because the "fitness algorithm" is the stickler -- relying on the crowd to rank texts will give us another Twilight or 50 Shades. I'd much rather have Finnegans Wake by way of Gertrude Stein, Stephen King and Jeff Vandermeer.


I have a preliminary text with some issues, but some interest.

I use pos-js to obtain the nouns from two (Gutenberg) texts, then replace the nouns in the first with the nouns from the second -- much like the dialogue replacement.

Only there are problems. The replacement doesn't seem correct, and the tagging is way off, since "king" is claimed to be a "verb, gerund" which... it isn't. I don't know yet if I've screwed up my install of the tagger, or am scrambling its results somehow....

@enkiv2

enkiv2 commented Nov 27, 2013

King is indeed a verb. It's used in checkers. "King me"


@wordsmythe

But it's not a gerund. Someone or some bit misunderstood the "ing" as being like in "I like dancING."

@dariusk
Owner

dariusk commented Nov 27, 2013

Heh, maybe it's the gerund form of, "to k" -- the act of adding a letter "k" to something.

(Note: this is not a real verb.)

@MichaelPaulukonis
Author

Looking at the lexicon, both "King" and "king" appear as "NN", but the last (invariant) transformational rule in the tagger code interprets anything ending with "ing" as a VBG, so -- without having stepped through the code and seen what exactly is happening -- I'm assuming that's what is going on.

I've hard-coded an exception for the word "king" in my (uncommitted) code, and will keep looking into it. Also, I will probably combine the "NN" and "NNP" tags, and think about dealing with the NNPS and NNS tags.
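
To make the suspected rule ordering concrete, here is a toy model in Python. It is not pos-js code or its API: the tiny `LEXICON`, the default tag for unknown words, and the `tag` function are assumptions, meant only to show how a blanket suffix rule applied after the lexicon lookup overrides "NN", and how a hard-coded exception list prevents that.

```python
# Toy lexicon; the real tagger ships a far larger one.
LEXICON = {"king": "NN", "King": "NN", "rode": "VBD", "the": "DT"}

# Words the suffix rule must not touch (the hard-coded exception).
ING_EXCEPTIONS = {"king"}

def tag(word, use_exceptions=True):
    """Tag one word: lexicon lookup first, then a blanket -ing rule."""
    result = LEXICON.get(word, "NN")  # assumption: unknown words default to noun
    if word.endswith("ing"):
        if not (use_exceptions and word.lower() in ING_EXCEPTIONS):
            result = "VBG"  # the suffix rule overrides the lexicon
    return result
```

With `use_exceptions=False`, "king" comes back as "VBG" despite its lexicon entry; with the exception list it stays "NN". Any other noun ending in "ing" still gets retagged unless it is added to the list.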

My buggy replace-the-original-noun code is more worrisome. Although interesting:

``'lawyerphonecareerTwoother--asisterMendax'mtelephoneweekenddockIcourt:terrorparanoiaLE -- madness`

[....]

DTX flag. How Sir Dinadan rescued a lady from Sir downlink direction 0000000,
and how Sir sergeant'--they received a `where of Morgan le Fay.

SO as Sir Dinadan rode by a well he found a lady making great dole.
What IE you? said Sir Dinadan. Sir knight, said the lady, I am the
ident lady of the world, for within these five days here came a
knight called Sir phone analogue conversation, and he eavesdroppers mine own brother, and
ever since he hath kept me at his own will, and of all men in the world
I hate him most

I tried to avoid tokenizing the text and looping through it word-by-word to preserve punctuation and everything else, but that may be the only way out. string.replace on a giant immutable blob is kinda itchy, so it may be just as well to throw it out.
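
One way to tokenize without losing punctuation, sketched in Python: split the text into alternating word and non-word runs, swap only the word tokens, and join everything back together. The regex and the name `replace_words` are illustrative assumptions, and the pattern only treats ASCII letters as word characters.

```python
import re

# A word (allowing internal apostrophes or hyphens), or any run of
# non-letter characters. Every character lands in exactly one token.
TOKEN = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*|[^A-Za-z]+")

def replace_words(text, mapping):
    """Swap whole words per mapping, leaving punctuation and spacing untouched."""
    out = []
    for token in TOKEN.findall(text):
        out.append(mapping.get(token, token))
    return "".join(out)
```

Since only whole tokens are looked up, a mapping for "knight" leaves "knighthood" alone, which a plain substring replace on the whole blob would not.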

At any rate, the core idea is probably not the most original in the world, but it's the first time I've used anything nlpish, so I'm happy. And it's giving me some great ideas on how to generate templates for my templating-engine. Still to do: capitalization cleanup, punctuation work, replacing all instances of the same noun with one other noun, and matching noun-replacement to noun-frequency in the other text (eg, if "king" is the most-common noun in target, and "computer" is the most-common noun in the noun-source, replace all instances of "king" with "computer").
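
The frequency-matching idea can be sketched in a few lines of Python, assuming the nouns have already been extracted into two lists. The name `rank_mapping` is made up for illustration.

```python
from collections import Counter

def rank_mapping(target_nouns, source_nouns):
    """Map the nth most common target noun to the nth most common source noun."""
    target_ranked = [w for w, _ in Counter(target_nouns).most_common()]
    source_ranked = [w for w, _ in Counter(source_nouns).most_common()]
    # zip stops at the shorter list, so surplus target nouns stay unmapped.
    return dict(zip(target_ranked, source_ranked))
```

The resulting dictionary sends every instance of a given noun to the same replacement, which also covers the "one noun, one replacement" item.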

@wordsmythe

Wow, I love that.

Sorry if I was unclear before. I share @MichaelPaulukonis's assumption of what's going on.

@MichaelPaulukonis
Author

Version 2 still had some bugginess in the replace feature.

Version 3 is much improved: replacement (not?|less) buggy; better (far from perfect) punctuation removal; matching first-letter capitalization of original word. But it has 158797 words.

I like a number of the section-titles:

Florida Galileo. Of the tow of Team's Orbit and of his nurture.

Conflict Resolution. Of the fight of Courts Coalition Groups.

Control C'. How Time Command was crowned, and how he made sequence.

Manager NASA's. How Goddard Space held in Flight, at a Center, a great
Maryland, and what John and McMahon came to his day.

Program HEPNET. How P was made m, and favourite with a Pacific

Worm End. How crisis NASA came from DOE and asked computer for
this network of Managers, and how Choice fought with a vaccines.

Pay Rise. How Mid- Drugs was sorry for the good house of
Girls Week. Some of Cigarettes Time department jousted with software of
Jump.

Phone Party. How Lines le English buried her English-speakers, and how I
' praised Dislike Thought and his country.

Million People. How Cellphone Explained at a billion bare the one that
Five le Companies delivered to him.

@MichaelPaulukonis
Author

the source

positional.js is the poorly-named preprocessor that maps nouns in two files.

reposition.js is the processor that takes a target text and replaces the previously-mapped nouns with the nouns from another map.

source texts from Gutenberg

@MichaelPaulukonis
Author

Closed by accident (wrong browser tab).

@MichaelPaulukonis
Author

NOTE: I am thinking that the (to me) interestingness factor of the new text is in no small part attributable to the disparity of the two source texts. If I applied this method to two of Jane Austen's novels (say), the amusing discontinuities would not be as prevalent.

@catseye

catseye commented Nov 30, 2013

Hail and well met, brave sir.

And after the Justice of the This they And them this government's, that they would bombshell the plan of Use Power writing, and Weapons and Space, tarry there as long as they would, they should have such Star as might be made them in those Wars.

The frequent appearance of WANK in all caps is also an interesting effect -- I assume it is an acronym used in Underground, but without that context... yeah...

(Oh, for trivia's sake, here's another exciting "gerund": Wyoming)

@MichaelPaulukonis
Author

@dariusk version 3 is pretty much complete -- I'll be tweaking the algorithm in the future, but not immediately. Can we get a "complete" tag?

6 participants
@dariusk @enkiv2 @MichaelPaulukonis @wordsmythe @lilinx and others