Nonsense language | Real meaning

I’ve been playing with Storied, a voice-to-text writing tool, both on my own and recently with my daughter.

My writing process consists of:
1) consuming way too much information
2) trying to blurt it all out in one go
3) trying to make connections more explicit between topics and ideas I intuitively associated

One of the features I like about Storied is that it asks follow-up questions to help you see where to expand ideas, provide specific examples and strengthen the connections between topics.

Writing to me is about learning to express myself clearly, a lifelong challenge.

As my daughter learns to communicate she sometimes gets frustrated and exclaims “you’re not understanding”, which is different from “you’re not listening” and can also mean “you said no and I don’t like that”.

My daughter likes to make up words, especially long names and fanciful stories. I continually feel guilty I’m not capturing this creative enthusiasm.

When she talks to her grandparents, they sometimes think they need new hearing aids or a better connection, because she has the cadence and intonation of her pseudo-language down so well that it masks the nonsense words.

As her stories get more complex they sound more like Lewis Carroll’s Jabberwocky than baby babble.

I wondered how much better Storied could do at interpreting the tall tales than parents, teachers and grandparents. You’d maybe have to set up a RAG reference of Disney materials, the daycare class curriculum and family history to have a chance at accurately guessing the meanings.

I did two tests to explore how Storied manages true nonsense vs structured fantasy with nonsense words.

First I had my daughter “tell me your plan”, which resulted in her talking about colors she likes and tasks for the day.

Then in a separate recording I read the Jabberwocky to Storied.

My comparison test failed because the poem seems to have been recognized as a whole, and even the nonsense words were spelled “correctly” in the transcript. The synthesized output also perfectly outlines the Hero’s Journey.

I’d like to redo my test with a newly created scrambled story, maybe written like Mad Libs with replacement nouns and verbs. And just like that I find myself wanting to use synthetic data to test a real-world communication problem.
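Here is a rough sketch of what that scrambled story generator might look like in Python. Everything in it is my own assumption for illustration: the template sentence, the syllable list and the function names are made up, and none of it touches Storied itself. The idea is just to fill noun and verb slots with invented words so the story can't be recognized as a known text, and to keep an answer key to compare against the transcript.

```python
import random

# Hypothetical story template; the slots in braces get invented words.
TEMPLATE = (
    "The {noun1} wanted to {verb1} the {noun2}, "
    "so she decided to {verb2} all the way to the {noun3}."
)
SLOTS = ["noun1", "noun2", "noun3", "verb1", "verb2"]

# Made-up syllables to glue together into nonsense words.
SYLLABLES = ["bla", "mo", "fi", "zu", "ker", "plo", "nim", "ta", "wug", "shi"]


def make_nonsense_word(rng, n_syllables=2):
    """Join randomly chosen syllables into one invented word."""
    return "".join(rng.choice(SYLLABLES) for _ in range(n_syllables))


def scramble_story(template, slots, seed=0):
    """Fill every slot with a nonsense word.

    Returns the story plus the slot-to-word answer key, so the
    transcript spelling can be checked against what was actually read.
    """
    rng = random.Random(seed)  # seeded so the same story can be regenerated
    answer_key = {slot: make_nonsense_word(rng) for slot in slots}
    return template.format(**answer_key), answer_key


if __name__ == "__main__":
    story, answer_key = scramble_story(TEMPLATE, SLOTS, seed=42)
    print(story)
    print(answer_key)
```

Reading the printed story aloud and then checking the transcript against the answer key would show whether the tool is transcribing the sounds or falling back on words it already knows.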

There is probably no Mutually Exclusive and Collectively Exhaustive (MECE) testing plan for LLMs, and I’m no linguistics expert, but coming up with some ad hoc tests to determine the applicability and limitations of available tools can also be fun.
