NEW SAVANNA

Wednesday, April 24, 2024

At last! ChatGPT does, shall we say, interesting imitations of “Kubla Khan”

As you may know, I have a long-term interest in Coleridge’s “Kubla Khan” and have written a lot of posts about it. If you scan through those posts you’ll find that some of the more recent ones involve ChatGPT. Here’s my first attempt to have ChatGPT parody the poem:

How ChatGPT parodied “Kubla Khan” and pwned DJT45 at the same time

While the Chatster’s performance is not without interest, the parodies were not very good. A month and a half later I decided to give it another try:

Let’s give ChatGPT another crack at “Kubla Khan.”

Again, not without interest, but no cigar.

Meanwhile, I have managed to prompt ChatGPT into imitations of Wallace Stevens’ “Thirteen Ways of Looking at a Blackbird.” I did it by using the technique I’d used to investigate story-telling. I’d give it a story and then ask ChatGPT to use that as the basis for a new story where I specified that the new story have a different protagonist. ChatGPT could do whatever it wanted to do with the rest. That worked well for stories, and it worked well with Stevens as well. A month ago I asked ChatGPT to substitute “A.I.” for “blackbirds.” Just the other day I had it produce a poem about irises.

Perhaps it will work with “Kubla Khan.” I decided to restrict the exercise to the first 36 lines. First, I had it substitute “Elon Musk” for “Kubla Khan,” and then “FDR.” The results? Good poetry, no. Some of it is silly. But it’s more interesting than the earlier tries. Three cigars and a pop tart.

As always, I set my prompts in boldface and ChatGPT’s response in plain-face.

From Kubla Khan to Elon Musk

I’m going to give you the first 36 lines of “Kubla Khan.” I want you to base a new poem on that, using “Elon Musk” instead of “Kubla Khan.”

In Xanadu did Kubla Khan
A stately pleasure-dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
So twice five miles of fertile ground
With walls and towers were girdled round:
And there were gardens bright with sinuous rills,
Where blossomed many an incense-bearing tree;
And here were forests ancient as the hills,
Enfolding sunny spots of greenery.

But oh! that deep romantic chasm which slanted
Down the green hill athwart a cedarn cover!
A savage place! as holy and enchanted
As e'er beneath a waning moon was haunted
By woman wailing for her demon lover!
And from this chasm, with ceaseless turmoil seething
As if this earth in fast thick pants were breathing,
A mighty fountain momently was forced:
Amid whose swift half-intermitted burst
Huge fragments vaulted like rebounding hail,
Of chaffy grain beneath the thresher's flail:
And 'mid these dancing rocks at once and ever
It flung up momently the sacred river.
Five miles meandering with a mazy motion
Through wood and dale the sacred river ran,
Then reached the caverns measureless to man,
And sank in tumult to a lifeless ocean:
And 'mid this tumult Kubla heard from far
Ancestral voices prophesying war!

   The shadow of the dome of pleasure
   Floated midway on the waves;
   Where was heard the mingled measure
   From the fountain and the caves.
It was a miracle of rare device,
A sunny pleasure-dome with caves of ice!

In Tesla's realm, where dreams take flight,
Elon Musk decreed with might:
A wondrous hub of innovation, grand,
Where streams of thought would gently land.

Through circuits vast and wires deep,
Electric currents would swiftly creep,
From silicon valleys to skies above,
A symphony of progress and love.

So twice ten miles of tech terrain,
With factories and launchpads, a futuristic reign.
And there were labs with glowing screens,
Where ideas blossomed, futuristic dreams.

And here were forests of silicon trees,
Enfolding secrets, buzzing with ease.

But oh! That deep, electric abyss,
Where AI whispers in binary bliss.
A digital place, both dark and bright,
Where minds explore, day and night.

And from this abyss, with ceaseless spark,
As if the universe ignites in the dark,
A mighty surge, endlessly unfurled,
Of innovation, changing the world.

Amidst these circuits, where electrons race,
Humanity's future finds its place.
And through this maze, Elon heard from far,
The call of progress, like a guiding star.

The shadow of the rocket's trail,
Hovered above, without fail.
Where was heard the rumble and roar,
Of engines burning, yearning for more.

It was a marvel of modern lore,
A technological empire to explore!

KK to FDR

Do you know who FDR is?

Yes, FDR refers to Franklin D. Roosevelt, the 32nd President of the United States. He served from 1933 to 1945, making him the only president to serve four terms. FDR is known for his leadership during the Great Depression and World War II. He implemented the New Deal programs to combat economic depression and instituted numerous reforms, including Social Security. His presidency had a profound impact on American politics and society.

Good. I’m going to give you the first 36 lines of “Kubla Khan.” I want you to base a new poem on that, using “FDR” instead of “Kubla Khan.”

[The first 36 lines of "Kubla Khan" go here, as above.]

In Xanadu did FDR stand
A leader mighty, bold and grand,
Where laws, like rivers, freely ran
Through chambers vast, a sacred plan
Down to a nation's sea.

So twice four terms of governance he found,
With policies and programs girdled round:
And there were acts bright with promise still,
Where blossomed dreams of a nation's will;
And here were reforms, ancient as the hills,
Enfolding hope in valleys and on hills.

But oh! that deep, tumultuous time which raged
Down the slopes of despair, by history engaged!
A turbulent era, fraught and enchanted,
As e'er beneath a waning moon was planted
By people yearning for a brighter day!
And from this time, with ceaseless change unfurled,
As if the nation in fast-paced swirls were hurled,
A mighty movement, momentous, forced:
Amid whose swift and ever-changing course
Great challenges arose like thunderous hail,
Of trials beneath the leader's tireless tale:
And 'mid these tumults, through trials and endeavor,
He navigated the nation ever.

Four terms, meandering with a mazy motion,
Through trials and triumphs, the leader's devotion,
Then reached the caverns, challenges to span,
And sank in tumult, faced by a world at war's plan:
And 'mid this tumult, FDR heard from far
Ancestral voices, echoing a call to mar!

The shadow of leadership, dome of governance,
Floated midway, amidst the expanse,
Where was heard the mingled measure
Of struggle and triumph, of loss and treasure.
It was a miracle of leadership, concise,
A steady hand guiding through storms and ice.

Pink! Pink! Pink! [Sakura]

Basketball GOAT: Jordan or James?

I've been thinking about GOATs – Greatest Of All Time – in response to Tyler Cowen's book, GOAT: Who is the Greatest Economist of all Time and Why Does it Matter? I was looking over the NYTimes yesterday, as I am wont to do, and came across this article in The Athletic, their sports affiliate: Who is the NBA GOAT: LeBron or Jordan? For current players, longevity is narrowing the gap (by Sam Amick and Josh Robbins, Apr. 23, 2024). So I thought I'd quote some passages just to see how NBA players think about the issue:

While Michael Jordan won the “Greatest of All Time” category for the third consecutive time, his once-massive lead over LeBron James has shrunk significantly with every passing poll. This time around, James almost took the mantle. The data speaks loud and clear...

2019 (the first time The Athletic conducted the poll): Jordan earned 73 percent of the votes, with James second at 11.9 percent (a gap of 61.1 percentage points)

2023: Jordan earned 58.3 percent of the votes, with James second at 33 percent (a gap of 25.3 percent)

2024: Jordan earned 45.9 percent of the votes, with James second at 42.1 percent (a gap of just 3.8 percent)

But why has Jordan’s lead shrunk so much? We wanted to let the players themselves explain.

The consistent rationale among LeBron voters, both old and new, is that his longevity is the ultimate difference-maker between the two. He’ll be 40 years old on Dec. 30, yet is still great enough to be widely considered one of the best players in today’s game. While Jordan was epic in his 14-year career, from his 6-0 record in the NBA Finals to his five Most Valuable Player awards and his incredible two-way play, many players shared the view that James’ ability to remain elite for more than two decades puts him over the top.

Jordan, to review, retired twice (in 1993 and 1998) during his storied career and played 14 seasons in a 19-year span. When he was James’ age, in the last of his two forgettable seasons in Washington, he was putting up good numbers on a bad Wizards team that went 37-45 in both of his postseason-less campaigns. James, meanwhile, has saved some of his best work for last:

He broke Kareem Abdul-Jabbar’s all-time scoring record on Feb. 7, 2023
He became the first player to be named to a 20th All-Star team in February
He was one of three players to average at least 25 points, eight assists and seven rebounds this season (the others were Nikola Jokić and Luka Dončić)

Out of respect for the GOAT incumbent, we’ll begin by highlighting this nuanced opinion from a Jordan voter who believes MJ’s influence on the entire sports world — not just basketball — is a deciding X-factor.

“The greatest ever is LeBron James, (but) the greatest of all time is Michael Jordan,” the player said. “The difference is stats. When you talk about impact, Michael Jordan. When you talk about stats and numbers, LeBron. Mike has the most impact, so that makes him the greatest ever in all aspects because he doesn’t just impact basketball. He impacts people who look up to him in tennis and football. But you won’t hear that about LeBron. ... LeBron changed the game, but more so how it’s played. Jordan changed how it’s viewed. And that’s a big difference.”

I find that last remark particularly interesting. One of Cowen's criteria for economic greatness was influencing the world of thinking beyond economics. Preferring Jordan because of his influence on sports in general is the same kind of consideration, no?

There's more at the link.

Tuesday, April 23, 2024

Irises & ChatGPT@3QD

I’ve posted a new article at 3 Quarks Daily:

The Irises Are Blooming Early This Year

Yes, it IS about irises, and contains photos of irises, but it manages to work its way to ChatGPT, where I include another parody of Wallace Stevens’s famous “Thirteen Ways of Looking at a Blackbird.” Last month it became a poem about an A.I. Now it’s about irises.

Another Stevens imitation

Thirteen Ways

By Wallace Stevens

By ChatGPT

I

Among twenty snowy mountains,

The only moving thing

Was the eye of the blackbird.

Among twenty blooming irises,

The only stirring thing

Was the flutter of a butterfly.

II

I was of three minds,

Like a tree

In which there are three blackbirds.

I was of three minds,

Like a garden

Where there are three irises.

III

The blackbird whirled in the autumn winds.

It was a small part of the pantomime.

The iris swayed in the spring breeze.

It was a small part of the dance.

IV

A man and a woman

Are one.

A man and a woman and a blackbird

Are one.

A gardener and a sunbeam

Are one.

A gardener and a sunbeam and an iris

Are one.

V

I do not know which to prefer,

The beauty of inflections

Or the beauty of innuendoes,

The blackbird whistling

Or just after.

I do not know which to admire more,

The elegance of petals

Or the subtlety of colors,

The iris blooming

Or just before.

VI

Icicles filled the long window

With barbaric glass.

The shadow of the blackbird

Crossed it, to and fro.

The mood

Traced in the shadow

An indecipherable cause.

Raindrops adorned the petals

With delicate jewels.

The silhouette of the iris

Danced upon them, back and forth.

The feeling

Traced in the silhouette

A mystery beyond words.

VII

O thin men of Haddam,

Why do you imagine golden birds?

Do you not see how the blackbird

Walks around the feet

Of the women about you?

O wise gardeners of old,

Why do you dream of other flowers?

Do you not see how the iris

Swirls around the stems

Of the flowers beside it?

VIII

I know noble accents

And lucid, inescapable rhythms;

But I know, too,

That the blackbird is involved

In what I know.

I know the language of flowers

And the undeniable pulse of nature;

But I know, too,

That the iris is entwined

In all that I comprehend.

IX

When the blackbird flew out of sight,

It marked the edge

Of one of many circles.

When the iris swayed out of view,

It marked the boundary

Of one of many moments.

X

At the sight of blackbirds

Flying in a green light,

Even the bawds of euphony

Would cry out sharply.

At the sight of irises

Dancing in the sunlight,

Even the cynics of beauty

Would pause in wonder.

XI

He rode over Connecticut

In a glass coach.

Once, a fear pierced him,

In that he mistook

The shadow of his equipage

For blackbirds.

She strolled through the garden,

Lost in thought.

Once, a joy overcame her,

As she mistook

The scent of the blooms

For irises.

XII

The river is moving.

The blackbird must be flying.

The breeze is stirring.

The iris must be swaying.

XIII

It was evening all afternoon.

It was snowing

And it was going to snow.

The blackbird sat

In the cedar-limbs.

It was morning all evening.

It was blossoming

And it was going to blossom.

The iris stood

In the garden's embrace.

Meta: A word about process

Some of the articles I write for 3 Quarks are like pulling teeth. I’ll make notes, sketch outlines, draft sections, perhaps as a post here at New Savanna, and then assemble the pieces into the final article on the Saturday and Sunday before the article shows up on Monday. This is one of those pieces: Western Metaphysics is Imploding. Will We Raise a Phoenix from The Ashes? [Catalytic AI]. I liked it a lot. But friends tell me it left them a bit mystified.

Other pieces come easy. This one for example: Old School: Torpor and Stupor at Johns Hopkins. That was a while ago, so the writing process is not clear in my mind. But I’m pretty sure it’s one of those pieces where I thought about it a bit, did a little web surfing (in that case, I had to get the photo and a link or two) and then just sat down and drafted it. No doubt I stepped away from the computer every now and then, but it was basically one work session. I wrote a draft Sunday morning and early afternoon, checked it over, and then uploaded it.

Those two pieces are quite different in kind. The Western Metaphysics piece developed a complex argument whereas Torpor and Stupor was narrative in kind. Complex arguments require a complex web of connections between the various pieces. That’s hard to do and requires you to flit back and forth making things fit and relate. Narratives have a simpler structure. Torpor and Stupor didn’t tell a single continuous story. Rather, it was organized as a set of vignettes, each of them a little narrative. There was no argument to speak of. Just an overall flow.

This irises piece was closer to the come-easy kind than the pulling-teeth kind. I had some points to make, but I made them more through analogy and metaphor than explicit argument. It had three sections. ChatGPT’s Stevens imitation went in the middle. I prepared that on Friday evening and made a few notes. More notes on Saturday. But I didn’t start writing until Sunday morning, and then I drafted it from beginning to end over the course of, say, two to three hours.

The most interesting thing about the process was the decision I made on Sunday morning to query ChatGPT about the nature of blossoms. I didn’t need to do that as I already more or less knew the story. But in so doing I was able to introduce ChatGPT into the exposition and thus prepare the way for the poem. That opened the way for the concluding discussion about DNA, strings, and complexity.

I wonder how LLMs manage different kinds of discourse? That’s what’s in the back of my mind.

Yellow Tulips, WAM!

Current Perspectives on Abstract Concepts and Future Research Directions

Banks, B., Borghi, A. M., Fargier, R., Fini, C., Jonauskaite, D., Mazzuca, C., Montalti, M., Villani, C., & Woodin, G. (2023). Consensus Paper: Current Perspectives on Abstract Concepts and Future Research Directions. Journal of Cognition, 6(1): 62, pp. 1–26. DOI: https://doi.org/10.5334/joc.238

Abstract: Abstract concepts are relevant to a wide range of disciplines, including cognitive science, linguistics, psychology, cognitive, social, and affective neuroscience, and philosophy. This consensus paper synthesizes the work and views of researchers in the field, discussing current perspectives on theoretical and methodological issues, and recommendations for future research. In this paper, we urge researchers to go beyond the traditional abstract-concrete dichotomy and consider the multiple dimensions that characterize concepts (e.g., sensorimotor experience, social interaction, conceptual metaphor), as well as the mediating influence of linguistic and cultural context on conceptual representations. We also promote the use of interactive methods to investigate both the comprehension and production of abstract concepts, while also focusing on individual differences in conceptual representations. Overall, we argue that abstract concepts should be studied in a more nuanced way that takes into account their complexity and diversity, which should permit us a fuller, more holistic understanding of abstract cognition.

From the article:

For example, when contrasted with concrete concepts, abstract concepts are typically expressed by words with a later Age of Acquisition, and through linguistic explanations rather than denoting their referents directly (linguistic Modality of Acquisition; Wauters et al., 2003). They also tend to be less imageable, have lower Body Object Interaction scores (BOI: Tillotson et al., 2008; Pexman et al., 2019), and be less easily linked to specific contexts (contextual availability; Schwanenflugel & Stowe, 1989). Abstract concepts are also more variable across participants and cultures (Wang & Bi, 2021) and are generally less iconic (Lupyan & Winter, 2018) than concrete concepts.

Later:

The multidimensional nature of abstract concepts means that defining them purely based on whether they are perceivable or not (i.e., as concrete or abstract) fails to capture their complexity (e.g., Barsalou, Dutriaux & Scheepers, 2018; Borghi et al., 2017), and indeed can even be misleading. Banks and Connell (2022) used the Brysbaert et al. (2014) concreteness ratings to analyze the structure of semantic categories collected in a category production (semantic fluency) task, examining the concreteness of the concepts that comprise ostensibly concrete (e.g., animal, furniture) and abstract (e.g., science, unit of time) categories. Although members of concrete categories overall were more highly rated on concreteness, many (e.g., metal: silver, hat: beret) unexpectedly had similarly high concreteness ratings to more abstract category members (e.g., profession: lawyer, social relationship: teammate). Indeed, certain abstract concepts such as beauty or fitness have been associated with sensory and motor areas of the brain (temporo-occipital visual and fronto-parietal motor areas, respectively; Harpainter et al., 2020). Furthermore, when sensorimotor experience is measured via multiple individual modalities (e.g., Lynott et al., 2020; Speed & Brysbaert, 2021; Vergallito et al., 2020), the concrete-abstract distinction becomes even less clear. When the verbally-produced category members from Banks and Connell (2022) were analyzed based on their grounding in multiple perceptual modalities (vision, hearing, touch, smell, taste, interoception) and actions involving specific parts of the body (the head, hands/arms, feet/legs, torso and mouth) many abstract category members were in fact found to be strongly grounded in sensorimotor experience (e.g. sport, social gathering, art form; Banks & Connell, 2021) – that is, the concrete-abstract distinction was much less apparent.

Comment: I note, as an extreme example, that sodium chloride is a concrete physical substance, but the concept is abstract, as opposed to the concept, salt, which is concrete. Less extreme: animals are all physical things, but the concept, animal, seems to be abstractly defined; the same goes for plant. Try to produce compact physical descriptions that encompass all plants or all animals. It is between difficult and impossible. What all animals seem to have in common are the roles they can play with respect to verbs such as see, hear, smell, run, jump, eat, and so forth, in contrast to plants and mere physical objects. Similarly, plants can live, grow, and die, while physical objects cannot. And then we have terms such as chair and table, which seem best defined in terms of their affordances for people rather than their physical characteristics, which can vary widely.

The article continues with some more discussion and offers this: "Many theories have also argued that our understanding and representation of abstract concepts relies more on language than the sensorimotor dimension, and particularly linguistic distributional relations (e.g., Borghi, 2020; Crutch & Warrington, 2005; Dove et al., 2020; Vigliocco et al., 2009)."

And so forth. An interesting and useful piece of work.

Monday, April 22, 2024

Sex and the City Redux

Claire Moses, As ‘Sex and the City’ Ages, Some Find the Cosmo Glass Half-Empty, NYTimes, April 13, 2024.

Twenty years since the series finale of “Sex and the City” aired, a new generation of television watchers has grown into adulthood. After all of the episodes were released on Netflix this month, media watchers wondered how the show — and Carrie’s behavior — might hold up for Gen Z.

Would they be able to handle the occasional raunchiness of the show, the sometimes toxic relationships? Were the references outdated? “Can Gen Z Even Handle Sex and the City?” Vanity Fair asked. (For its part, Gen Z seems to vacillate between being uninterested and lightly appalled about what they consider to be a period piece.)

The show had a very different effect on its longtime fans, many of them a generation or two older. When it aired, “Sex and the City” changed the conversation around how women dated, developed friendships and moved about the world in their 30s and 40s.

Even if some of the show’s character arcs aged poorly, many of its original fans still relate to Carrie, Samantha, Charlotte and Miranda, no matter how unrealistic it may have been to live on the Upper East Side with a walk-in closet full of Manolo Blahniks on the salary of a weekly newspaper columnist.

What Candace Bushnell thinks:

“There was a romance to dating that younger women tell me doesn’t really exist anymore,” Bushnell said in a phone interview. “Now internet dating and using dating apps — it feels more like a job.”

For Carrie and her friends, dating is more of a pastime: They meet men at gallery openings, cocktail parties, book launches, a Yankees game, the gym, and more. The four of them also have weekly brunches and endless cocktails where they dish about their latest exploits.

Bushnell, who is touring her one-woman show “True Tales of Sex, Success and Sex and the City,” said that the show gave people a new way of looking at their romantic lives.

The test of time is a hard one to pass, and the show’s record is far from perfect. But its frank discussions of sex and gendered expectations seemed to open doors for other shows after it, including “Girls” and “Insecure,” and helped change the image of single women in their 30s.

Others:

For longtime fans who are now Carrie’s age or older, the show has gone from aspirational to relatable to recognizable — again, minus those hundreds of pairs of stilettos.

Watching the show now, Marta Barberini, 37, said, “you’re not talking about your future self; you’re talking about your present self.”

See my earlier Media Notes post: The Seven Year Itch, Mad Men, Sex and the City [Media Notes 117].

Hoboken sky flower

Innovation through prompting: Democratizing educational technology

Ethan Mollick and Lilach Mollick, Instructors as Innovators: A future-focused approach to new AI learning opportunities, with prompts, Social Science Research Network, April 21, 2024:

Abstract: This paper explores how instructors can leverage generative AI to create personalized learning experiences for students that transform teaching and learning. We present a range of AI-based exercises that enable novel forms of practice and application including simulations, mentoring, coaching, and co-creation. For each type of exercise, we provide prompts that instructors can customize, along with guidance on classroom implementation, assessment, and risks to consider. We also provide blueprints, prompts that help instructors create their own original prompts. Instructors can leverage their content and pedagogical expertise to design these experiences, putting them in the role of builders and innovators. We argue that this instructor-driven approach has the potential to democratize the development of educational technology by enabling individual instructors to create AI exercises and tools tailored to their students' needs. While the exercises in this paper are a starting point, not definitive solutions, they demonstrate AI's potential to expand what is possible in teaching and learning.

Here's a substack column by Ethan Mollick that is based on that paper: Innovation through prompting: Democratizing educational technology, April 22, 2024.

Three contrasting structures: Hoboken Manhattan Hoboken

Lex Fridman talks with Ted Gibson about language, LLMs, and other things [i.e. Linguistics 101 for LLMs]

This is a long podcast and I’ve not listened to all of it. I’m excerpting part of the conversation – which I’ve taken from this transcript – that speaks to two of my hobby horses: 1) the apparent lack of linguistic knowledge in the LLM community, and 2) the idea that LLMs are built on relationships between words. On the first point, I’m using Lex Fridman as a proxy for the LLM community, though he does not work on LLMs. But he is trained in computer science and machine learning, is generally familiar with LLMs, and has interviewed a number of experts in machine learning. I was just a little surprised to hear that the idea of sentence structure being tree-like seemed new to him. I thought “everyone” knew that, where by “everyone” I mean anyone in the last 50 to 60 years with a technical interest in how the mind works.

The second point is where things get interesting. In the discussion of language Gibson hammers home the distinction between form and meaning in language. And then, in discussing LLMs, he talks about them as being based on language forms, but not meaning. I think that’s right – keeping in mind that I use those terms a bit differently than they’re being used here (see this post, for example), but not in a way that’s inconsistent with what I believe Gibson is saying.

NOTE: The machine-generated transcript has the names reversed (as of today, Apr. 22), so I have corrected that. I have not, however, checked it against the video for accuracy.

Linguistic knowledge and dependency theory

Let’s start with a bit from the beginning of the conversation:

LEX FRIDMAN: (00:03:23) Did you ever come across the philosophy angle of logic? If you think about the 80s with AI, the expert systems where you try to maybe sidestep the poetry of language and some of the syntax and the grammar and all that kinda’ stuff and go to the underlying meaning that language is trying to communicate and try to somehow compress that in a computer representable way? Did you ever come across that in your studies?

EDWARD GIBSON: (00:03:50) I probably did but I wasn’t as interested in it. I was trying to do the easier problems first, the ones I thought maybe were handleable, which seems like the syntax is easier, which is just the forms as opposed to the meaning. When you’re starting talking about the meaning, that’s a very hard problem and it still is a really, really hard problem. But the forms is easier. And so I thought at least figuring out the forms of human language, which sounds really hard but is actually maybe more tractable.

LEX FRIDMAN: (00:04:19) It’s interesting. You think there is a big divide, there’s a gap, there’s a distance between form and meaning, because that’s a question you have discussed a lot with LLMs because they’re damn good at form.

EDWARD GIBSON: (00:04:33) Yeah, I think that’s what they’re good at, is form. And that’s why they’re good, because they can do form, meanings are …

LEX FRIDMAN: (00:04:39) Do you think there’s … Oh, wow. It’s an open question.

EDWARD GIBSON: (00:04:42) Yeah.

LEX FRIDMAN: (00:04:43) How close form and meaning are. We’ll discuss it but to me studying form, maybe it’s a romantic notion it gives you. Form is the shadow of the bigger meaning thing underlying language. Language is how we communicate ideas. We communicate with each other using language. In understanding the structure of that communication, I think you start to understand the structure of thought and the structure of meaning behind those thoughts and communication, to me. But to you, big gap.

This is basic, very basic. It’s not that “form is the shadow of the bigger meaning” but that syntax is (based on) the form of meaning, relationships between semantic elements.

Now we’re a bit later in the conversation where Gibson is talking about syntactic dependency.

LEX FRIDMAN: (00:10:59) [...] There’s so many things I want to ask you. Okay, let me just some basics. You mentioned dependencies a few times. What do you mean by dependencies?

EDWARD GIBSON: (00:11:12) Well, what I mean is in language, there’s three components to the structure of language. One is the sounds. Cat is C, A and T in English. I’m not talking about that part. Then there’s two meaning parts, and those are the words. And you were talking about meaning earlier. Words have a form and they have a meaning associated with them. And so cat is a full form in English and it has a meaning associated with whatever a cat is. And then the combinations of words, that’s what I’ll call grammar or syntax, that’s when I have a combination like the cat or two cats, okay, where I take two different words there and put together and I get a compositional meaning from putting those two different words together. And so that’s the syntax. And in any sentence or utterance, whatever, I’m talking to you, you’re talking to me, we have a bunch of words and we’re putting them together in a sequence, it turns out they are connected, so that every word is connected to just one other word in that sentence. And so you end up with what’s called technically a tree, it’s a tree structure, where there’s a root of that utterance, of that sentence. And then there’s a bunch of dependents, like branches from that root that go down to the words. The words are the leaves in this metaphor for a tree.

LEX FRIDMAN: (00:12:34) A tree is also a mathematical construct.

EDWARD GIBSON: (00:12:37) Yeah. It’s graph theoretical thing, exactly.

LEX FRIDMAN: (00:12:38) A graph theory thing. It’s fascinating that you can break down a sentence into a tree and then every word is hanging onto another, is depending on it. [...]

LEX FRIDMAN: (00:13:05) Can I pause on that?

EDWARD GIBSON: (00:13:06) Sure.

LEX FRIDMAN: (00:13:06) Because to me just as a layman, it is surprising that you can break down sentences in mostly all languages.

Again, this is fundamental. I suppose I had something of a preview of this sort of thing when I learned Reed-Kellogg sentence diagramming in the sixth grade. It’s still kicking around – there’s lots of stuff about it on the web – but I don’t know how routinely it’s taught these days. I learned about syntactic trees in my sophomore year when I took a course in psycholinguistics.
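For readers who have not met the idea, here is a minimal Python sketch of the kind of tree Gibson describes: every word hangs off exactly one other word (its head), and one word is the root. The sentence, the head assignments, and the function names are my own illustrative assumptions. The parse was written by hand, not produced by any parser, and nothing here comes from the podcast itself.

```python
# A toy dependency parse, written by hand for illustration (no parser involved).
# Each word points to exactly one head; the root's head is None.
SENTENCE = ["the", "cat", "chased", "two", "dogs"]
HEADS = [1, 2, None, 4, 2]  # index of each word's head in SENTENCE


def is_tree(heads):
    """Check that there is one root and every word reaches it without cycles."""
    roots = [i for i, h in enumerate(heads) if h is None]
    if len(roots) != 1:
        return False
    for start in range(len(heads)):
        seen = set()
        node = start
        while heads[node] is not None:
            if node in seen:  # we looped back: a cycle, so not a tree
                return False
            seen.add(node)
            node = heads[node]
    return True


def dependents(heads, head_index):
    """Return the indices of the words that hang directly off head_index."""
    return [i for i, h in enumerate(heads) if h == head_index]


def show(words, heads, node=None, depth=0):
    """Return the tree as indented lines, root first."""
    if node is None:
        node = heads.index(None)
    lines = ["  " * depth + words[node]]
    for child in dependents(heads, node):
        lines.extend(show(words, heads, child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(show(SENTENCE, HEADS)))
```

Run as a script, this prints the verb at the top with "cat" and "dogs" indented beneath it, and "the" and "two" one level further down. In this sketch every node is a word, which is one common way of drawing dependency structure.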

LEX FRIDMAN: (00:18:22) I love the terminology of agent and patient and the other ones you used. Those are linguistic terms, correct?

EDWARD GIBSON: (00:18:29) Those are for meaning, those are meaning. And subject and object are generally used for position. Subject is just the thing that comes before the verb and the object is the one that comes after the verb. The agent is the thing doing, that’s what that means. The subject is often the person doing the action, the thing.

LEX FRIDMAN: (00:18:48) Okay, this is fascinating. How hard is it to form a tree in general? Is there a procedure to it? If you look at different languages, is it supposed to be a very natural … Is it automatable or is there some human genius involved in construction …

EDWARD GIBSON: (00:19:01) I think it’s pretty automatable at this point. People can figure out what the words are. They can figure out the morphemes, technically morphemes are the minimal meaning units within a language, okay. And so when you say eats or drinks, it actually has two morphemes in English. There’s the root, which is the verb. And then there’s some ending on it which tells you that’s the third person singular.

I think that anyone working with LLMs should be conversant with the distinction between the meaning-bearing aspect of language and the positional aspect. They may not need this familiarity to work with transformers, but they should know that the distinction is basic to language mechanism. After all, the positionality of tokens is something that is central to the transformer architecture. They should know that it’s central to language itself and not just an aspect of the transformer architecture.
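A small Python sketch may make the distinction concrete. It is a toy under stated assumptions, not a model of how a transformer works: whitespace splitting stands in for a real tokenizer, and the two sentences and function names are my own. It shows that two sentences can contain exactly the same words, so that a representation ignoring position cannot tell them apart, while differing in who is doing what to whom.

```python
from collections import Counter


def bag_of_words(sentence):
    """Count the words in a sentence, ignoring where they occur."""
    return Counter(sentence.lower().split())


def with_positions(sentence):
    """Pair each word with its position in the sentence."""
    return list(enumerate(sentence.lower().split()))


A = "the dog chased the cat"
B = "the cat chased the dog"

if __name__ == "__main__":
    # Same words, so the position-free view cannot tell the sentences apart.
    print(bag_of_words(A) == bag_of_words(B))
    # Once position is attached to each word, the sentences differ.
    print(with_positions(A) == with_positions(B))
```

In English the word before the verb is the subject and is often the agent, so swapping "dog" and "cat" changes the meaning even though the inventory of words is unchanged. That is the sense in which position does work in the language itself.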