# Human Language

Human language is language used mostly by [humans](human.md) to communicate with each other; such languages are very hard for [computers](computer.md) to handle (only quite recently have [neural network](neural_net.md) computer programs become able to show what looks like true understanding of human language). They are studied by [linguists](linguistics.md). It is estimated (very roughly) that there are somewhere between 5000 and 7000 human languages. Human languages are most commonly **natural languages**, i.e. ones that evolved naturally over many centuries such as [English](english.md), [Chinese](chinese.md), French or [Latin](latin.md), but there also exist a great number of so-called **[constructed languages](conlang.md)** (*conlangs*), i.e. artificially made ones such as [Esperanto](esperanto.md), Interslavic or [Lojban](lojban.md). All of these are still human languages, different from e.g. [computer languages](computer_language.md) such as [C](c.md) or [XML](xml.md). Natural human languages practically always show significant irregularities (exceptions to general rules) while constructed languages typically try to eliminate irregularities as much as possible so as to be easier to learn, but even a constructed human language is still extremely difficult for a computer to understand.

Human language is a social construct, so according to [pseudoleftists](pseudoleft.md) it's an illusion, doesn't exist, doesn't work and has no significance.

Sadly, languages are often what most easily divides people into groups, and so they fuel [fascism](fascism.md), specifically [nationalism](nationalism.md).

**Why are human languages so hard for computers to handle?** Well, firstly there are minor annoyances like syntactic ambiguity, irregularities, redundancy and complex rules of grammar -- for example the sentence "I know Bob likes computers, and so does John." can either mean that John also knows that Bob likes computers, or that both Bob and John like computers. Things like this can be addressed by designing the [grammar](grammar.md) to be unambiguous, but analysis of already existing natural languages suffers from this. Furthermore in real life there are countless quirks of playing with language, things like sarcasm, parody, exaggeration, indirect hints, politeness, rhetorical questions, faux pas, memes and references. For example when we think of the imperative, we imagine sentences such as "Close the window." -- in real life we'll rather say something like "I'm cold, it wouldn't hurt to close the window.", i.e. something that's semantically an imperative but not syntactically. A dumb computer would deduce here that we are stating a fact, namely that closing the window will not hurt anyone; it takes human-like intelligence AND experience of how real life works, plus abilities such as guessing the feelings and plans of others, to correctly conclude that this sentence in fact means "Please close the window." Just try to talk to someone for a while and focus on what the sentences mean literally and what they actually imply. So things revolving around this pose the first issue, but a yet greater issue dwells in how to actually define the meanings of words -- human language is not just "text strings" as it might seem at first glance, behind the text strings lies a deep understanding of the extremely complex [real world](irl.md). More details on the issues of semantics will be given below.

**What is the most [LRS](lrs.md) human language?** This is not [settled](settled.md) yet but [Esperanto](esperanto.md) looks pretty cool. [English](english.md) is actually one of the most [suckless](suckless.md) languages, it's extremely easy and everyone speaks it -- it's not perfect but it is like [C](c.md) in programming, likely the best thing we have at the moment. As a part of [less retarded society](less_retarded_society.md) we should aim to create a constructed language that will be universally spoken by everyone and which, if at all possible, will solve the issue of the grand curse of language described below.
## The Grand Curse Of Human Language
{ The following is a thought dump made without much research, please inform me if you're a linguist or something and have something enlightening to say, thank you <3 ~drummyfish }

On one hand human languages are cool when viewed from a cultural or [artistic](art.md) perspective, they allow us to write poetry, describe feelings and the nature around us -- in this way they can be considered [beautiful](beauty.md). However from the perspective of others, e.g. programmers or historians, **human languages are a [nightmare](nightmare.md)**. There is unfortunately an **enormous, inherent curse connected to any human language**, natural and constructed alike, that comes from its inevitably [fuzzy](fuzzy.md) nature, which in turn stems from the fuzziness of real life concepts: it's the problem of **defining the [semantics](semantics.md)** of words and constructs (no, Lojban doesn't solve this). [Syntax](syntax.md) (i.e. the rules that say which sentences are valid and which are not) doesn't pose such a problem, we can quite easily define what's grammatically correct and what is not (it's not that hard to write a program that checks grammatical correctness). It is semantics (i.e. meaning) that is extremely hard to grasp -- even in rigorous languages (such as mathematical notation or programming languages) semantics is a bit harder to define (quite often still relying on bits of human language), but while in a programming language we are essentially able to define quite EXACTLY what each construct means (e.g. `a + b` returns the sum of values *a* and *b*), in a natural language we are basically never able to do that, we can only ever form fuzzy connections between other fuzzy concepts and we can never have anything fixed.
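
To illustrate how checking syntax is the easy part, here is a minimal sketch in Python. The tiny grammar (a sentence is just NOUN VERB NOUN) and both word lists are made up for this example and have nothing to do with any real language; the point is only that the checker happily accepts sentences whose meaning is dubious.

```python
# Toy grammar, assumed only for illustration: SENTENCE = NOUN VERB NOUN
NOUNS = {"fish", "book", "cat", "window"}
VERBS = {"eats", "reads", "closes"}

def is_grammatical(sentence: str) -> bool:
    """Return True if the sentence has the shape NOUN VERB NOUN."""
    words = sentence.lower().rstrip(".").split()
    return (
        len(words) == 3
        and words[0] in NOUNS
        and words[1] in VERBS
        and words[2] in NOUNS
    )

print(is_grammatical("Cat eats fish."))      # True
print(is_grammatical("Window reads fish."))  # True: valid syntax, dubious meaning
print(is_grammatical("Eats cat fish."))      # False
```

Nothing in the function knows what a window or reading is, so deciding whether "Window reads fish." makes sense is a different and much harder problem.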

Due to this fuzziness human languages inevitably change over time no matter how hard we try to counter it. Any text written a few thousand years ago is nowadays very hard to understand -- not because the old languages aren't spoken anymore, but because the original meanings of specific words, phrases and constructs have been distorted by time; when learning an old language we learn what each word meant by reading its translation to some modern word, but the modern word is always more or less different. Even with a very simple word such as "fish", our modern word means a slightly different thing than, let's say, the ancient Romans' word for fish, because that one had slightly different connotations and potential references to other things: fish for example used to be a symbol of Christianity, nowadays people don't even commonly make this connection. Fishermen were a despised class of workers, to some fish may have signified food and abundance, to others something that "smells bad", to others something or someone who's "slippery". Some words may have referred to some contemporary "[meme](meme.md)" that's been long forgotten and if some text makes the reference, we won't understand it. The word "book" for example meant something a bit different 2000 years ago than it means now: back then a book might have been just a relatively short scroll, it was expensive and people didn't read books the same way as we do today, they commonly just read them out loud to others, so "reading a book" and the word "book" itself don't conjure the same picture in our heads as they did back then.

Another example, showing the difference between languages existing at the same time, is this: while the Spanish word "perro" translates to English as "dog", the meanings aren't the same; some English speakers use the word as a synonym for "friend" but in Spanish the word can be used as an insult, so shouting "perro" and "dog" in the street may lead to different (possibly completely opposite) images popping up in the heads of those who hear it. How do you describe a word precisely if you can only describe it with other imprecise words that are changing constantly? No, not even pictures will help -- if you attach the picture of a cat to the word "cat", it's still not clear what it means -- does it stand for the picture of the cat or for the cat that's in the picture, does it stand ONLY for the one cat that's in the picture or for all other animals that are similar to the one in the picture? How similar? Is a lion a cat? Is a toy cat or a cartoon cat a cat? Or does the picture signify that anything with fur is a cat? If it looks like a cat but walks on two legs and speaks, is it still a cat? Now imagine describing a more abstract term such as *thought*, *number* or *existence*. There is no solid ground, even such essential words as "to want" or "to be" have different meanings between languages ("to be" can stand for "to exist", "to be in a place", "to temporarily have some property", "to permanently have some property" etc.). Even dictionaries admit defeat and are happy with having circular definitions because there aren't any foundations to build upon, circular definitions are inevitable, dictionaries just help you connect fuzzy concepts together. All of this extends to tenses, moods, cases and everything else.

This can be very well seen e.g. with people interpreting old texts such as the Bible, for example some say [Jesus](jesus.md) claimed to be the son of God while others reject it, saying that even if he said the sentence, it wasn't meant literally as it was a commonly used phrase that meant something else -- these people will argue about everything and they can comfortably interpret the same text in completely opposite ways. The point is that we just can't know.
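
The circularity of dictionary definitions can be demonstrated on a toy scale. The following Python sketch uses an invented mini-dictionary (each word maps to the words used in its definition, none of it taken from a real dictionary) and a depth-first search that follows definitions until it arrives back at a word it has already passed through.

```python
# Hypothetical mini-dictionary: word -> words used in its definition.
TOY_DICTIONARY = {
    "big": ["large"],
    "large": ["great", "size"],
    "great": ["big"],
    "size": ["how", "big"],
    "how": [],  # treated as having an empty definition here
}

def find_definition_cycle(dictionary, start):
    """Follow definitions depth-first from `start`.

    Returns a list of words forming a loop reachable from `start`
    (first and last word are the same), or None if no loop is found.
    """
    path = []
    on_path = set()

    def visit(word):
        if word in on_path:
            return path[path.index(word):] + [word]
        if word not in dictionary:
            return None  # word is not defined at all, dead end
        path.append(word)
        on_path.add(word)
        for used in dictionary[word]:
            cycle = visit(used)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(word)
        return None

    return visit(start)

print(find_definition_cycle(TOY_DICTIONARY, "big"))  # ['big', 'large', 'great', 'big']
print(find_definition_cycle(TOY_DICTIONARY, "how"))  # None
```

In this toy dictionary the only ways to avoid a loop are words with empty or missing definitions, which just moves the problem of meaning somewhere else.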

{ Just one more of the countless examples, one I recently encountered: it used to be generally believed that [Jesus](jesus.md) was crucified by being nailed to the cross through his palms, however it was shown this wouldn't work, and other evidence also showed that people were rather nailed through the arms, in a way that would hold the weight of the body but wouldn't hit the artery. The confusion came from translation -- the Greek word for "hand" also includes part of the arm, i.e. the word for hand in Greek means something different from the word for hand in some other languages. ~drummyfish }

In addition there are ALWAYS a great many hidden implicit assumptions that both communicating sides have to share to be able to communicate (and these can only be assured by many years of learning, spent in the same environment) -- for example if I tell someone "Drive to the city and buy food.", in fact I mean something like "Right now walk with your feet to our car, open the door, sit in, take the wheel in your hands, start the car, drive only on the road with your eyes open, ..."; the guy can technically satisfy my order by waiting 10 years, then driving a truck through forests with eyes closed around the whole globe and back. Just as it's impossible to perfectly define all words, it is impossible to explicitly recount all assumptions. Though the mentioned example is exaggerated, it shows an ever-present phenomenon we have to deal with, a phenomenon which can cause misunderstanding or be easily abused.

Of course this barrier exists between contemporary languages too, the idiom "lost in translation" exists for a reason -- translating something always loses or at least changes something. Translating one sentence over and over to different languages and back to the original one will most likely produce a sentence with a very different meaning.
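
The loss can be simulated even with a trivial word-for-word model. The Python sketch below uses two invented toy lexicons (they are NOT real dictionaries, just enough entries to show the effect): several English words collapse into one Spanish word, so the way back can't recover which one was meant.

```python
# Toy lexicons, made up for illustration only.
EN_TO_ES = {"hound": "perro", "dog": "perro", "buddy": "amigo", "friend": "amigo"}
ES_TO_EN = {"perro": "dog", "amigo": "friend"}

def translate(words, lexicon):
    """Word-for-word translation; words missing from the lexicon pass through unchanged."""
    return [lexicon.get(word, word) for word in words]

def round_trip(sentence):
    """Translate English -> Spanish -> English using the toy lexicons."""
    words = sentence.lower().split()
    there = translate(words, EN_TO_ES)
    back = translate(there, ES_TO_EN)
    return " ".join(back)

print(round_trip("my buddy has a hound"))  # my friend has a dog
```

Here only shades of meaning get lost ("buddy" vs "friend", "hound" vs "dog"); real translation additionally has to deal with grammar, idioms and context, so the drift is typically much worse.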

This is the grand issue that common people almost universally overlook, most will naively think that with careful effort it is possible to express oneself so clearly that others simply won't be able to misunderstand -- this is sadly false. Even with the most carefully crafted sentences language always makes it extremely easy for any word to be twisted, e.g. by politicians, into anything they want; it destroys old knowledge and prevents us from communicating with clarity and from recording ideas so that they would last into the future. This damnation of language plagues every book, authors constantly complain "I should have rather used this or that word" but that wouldn't even help, it's impossible to say something so as to not be misunderstood because human language is a weak, crippled tool just based on shouting weird sounds in hopes someone will get a vague idea of what's going on in your head. Due to this limitation of language it is absolutely worthless to keep discussing anything if after 5 minutes you don't come to an agreement: the discussion will lead nowhere, it's best to just leave it at communication being impossible, because even if linguistically you speak the same language, you cannot communicate correct meanings. Even words like "is", "when", "bad" or "will" will have absolutely different meanings, you would have to define every word of every sentence and then every word of every new sentence you produce for 1000 years until you come to circular definitions, at which point you'll still be disagreeing but won't even be able to waste time further.

This issue is very hard to solve, maybe impossible. It seems that due to the extreme complexity of [real life](irl.md) our language can't operate with precise equations but rather has to settle for concepts that are just fuzzy blobs that our brains -- [neural networks](neural_net.md) in our heads -- learn by trial and error over many years. We learn that if we hear the word *X*, it's best to react by feeling fear or turning our head or closing our eyes etc.

{ The only idea of a solution I have for making a "mathematically precise" human language for real world communication is the following. Firstly make a mathematical model of some artificial world that's similar to ours, for simplicity we can now just consider something like a 2D grid with differently colored cells, i.e. something like a [cellular automaton](cellular_automaton.md). The world changes in steps and each cell can "talk", i.e. at any frame it can emit a text string. Now make a language that's precisely defined in this world; if the world is simple, it's pretty doable e.g. like this: write a function in some programming language that takes the world and checks whether what the cells are saying classifies as your language used in a correct way within this world (so the function just returns *true/false*, nothing else is needed). Now this single function mathematically defines your language -- by looking at your function's source code anyone can derive the absolutely correct meaning of any word or sentence because he can see how the function checks whether that word or phrase is used correctly, he will know exactly which situations fit a given sentence and which don't. Now the final step is only to find a correspondence between real life and your simplified mathematical world, e.g. that cells represent humans and so on (but this will have shortcomings, e.g. our simple world will make it difficult or impossible to talk about body parts since cells have none; also making the connection between the mathematical world and the real world relies on intuition). ~drummyfish }
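
A very small sketch of this idea in Python might look as follows. Everything here is an assumption made for the sake of the example: the world is a single frame of a 2D grid of colors (the world doesn't change in steps), the whole "language" has just two sentence forms ("i am COLOR" and "neighbor is COLOR"), and the names `sentence_holds` and `language_used_correctly` are hypothetical.

```python
COLORS = {"red", "green", "blue"}

def neighbors(world, row, col):
    """Yield the colors of the up/down/left/right neighbors that lie inside the grid."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < len(world) and 0 <= c < len(world[0]):
            yield world[r][c]

def sentence_holds(world, row, col, sentence):
    """Check one sentence said by the cell at (row, col) against the world."""
    words = sentence.split()
    if len(words) == 3 and words[2] in COLORS:
        if words[:2] == ["i", "am"]:
            return world[row][col] == words[2]
        if words[:2] == ["neighbor", "is"]:
            return words[2] in neighbors(world, row, col)
    return False  # anything else is not part of the toy language

def language_used_correctly(world, utterances):
    """utterances maps (row, col) -> text emitted by that cell in this frame."""
    return all(
        sentence_holds(world, row, col, text)
        for (row, col), text in utterances.items()
    )

world = [
    ["red", "green"],
    ["blue", "red"],
]
print(language_used_correctly(world, {(0, 0): "i am red", (1, 1): "neighbor is green"}))  # True
print(language_used_correctly(world, {(0, 0): "i am blue"}))  # False
```

The meaning of "neighbor" is here fixed exactly by the source code (the four directly adjacent cells, diagonals excluded), which is precisely the kind of detail a human language would leave fuzzy.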

{ Yet another, maybe more practical idea would be to create a set of very few core words -- let's say 100 -- which we would try to define extremely precisely by all the current imperfect means but with very elevated effort, i.e. each word would have a detailed description, translations to 20 other natural languages, positive and negative examples, pictures attached etc. Then the rest of the language would be defined only using these core words. But maybe it wouldn't work -- the language would possibly be a bit more stable but would eventually degenerate as well. ~drummyfish }
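
At least the formal part of the core word idea can be checked mechanically. The Python sketch below uses a made-up handful of core words (standing in for the 100) and made-up definitions, and computes which words reduce entirely to core words, either directly or through other already reduced words.

```python
# Invented stand-in for the ~100 core words.
CORE_WORDS = {"thing", "living", "water", "in", "move", "not", "big"}

# Invented definitions: word -> words used to define it.
DEFINITIONS = {
    "animal": ["living", "thing", "move"],
    "fish": ["animal", "in", "water"],
    "whale": ["big", "animal", "in", "water", "mammal"],  # "mammal" is defined nowhere
}

def grounded_words(core, definitions):
    """Return the non-core words whose definitions reduce entirely to core words."""
    grounded = set(core)
    changed = True
    while changed:  # repeat until a whole pass adds nothing new
        changed = False
        for word, used in definitions.items():
            if word not in grounded and all(u in grounded for u in used):
                grounded.add(word)
                changed = True
    return grounded - set(core)

print(sorted(grounded_words(CORE_WORDS, DEFINITIONS)))  # ['animal', 'fish']
```

The check only guarantees that every word traces back to the core set; it says nothing about how precisely the core words themselves are understood, which is where the fuzziness would remain.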