Human Language
Human language is language used mostly by humans to communicate with each other; such languages are very hard for computers to handle (only quite recently have neural network programs become able to show something like true understanding of human language). They are studied by linguists. Human languages are most commonly natural languages, i.e. ones that evolved naturally over many centuries, such as English, Chinese, French or Latin, but there also exist a great number of so-called constructed languages (conlangs), i.e. artificially made ones such as Esperanto, Interslavic or Lojban. All of these are still human languages, different from e.g. computer languages such as C or XML. Natural human languages practically always show significant irregularities (exceptions to general rules) while constructed languages typically try to eliminate irregularities as much as possible so as to be easier to learn, but even a constructed human language is still extremely difficult for a computer to understand.
The Grand Curse Of Human Language
{ The following is a thought dump made without much research, please inform me if you're a linguist or something and have something enlightening to say, thank you <3 ~drummyfish }
On one hand human languages are cool when viewed from a cultural or artistic perspective, they allow us to write poetry, describe feelings and nature around us -- in this way they can be considered beautiful. However from the perspective of others, e.g. programmers or historians, human languages are a nightmare. There is unfortunately an enormous, inherent curse connected to any human language, natural or constructed, that comes from its inevitably fuzzy nature stemming from the fuzziness of real life concepts: it's the problem of defining the semantics of words and constructs (no, Lojban doesn't solve this). Syntax (i.e. the rules that say which sentences are valid and which are not) doesn't pose such a problem, we can quite easily define what's grammatically correct and what is not (it's not as hard to write a program that checks grammatical correctness); it is semantics (i.e. meanings) that is extremely hard to grasp. Even in rigorous languages (such as mathematical notation or programming languages) semantics is a bit harder to define (quite often still relying on bits of human language), but while in a programming language we are essentially able to define quite EXACTLY what each construct means (e.g. a + b returns the sum of values a and b), in a natural language we are basically never able to do that, we can only ever form fuzzy connections between other fuzzy concepts and we can never have anything fixed.
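To illustrate what "defining quite EXACTLY what each construct means" can look like, here is a minimal sketch of a made-up toy language with only integers and +, whose whole semantics is one short function. The tuple representation and the name evaluate are assumptions chosen just for this illustration, not anything standard.

```python
# Hypothetical toy language (an assumption for illustration only):
# an expression is either an int or a tuple ("+", left, right).

def evaluate(expr):
    """Return the meaning (the value) of an expression in the toy language."""
    if isinstance(expr, int):
        return expr  # a literal number means itself
    op, left, right = expr
    if op == "+":
        # "+" means exactly this and nothing else: the sum of both sides
        return evaluate(left) + evaluate(right)
    raise ValueError(f"undefined construct: {op!r}")

print(evaluate(("+", 2, ("+", 3, 4))))  # prints 9
```

Anyone reading the function knows the complete meaning of every expression; there is no comparable function we could write for a word like "fish".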
Due to this fuzziness human languages inevitably change over time no matter how hard we try to counter this, any text written a few thousand years ago is nowadays very hard to understand -- not because the old languages aren't spoken anymore, but because the original meanings of specific words, phrases and constructs have been distorted by time; when learning an old language we learn what each word meant by reading its translation to some modern word, but the modern word is always more or less different. Even if it's a very simple word such as "fish", our modern word for fish means a slightly different thing than let's say an ancient Roman's word for fish because it had slightly different connotations such as potential references to other things: fish for example used to be the symbol of Christianity, nowadays people don't even commonly make this connection. Fishermen were a despised class of workers, to some fish may have signified food and abundance, to others something that "smells bad", to others something or someone who's "slippery". Some words may have referred to some contemporary "meme" that's been long forgotten and if some text makes the reference, we won't understand it. While the Spanish word "perro" translates to English as "dog", the meanings aren't the same; English speaking gangsters use the word as a synonym for "friend" but in Spanish the word can be used as an insult, so shouting "perro" and "dog" in the street may lead to different images popping up in the heads of those who hear it. How do you describe a word precisely if you can only describe it with other imprecise words that are changing constantly? No, not even pictures will help -- if you attach the picture of a cat to the word "cat", it's still not clear what it means -- does it stand for the picture of the cat or for the cat that's in the picture, does it stand ONLY for the one cat that's in the picture or for all other animals that are similar to the one in the picture? How similar? Is a lion a cat? Is a toy cat or cartoon cat a cat? Or does the picture signify that anything with fur is a cat? If it looks like a cat but walks on two legs and speaks, is it still a cat? Now imagine describing a more abstract term such as thought, number or existence. There is no solid ground, even such essential words as "to want" or "to be" have different meanings between languages ("to be" can stand for "to exist", "to be in a place", "to temporarily have some property", "to permanently have some property" etc.). Even dictionaries admit defeat and are happy with having circular definitions because there aren't any foundations to build upon, circular definitions are inevitable, dictionaries just help you connect fuzzy concepts together. All of this extends to tenses, moods, cases and everything else. This can be seen very well e.g. with people interpreting old texts such as the Bible, for example some say Jesus claimed to be the son of God while others reject it, saying that even if he stated the sentence, it actually wasn't meant literally as it was a commonly used phrase that meant something else -- these people will argue about everything and they can comfortably interpret the same text in completely opposite ways. The point is that we just can't know.
In addition there are ALWAYS a great many hidden implicit assumptions that both communicating sides have to share to be able to communicate (and these can only be assured by many years of learning, spent in the same environment) -- for example if I tell someone "Drive to the city and buy food.", in fact I mean something like "Right now walk with your feet to our car, open the door, sit in, take the wheel in your hands, start the car, drive only on the road with your eyes open, ..."; the guy can technically satisfy my order by waiting 10 years, then driving a truck through forests with eyes closed over the whole globe and back. Just as it's impossible to perfectly define all words, it is impossible to explicitly recount all assumptions. Though the mentioned example is exaggerated, it shows an ever present phenomenon we have to deal with, a phenomenon which can cause misunderstanding or be easily abused.
This is the grand issue that common people almost universally overlook, most will naively think that with careful effort it is possible to express oneself so clearly that others simply won't be able to misunderstand -- this is sadly false, even with the most carefully crafted sentences language always allows any word to be twisted extremely easily, e.g. by politicians, into anything they want, it destroys old knowledge and prevents us from communicating with clarity and recording ideas so that they would last into the future. This damnation of language plagues every book, authors constantly complain "I should have rather used this and that word" but that wouldn't even help, it's impossible to say something so as to not be misunderstood because human language is a weak, crippled tool just based on shouting weird sounds in hopes someone will get a vague idea of what's going on in your head. Due to this limitation of language it is absolutely worthless to discuss anything if after 5 minutes you don't come to an agreement, the discussion will lead nowhere, it's best to just leave it at communication being impossible because even if linguistically you speak the same language, you cannot communicate correct meanings, even words like "is", "when", "bad" or "will" will have absolutely different meanings, you would have to define every word of every sentence and then every word of every new sentence you produce for 1000 years until you come to circular definitions, when you'll still be disagreeing but won't even be able to waste time further.
This issue is very hard to solve, maybe impossible. It seems that due to the extreme complexity of real life our language can't operate with precise equations but rather has to settle for concepts that are just fuzzy blobs that our brains -- neural networks in our heads -- learn by trial and error over many years. We learn that if we hear the word X, it's best to react by feeling fear or turning our head or closing our eyes etc.
{ The only idea of a solution on how to make a "mathematically precise" human language for real world communication is the following. Firstly make a mathematical model of some artificial world that's similar to ours, for simplicity we can now just consider something like a 2D grid with differently colored cells, i.e. something like a cellular automaton. The world changes in steps and each cell can "talk", i.e. at any frame it can emit a text string. Now make a language that's precisely defined in this world; if the world is simple, it's pretty doable e.g. like this: write a function in some programming language that takes the world and checks if what the cells are saying classifies as your language used in a correct way within this world (so the function just returns true/false, nothing else is needed). Now this single function mathematically defines your language -- by looking at your function's source code anyone can derive the absolutely correct meaning of any word or sentence because he can see how the function checks whether that word or phrase is used correctly, he will know exactly which situations fit a given sentence and which don't. Now the final step is only to find a correspondence between real life and your simplified mathematical world, e.g. that cells represent humans and so on (but this will have shortcomings, e.g. our simple world will make it difficult or impossible to talk about body parts since cells have none; also making the connection between the mathematical world and the real world relies on intuition). ~drummyfish }
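The checking function described above could be sketched roughly as follows. Everything concrete here is an assumption made up for the illustration: the world is a single frame (no time steps), a cell is a (color, utterance) pair, and the language has just two sentence forms, "me COLOR" (the speaking cell has that color) and "near COLOR" (some directly adjacent cell has that color). The names neighbors, sentence_holds and language_used_correctly are hypothetical.

```python
# Hypothetical toy world: a grid (list of rows) of (color, utterance) pairs.
# utterance is the string a cell "says" in this frame, or None for silence.

COLORS = {"red", "green", "blue"}

def neighbors(world, row, col):
    """Yield the colors of the up/down/left/right neighbors of a cell."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < len(world) and 0 <= c < len(world[r]):
            yield world[r][c][0]

def sentence_holds(world, row, col, sentence):
    """Check one sentence said by the cell at (row, col)."""
    words = sentence.split()
    if len(words) == 2 and words[1] in COLORS:
        if words[0] == "me":    # "me red": the speaker itself is red
            return world[row][col][0] == words[1]
        if words[0] == "near":  # "near red": some adjacent cell is red
            return words[1] in neighbors(world, row, col)
    return False  # anything else is not a sentence of this language

def language_used_correctly(world):
    """The single true/false function that defines the toy language."""
    for row, cells in enumerate(world):
        for col, (_, utterance) in enumerate(cells):
            if utterance is not None and not sentence_holds(world, row, col, utterance):
                return False
    return True

world = [
    [("red", "me red"), ("blue", "near red")],
    [("green", None),   ("blue", "near green")],
]
print(language_used_correctly(world))  # prints True
```

Within this tiny world the meaning of "near" is whatever sentence_holds says it is, e.g. diagonal cells do not count; the hard, intuition-dependent part remains mapping such a world onto real life.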
{ Yet another, maybe more practical idea would be to create a set of very few core words -- let's say 100 -- which we would try to define extremely precisely by all the current imperfect means but with greatly elevated effort, i.e. each word would have a detailed description, translations to 20 other natural languages, positive and negative examples, pictures attached etc. Then the rest of the language would be defined only using these core words. But maybe it wouldn't work -- the language would possibly be a bit more stable but would eventually degenerate as well. ~drummyfish }
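The requirement that every non-core word be defined using core words can at least be checked mechanically. The sketch below is one possible reading, with one loudly stated assumption: a definition may also use other words that are themselves already traceable to core words (a slightly looser rule than "only core words"). The function name ungrounded_words and the sample vocabulary are made up for the example.

```python
def ungrounded_words(core, definitions):
    """Return defined words whose meaning can't be traced back to core words.

    core:        set of core words (assumed to be defined by other means)
    definitions: dict mapping each non-core word to the list of words
                 used in its definition
    """
    grounded = set(core)
    changed = True
    while changed:  # repeat until no new word becomes grounded
        changed = False
        for word, definition in definitions.items():
            if word not in grounded and all(w in grounded for w in definition):
                grounded.add(word)
                changed = True
    return set(definitions) - grounded

core = {"animal", "big", "live", "in", "water"}
definitions = {
    "fish": ["animal", "live", "in", "water"],
    "shark": ["big", "fish"],
    "predator": ["hunter"],   # circular pair: never reaches the core
    "hunter": ["predator"],
}
print(sorted(ungrounded_words(core, definitions)))  # prints ['hunter', 'predator']
```

Such a check only guarantees that definitions bottom out in the core set; it says nothing about whether the core words themselves stay stable, which is the weak spot admitted above.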