Update

2025-05-03 16:21:08 +02:00
commit 783a41a7cf
parent 46a27e1930
18 changed files with 2157 additions and 2060 deletions
--- a/netstalking.md
+++ b/netstalking.md
@@ -1,6 +1,8 @@
 # Netstalking

-Netstalking means searching for obscure, hard-to-find and somehow valuable (even if only by its entertaining nature) [information](information.md)/media buried in the depths of the [Internet](internet.md) (and similar networks), for example searching for funny photos on Google Streetview (https://9-eyes.com/), unindexed [deepweb](deepweb.md) sites or secret documents on [FTP](ftp.md) servers. Netstalking is relatively unknown in the [English](english.md)-speaking world but is pretty popular in Russian communities, although since the beginning of 2020s the general interest in obscure and esoteric material on the Internet seems to have been steadily rising among all inhabitants of the world wide network, perhaps due to other phenomena such as increasing [censorship](censorship.md) (and the desire to bypass it), the "web 1.0 revival" movement etc.
+*Not to be confused with "[stalking](stalking.md)".*
+
+Netstalking (reference to the [game](game.md) S.T.A.L.K.E.R.) means searching for obscure, hard-to-find and somehow valuable (even if only by its entertaining nature) [information](information.md)/media buried in the depths of the [Internet](internet.md) (and similar networks), for example searching for funny photos on Google Streetview (https://9-eyes.com/), unindexed [deepweb](deepweb.md) sites or secret documents on [FTP](ftp.md) servers. The activity is distinct from [cracking](cracking.md) (breaking into protected systems), it only involves searching and observing. Netstalking is relatively unknown in the [English](english.md)-speaking world but is pretty popular in Russian communities, although since the beginning of 2020s the general interest in obscure and esoteric material on the Internet seems to have been steadily rising among all inhabitants of the world wide network, perhaps due to other phenomena such as increasing [censorship](censorship.md) (and the desire to bypass it), the "web 1.0 revival" movement etc.

 Netstalking can be divided into two categories:

@@ -16,12 +18,12 @@ Techniques of netstalking include port scanning, randomly generating web domains
  - `"exact phrase"`: Searches only for a verbatim string, very useful e.g. for searching exact filenames and exploiting tricks such as for example searching a long phrase from a publicly inaccessible book to find websites that in fact have such books publicly accessible. Another trick is to search for something like `"powered by gitea"` (or whatever framework) or `"index of"` (common heading of plain file lists) -- this can find small and unadvertised sites running on popular [frameworks](framework.md).
  - `before:year`: Limits the search to sites/files published before given year. This is amazingly useful as nowadays everything is just flooded by [AI](ai.md) garbage and commercial, censored [noise](noise.md). Adding `before:2010` just takes you back to the old world where Internet actually contained useful information, where schools for instance weren't afraid to list names of all pupils in each class along with photos, names of their teachers and so on.
  - `filetype:type`: Searches only for files of given type. Again, this is very abusable -- you may for example search for Excel spreadsheets (`filetype:xls`), [JSON](json.md) or [CSV](csv.md) databases and so on -- there are tons and tons of sheets with personal information of company employees, taxes and various other sensitive stuff. Searching for MS Word or PowerPoint documents finds files created by people who aren't very skilled with computers and will very likely post some crazy [shit](shit.md) :-) If you're feeling lucky, try to search databases of passwords in plain text.
- **Search non-web networks.** Web is very much controlled and polices now, but other networks are either designed to be uncontrollable and/or are so underground that no one cares to "[moderate](moderation.md)" it. These networks include for example [Tor](tor.md), [I2P](i2p.md), [Freenet](freenet.md), [gopher](gopher.md), [gemini](gemini.md), [WAP](wap.md), [FTP](ftp.md), [Usenet](usenet.md), Guifi (and other wifi networks), [torrents](torrent.md), etc. Also try to search [IRC](irc.md) chat logs and whatever.
+- **Search non-web networks.** Web is very much controlled and policed now, but other networks are either designed to be uncontrollable and/or are so underground that no one cares to "[moderate](moderation.md)" it. These networks include for example [Tor](tor.md), [I2P](i2p.md), [Freenet](freenet.md), [gopher](gopher.md), [gemini](gemini.md), [WAP](wap.md), [FTP](ftp.md), [Usenet](usenet.md), Guifi (and other wifi networks), [torrents](torrent.md), etc. Also try to search [IRC](irc.md) chat logs and whatever.
 - **Search ban lists ("blacklists", "blocklists", "isolation lists", ...).** A trick to finding censored material is to look for a list of the censored stuff -- [FOSS](foss.md) projects (like [Fediverse](fediverse.md)) typically have such lists publicly available as part of their "openness and collaboration".
 - **Look for OSINT tools.** OSINT means "open source intelligence", basically digging out info from publicly available sources. This leads to finding amazing tools, for example there exists an AI-powered face search engine that takes a photo of a face and returns images from all over the Internet where that face appears. Works like a charm.
 - **Reverse search for obscure/shady/topic related material.** Another cool trick to finding weird sites, or ones related to a very specific topic, is to look for sites that link to already known weird/banned/obscure/topic related stuff. For example searching for sites that link to [Encyclopedia Dramatica](dramatica.md) brings up a promising list of places to check out when looking for uncensored, [SJW](sjw.md)-free places. Similarly you can search for sites that use forbidden words ([nigger](nigger.md), [faggot](faggot.md), ...), images (goatse, gore, FACES of CP stars, ...), very niche terms (e.g. [bitreich](bitreich.md)), "legally problematic" stuff (leaked photos, shooter manifestos, ...) etc.
 - **Search in other [languages](human_language.md).** If you're not a native English speaker, you probably know that your country's web contains some cool stuff that's missing from the English web. Due to many factors such as [cultural](culture.md) differences and different political interests (i.e. kinds of censorship and propaganda) some tidbit of trivia will only be found on non-English sites -- Russian, Spanish, Chinese and Japanese websites are a whole new world. Machine translate of the sites is often more than enough to understand the text.
- **Search archives.** The Internet Archive is the giant among archives that must always be checked, but don't forget smaller ones either, like archive.li, [Usenet](usenet.md) archives, [4chan](4chan.md) archives etc. You'll be able to find stuff that's now gone from the Internet and/or got hidden.
+- **Search archives, file hosting servers etc.** The Internet Archive is the giant among archives that must always be checked, but don't forget smaller ones either, like archive.li, [Usenet](usenet.md) archives, [4chan](4chan.md) archives, various file pastebins etc. You may be able to find stuff that's now gone from the Internet and/or got hidden.
 - **Guess randomly.** It can even be an entertaining pastime to play a lottery, randomly digging and seeing what you find. For example you can type random domains or IP addresses in your URL bar: `nigger.com`, `hitler.il`, `weirdporn.xyz` or whatever. One can even quite effortlessly bash together a script to automatically check millions of such domains. This has a chance of discovering something that would be otherwise unfindable because it's not linked to from anywhere on the indexed web.
 - **Manually search unindexable material**. A lot of information is out there but search engines don't know about it because it's not in plaintext format or it's hiding behind a login or captcha wall or whatever. Plenty of stuff is hidden in scanned PDF books, videos, compressed archives, spoken audio etc. Hence when you're searching manually, try to go to places where search engines are less likely to get.
 - **Write own tools.** Today you no longer have to possess a [PhD](phd.md) (or even brain) to write a simple web scraping script. Custom tools can take you beyond what search engines can (and are willing to) do for you -- for example search engines typically can't search for [regular expressions](regexp.md), but your own crawler can. Your own tool is 100% tailored to your needs, it can behave in exact ways you want (ignore robots.txt, use your credentials to bypass login walls, follow very specific trails, you can even use [OCR](ocr.md) to extract text from images etc.). Like said above, a simple tool is for example one that randomly checks various combinations of words and TLDs to discover curious domain names. Writing a simple crawler is also pretty easy, provided you [keep it very simple](kiss.md) -- exploit existing tools like wget or curl to download pages and extract everything that looks like URL, no need to parse [HTML](html.md) or whatever, literally treat everything as plain text. Then you can extract only documents that are somehow "[interesting](interesting.md)", for example containing specific keywords, not containing JavaScript tags etc.
@@ -32,7 +34,9 @@ Techniques of netstalking include port scanning, randomly generating web domains

 ## See Also

+- [fun](fun.md)
 - [www](www.md)
 - [Internet](internet.md)
 - [smol internet](smol_internet.md)
+- [article on neolurk](https://ru.wikipedia.org/wiki/%D0%9D%D0%B5%D1%82%D1%81%D1%82%D0%B0%D0%BB%D0%BA%D0%B8%D0%BD%D0%B3)