linkhut

6 days ago

A Secret Web

https://blog.clew.se/posts/secret-web/

All that to say, a search engine cannot be your sole source of information and discovery. Its strength is in helping you find specific things when you need them, but for a well-rounded information gathering experience, we all need to put more faith into other discovery methods, especially for the independent, secret web.

by durian 5 days ago saved 8 times

08 Sep 25

The Search Engine Map

https://www.searchenginemap.com/

Here you will find all search engines which offer English-language results that the world has to offer, what type of search engine they are and where they get their organic results from.

by cos 9 months ago saved 5 times

www.marginalia.nu | marginalia.nu

https://www.marginalia.nu/

by cos 9 months ago

Marginalia Similar Website Finder

https://explore2.marginalia.nu/

In plain English, this service looks at which websites link to a particular target website, and then it ranks websites that are popular among those linking websites using a method commonly used in recommendation algorithms.

In technical jargon, it reinterprets the incident edges in the adjacency matrix as sparse high-dimensional vectors, and uses cosine similarity to find the nearest-neighbor nodes within this feature space.
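
As a rough illustration of the idea described above (not Marginalia's actual implementation), each website can be represented by the set of sites that link to it, which is a sparse binary vector, and candidates can be ranked by cosine similarity between those sets. The toy graph `LINKS`, the `.example` domains, and the helper names `incoming`, `cosine`, and `similar_sites` are all hypothetical, made up for this sketch.

```python
from math import sqrt

# Hypothetical toy link graph: source site -> set of sites it links to.
LINKS = {
    "a.example": {"x.example", "y.example"},
    "b.example": {"x.example", "y.example", "z.example"},
    "c.example": {"z.example"},
    "d.example": {"x.example", "z.example"},
}


def incoming(links):
    """Invert the graph: target site -> set of sites that link to it."""
    inbound = {}
    for source, targets in links.items():
        for target in targets:
            inbound.setdefault(target, set()).add(source)
    return inbound


def cosine(a, b):
    """Cosine similarity of two binary vectors stored as sets of non-zero indices."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def similar_sites(target, links, top_n=5):
    """Rank other sites by how much their inbound-link sets overlap with the target's."""
    inbound = incoming(links)
    if target not in inbound:
        return []
    scores = [
        (site, cosine(inbound[target], linkers))
        for site, linkers in inbound.items()
        if site != target
    ]
    # Drop sites with no shared linkers; sort by score, then name for stable output.
    scores = [(site, score) for site, score in scores if score > 0]
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


if __name__ == "__main__":
    for site, score in similar_sites("x.example", LINKS):
        print(f"{site}\t{score:.3f}")
```

Storing each vector as a set of linking sites keeps the representation sparse: only the non-zero entries of the adjacency matrix column are ever touched, which is presumably what makes this approach workable on a web-scale link graph.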

by cos 9 months ago

Creepy Website Similarity | marginalia.nu

https://www.marginalia.nu/log/69-creepy-website-similarity/

This is a write-up about an experiment from a few months ago, in how to find websites that are similar to each other. Website similarity is useful for many things, including discovering new websites to crawl, as well as suggesting similar websites in the Marginalia Search random exploration mode.

by cos 9 months ago

Clew

https://clew.se/

Yet another independent search engine for the Small Web.

by cos 9 months ago saved 5 times

18 Aug 25

Stract

https://stract.com/

A non-profit, open-source search engine supported by the EU.

by hoagecko 10 months ago saved 3 times

Brave Search

https://search.brave.com/

A search engine from the developer of Brave Browser. It is the only search engine from an American company outside GAFAM that supports Japanese.

by hoagecko 10 months ago

17 Aug 25

版元ドットコム (Hanmoto.com)

https://www.hanmoto.com/

A database for books, notable for letting you check whether a book's cover image may be used.

by hoagecko 10 months ago

10 Aug 25

Lemmy Explorer

https://lemmyverse.net/?order=active

A search engine for Lemmy instances and Lemmy communities.

by hoagecko 10 months ago

08 Aug 25

A roundup of generative AI User-Agents

https://peketter-tech.net/develop/user-agent-of-generative-ai-others/

I made use of this! Thank you!

by hoagecko 10 months ago

07 Aug 25

Yomiuri Shimbun sues US AI startup Perplexity for copyright infringement in its search service - Nihon Keizai Shimbun

https://www.nikkei.com/article/DGXZQOUE10AM60Q4A211C2000000/

https://www.yomiuri.co.jp/national/20250807-OYT1T50151/ The Yomiuri Shimbun's own article on this case.

by hoagecko 10 months ago

Yomiuri Shimbun sues generative AI company over "unauthorized use of articles," a first for a major Japanese news organization

https://www.yomiuri.co.jp/national/20250807-OYT1T50151/

Of course the Yomiuri Shimbun, which took a Google News knockoff to court in 2005 [平成17(ネ)10049], was never going to tolerate an AI company that doesn't even obey robots.txt.

by hoagecko 10 months ago

Google Book Search: settlement in US lawsuit also affects Japanese copyright holders

https://internet.watch.impress.co.jp/cda/news/2009/02/25/22572.html

No wonder it caused a row when it came out.

by hoagecko 10 months ago

06 Aug 25

Search strings for Mastodon v4.2 and Fedibird (quick reference) - noellabo's tech blog

https://blog.noellabo.jp/entry/fedibird-advanced-search

A quick-reference table of Mastodon search syntax by the Fedibird administrator. Naturally, it also explains Fedibird's own features in detail.

by hoagecko 10 months ago

05 Aug 25

Perplexity is "stealth crawling" sites that have blocked it, Cloudflare alleges

https://www.itmedia.co.jp/news/articles/2508/05/news055.html

by hoagecko 10 months ago

04 Aug 25

Cloudflare accuses AI search engine Perplexity of rule-breaking stealth scraping: cleverly evading website operators' blocks | XenoSpectrum

https://xenospectrum.com/cloudflare-accuses-ai-search-engine-perplexity-of-rule-breaking-stealth-scraping/

“According to TollBit's research, the number of scrapes per visit to a site is 369 for Perplexity and as many as 8,692 for Anthropic.”

by hoagecko 10 months ago

Perplexity Pro | Smartphones and Mobile Phones | SoftBank

https://www.softbank.jp/mobile/service/perplexity-ai/

Building a service that disregards robots.txt into its business: just what you'd expect from the company that used to publish ネトラン.

by hoagecko 10 months ago

Perplexity is using stealth, undeclared crawlers to evade website no-crawl directives

https://blog.cloudflare.com/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no-crawl-directives/

I can't stand Cloudflare, but I'm really rooting for it in this fight for robots.txt against the locust-like AI crawlers.

by hoagecko 10 months ago

Russia passes law imposing fines for searching for sites critical of Putin

https://www.afpbb.com/articles/-/3591502

by hoagecko 10 months ago
