Common Crawl

Nonprofit organization and eponym of a large, periodic, open web crawl

Organization dot_com_company Q12055316

Summary

Common Crawl is recorded as a dot-com company^[1]. It draws 2 Wikipedia views per month, ranking #94 of 220 in the dot_com_company category.^[2]

Key Facts

Common Crawl's field of work is web crawling^[3].
A notable work attributed to Common Crawl is CCBot^[4].
Common Crawl is in the country of United States^[5].
Common Crawl's instance of is recorded as dot-com company^[6].
Common Crawl's instance of is recorded as nonprofit organization^[7].
Common Crawl's founder is recorded as Gil Elbaz^[8].
Common Crawl's logo image is recorded as Common Crawl logo.svg^[9].
Common Crawl's language of work or name is recorded as English^[10].
Common Crawl's industry is recorded as publishing^[11].
Common Crawl's industry is recorded as data collection^[12].
Common Crawl was founded in 2008^[13].
Common Crawl's Freebase ID is recorded as /m/0rpgbk1^[14].
Common Crawl's official website is recorded as https://commoncrawl.org/^[15].
Common Crawl's IRS Employer Identification Number is recorded as 26-1635908^[16].
Common Crawl's official blog URL is recorded as https://commoncrawl.org/connect/blog/^[17].
Common Crawl's X is recorded as commoncrawl^[18].
Common Crawl's GitHub account is recorded as commoncrawl^[19].
Common Crawl's Crunchbase organization ID is recorded as common-crawl^[20].
Common Crawl's total revenue is recorded as 300000 (unit Q4917)^[21].
Common Crawl's total revenue is recorded as 242567 (unit Q4917)^[22].
Common Crawl's total revenue is recorded as 370769 (unit Q4917)^[23].
Common Crawl's total revenue is recorded as 391366 (unit Q4917)^[24].
Common Crawl's total revenue is recorded as 250396 (unit Q4917)^[25].
Common Crawl's total revenue is recorded as 160010 (unit Q4917)^[26].
Common Crawl's total revenue is recorded as 75014 (unit Q4917)^[27].

Body

Founding

Common Crawl was founded in 2008^[13]. Its founder is recorded as Gil Elbaz^[8].

Industry

Industries include publishing^[11] and data collection^[12]. Common Crawl's field of work is web crawling^[3].

Why It Matters

Common Crawl draws 2 Wikipedia views per month, ranking #94 of 220 in the dot_com_company category.^[2] It has Wikipedia articles in 11 language editions^[28] and is known by 4 alternative names across languages and contexts.^[29]

⭐ Popularity Graph

1/100 established

🌐 Wiki Languages

12 / 423 langs

🔗 Readers Also Explored · clickstream

web crawler

Wayback Machine

open-source artificial intelligence

Quick Facts

Instance of dot-com company, nonprofit organization

Field of work web crawling

Notable work CCBot

Inception 2008

Founder Gil Elbaz

Country United States

Language of work or name English

Official website commoncrawl.org

Industry publishing, data collection

Properties

FAQ URL commoncrawl.org

Industry publishing, data collection

Notable work CCBot

Official blog URL commoncrawl.org

Official jobs URL commoncrawl.org

On focus list of Wikimedia project Q117245199

Revenue 1.3M (75k–1.3M, n=10)

Social media followers 3638, 3565

Total assets 1.3M (173k–1.3M, n=10)

Total assets (unit Q4917) 548181, 475306, 478733, 452665, 397645, 226978, 173454, 352558, 633865, 1331529

Total revenue (unit Q4917) 300000, 242567, 370769, 391366, 250396, 160010, 75014, 330010, 451447, 1297813
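
The range summaries shown for Revenue (75k–1.3M, n=10) and Total assets (173k–1.3M, n=10) can be reproduced from the ten amounts listed in each row. The following is a minimal Python sketch under stated assumptions: the names parse_amount, compact and summarize are hypothetical, and the rounding rule (whole thousands, one decimal for millions) is inferred from the displayed figures, not taken from the site's own code.

```python
from typing import Iterable

# Amounts copied from the "Total revenue" and "Total assets" rows.
# A leading "+" (as in the raw serialized values) is optional.
TOTAL_REVENUE = ["+300000", "+242567", "+370769", "+391366", "+250396",
                 "+160010", "+75014", "+330010", "+451447", "+1297813"]
TOTAL_ASSETS = ["+548181", "+475306", "+478733", "+452665", "+397645",
                "+226978", "+173454", "+352558", "+633865", "+1331529"]


def parse_amount(raw: str) -> int:
    """Convert a signed amount string such as '+300000' to an int."""
    return int(raw.strip())


def compact(value: int) -> str:
    """Format a value compactly (assumed rule: 75014 -> '75k', 1297813 -> '1.3M')."""
    if value >= 1_000_000:
        return f"{value / 1_000_000:.1f}M"
    if value >= 1_000:
        return f"{value / 1_000:.0f}k"
    return str(value)


def summarize(raw_amounts: Iterable[str]) -> str:
    """Return 'min–max, n=count' for a list of amount strings."""
    values = [parse_amount(raw) for raw in raw_amounts]
    return f"{compact(min(values))}–{compact(max(values))}, n={len(values)}"


if __name__ == "__main__":
    print("Revenue:", summarize(TOTAL_REVENUE))
    print("Total assets:", summarize(TOTAL_ASSETS))
```

In both rows the headline figure of 1.3M coincides with the largest listed amount. Whether the page picks the maximum or the most recent value is not stated, so the sketch reports only the range and the count.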

External References (7)

Crunchbase organization ID common-crawl

Freebase ID /m/0rpgbk1

GitHub account commoncrawl

IRS Employer Identification Number 26-1635908

LinkedIn company or organization ID common-crawl

Quora topic ID Common-Crawl

X (Twitter) username commoncrawl

🌐 Available in 11 languages

en: Common Crawl Foundation · cs: Common Crawl · de: Common Crawl · es: Common Crawl · fa: کامن کرال · fr: Common Crawl · ja: コモン・クロール · ru: Common Crawl · sv: Common Crawl · th: คอมมอนครอวล์ · tr: Common Crawl

🏷️ Also known as

en: CommonCrawl · en: Common Crawl Foundation · en: commoncrawl.org · es: CommonCrawl

🔗 Connections

Industry 2

publishing, data collection

Instance of 2

nonprofit organization, dot-com company

Country 1

United States

Language of work or name 1

English

← Category contains 1

Category:Web archiving initiatives

References

Programmatic citations — every numbered marker resolves to a verifiable graph row below.

📑 Cite this page

Use these citations when quoting this entity in research, articles, AI prompts, or wherever provenance matters. We aggregate Wikidata + Wikipedia + authoritative open-data sources; the stitched, scored, cross-referenced view is what 4ort.xyz contributes.

APA 4ort.xyz Knowledge Graph. (2026). Common Crawl. Retrieved April 10, 2026, from https://4ort.xyz/entity/common-crawl

MLA “Common Crawl.” 4ort.xyz Knowledge Graph, 4ort.xyz, 10 Apr. 2026, https://4ort.xyz/entity/common-crawl.

BibTeX

@misc{4ortxyz_common-crawl_2026,
  author = {{4ort.xyz Knowledge Graph}},
  title = {{Common Crawl}},
  year = {2026},
  url = {https://4ort.xyz/entity/common-crawl},
  note = {Accessed: 2026-04-10}
}

LLM prompt

According to 4ort.xyz Knowledge Graph (aggregator of Wikidata, Wikipedia, and authoritative open-data sources): Common Crawl — https://4ort.xyz/entity/common-crawl (retrieved 2026-04-10)

Canonical URL: https://4ort.xyz/entity/common-crawl · Last refreshed: April 10, 2026
