User:ReyBrujo/Dumps/20070228

Dumps

February 28, 2007


Articles with more than 5 external links as of February 28, 2007. Only articles in the main space are considered.

External links  Article ID  Article
72 528 Argentina
58 530 Venezuela
43 527 Uruguay
29 471 Augustine of Hippo
21 472 René Descartes
13 25 Dagupan City
13 480 Urdaneta City
12 477 Alaminos City
11 418 Agno, Pangasinan
11 53 Luyag na Pangasinan
11 484 San Fabian, Pangasinan
11 483 Mangaldan, Pangasinan
10 485 Bayambang, Pangasinan
10 419 Lingayen, Pangasinan
10 479 San Carlos City, Pangasinan
10 482 Malasiqui, Pangasinan
9 481 Aguilar, Pangasinan
9 2056 Bolinao, Pangasinan
9 2055 Bugallon, Pangasinan
9 72 Salitan Pangasinan
9 28 Diaryo
8 2054 Burgos, Pangasinan
8 526 Dayat Atlantic
7 33 Filipinas
7 457 Metro Manila
6 473 Akbar
6 2028 Villasis, Pangasinan
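-- Count external links per main-namespace article, skipping redirects.
SELECT COUNT(el_from) AS total, el_from, page_title
FROM externallinks
JOIN page ON externallinks.el_from = page_id
WHERE page_is_redirect = 0 AND page_namespace = 0
GROUP BY el_from, page_title
HAVING total > 5
ORDER BY total DESC;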

Sites linked more than 5 times as of February 28, 2007. Only articles in the main space are considered.

Link count Site
52 http://www.nscb.gov.ph
49 http://www.t-macs.com
19 http://kvaleberg.com
17 http://www.pangasinan.gov.ph
16 http://www.dalityapi.com
16 http://punch.dagupan.com
16 http://www.sunstar.com.ph
16 http://pangasinanstar.prepys.com
14 http://incubator.wikimedia.org
13 http://www.pasyalan.net
12 http://www.kingfisher.edu.ph
7 http://plato.stanford.edu
7 https://www.cia.gov
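-- Count main-namespace external links per site (the part of the URL before the third '/').
SELECT COUNT(el_to) AS total, SUBSTRING_INDEX(el_to, '/', 3) AS search
FROM externallinks
JOIN page ON page_id = el_from
WHERE page_namespace = 0
GROUP BY search
HAVING total > 5
ORDER BY total DESC;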
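
The grouping key in the query above is the scheme plus host of each URL, which SUBSTRING_INDEX(el_to, '/', 3) obtains by cutting the string before its third slash. The following is a minimal Python sketch of the same idea, not the code used to produce this report. The function names, the `minimum` parameter, and the example.org URLs are hypothetical and chosen only for illustration.

```python
from collections import Counter


def site_of(url: str) -> str:
    """Keep everything before the third '/', like SUBSTRING_INDEX(url, '/', 3).

    If the URL has fewer than three slashes, the whole string is returned,
    which matches the MySQL function's behaviour.
    """
    return "/".join(url.split("/")[:3])


def count_sites(urls, minimum=6):
    """Return (site, link count) pairs with at least `minimum` links, largest first."""
    counts = Counter(site_of(u) for u in urls)
    return [(site, n) for site, n in counts.most_common() if n >= minimum]


if __name__ == "__main__":
    # Hypothetical sample links; the real input would be the el_to column.
    sample = [
        "http://example.org/a",
        "http://example.org/b/c",
        "https://example.com/",
    ]
    for site, n in count_sites(sample, minimum=2):
        print(n, site)
```

With `minimum=6` the function keeps only sites linked more than 5 times, which is the cutoff used for the table above.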

Additional information


Some more information about this dump:

  • 382 articles that are in the main space and not redirects
  • 386 articles and redirects in the main space
  • 598 pages in all namespaces
  • 5 redirects in all namespaces
  • 573 external links in every namespace
  • 531 external links in the main space

Very probable spambot pages


If index.php appears in a page title, the page, typically an article talk page, was very likely created by a spambot. Such pages should be deleted and, if possible, protected.

Article ID Article
2017 W/index.php

Possible spambot pages


Pages whose titles end with / and that may have been created by spambots.

Article ID Article
-- Find page titles that contain URL fragments or end with a slash.
SELECT page_id, page_title, page_namespace
FROM page
WHERE page_title LIKE '%index.php%'
   OR page_title LIKE '%/wiki/%' OR page_title LIKE '%/w/%'
   OR page_title LIKE '%/';
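
The four LIKE patterns above amount to three substring checks and one suffix check. As a rough sketch, the same test can be written in Python for a list of titles exported from the page table. The names `SUSPICIOUS_FRAGMENTS` and `looks_like_spambot_title` are hypothetical, and the comparison is assumed to be case-sensitive.

```python
# URL fragments that should not normally appear inside a page title.
SUSPICIOUS_FRAGMENTS = ("index.php", "/wiki/", "/w/")


def looks_like_spambot_title(title: str) -> bool:
    """Return True if the title contains a suspicious fragment or ends with '/'."""
    return title.endswith("/") or any(
        fragment in title for fragment in SUSPICIOUS_FRAGMENTS
    )


if __name__ == "__main__":
    # "W/index.php" is the page listed above; the other title is a normal article.
    for title in ("W/index.php", "Argentina"):
        print(title, looks_like_spambot_title(title))
```

A match only marks a title as a candidate for review; each hit still needs to be checked by hand before deletion.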