{% extends "bucket/templates/base.html" %} {% block content %}

Mass FTP Crawling

By dsc - August 2015

The combination of the interesting files one can find on public FTP servers and the technical expertise required to build a decent search engine motivated me to write Findex and, ultimately, this article.


This is an old issue

This article is by no means presented as 'new'. However, given that I was still able to index an enormous number of private files, I'd say the issue deserves some attention.


Scanning

Disclaimer

Warning:
  • Although not illegal, mass port scanning is a great way to get kicked off your ISP. I would not recommend doing it from your home connection.
  • Indexing/crawling many FTP servers might also not be to the liking of your ISP.
  • Traversing public FTP servers is not illegal; however, it is the reader's responsibility to obey all applicable local, state and federal laws.
  • I indexed the files; I did not download them.
  • This article is presented as research. Nothing more.


I decided to concentrate on public FTP servers located in my own country, the Netherlands.

I took a list of IP blocks belonging to Dutch internet providers and started my scan. Because Findex can distribute scans and crawls over multiple machines, it took only half a day.
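To give an idea of what the discovery step involves, here is a minimal sketch; it is not Findex's actual code. It expands a CIDR block into host addresses and checks whether a TCP port accepts a connection. The names `hosts_in_block` and `port_open` are made up for this example, and a scan of this size would realistically need a dedicated, rate-limited scanner rather than sequential connects.

```python
import ipaddress
import socket


def hosts_in_block(cidr):
    """Yield every usable host address in a CIDR block as a string."""
    for address in ipaddress.ip_network(cidr, strict=False).hosts():
        yield str(address)


def port_open(host, port=21, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

A caller would loop over `hosts_in_block("192.0.2.0/24")` (a reserved documentation range, used here only as a placeholder) and keep the addresses for which `port_open(host)` returns True.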

The scan discovered 257.807 FTP servers, of which 7578 required no form of authentication. After filtering out the servers that did not contain any files, 2359 public FTP servers remained. On those I was able to discover 18.088.392 files, a little over 18 million.

I had now indexed every single file stored on a public FTP server located in the Netherlands.
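Checking whether a server requires no authentication, and whether it holds any files at all, can be approximated with Python's standard `ftplib`. The sketch below is an illustration under my own assumptions, not how Findex does it: the function name `anonymous_listing` is hypothetical, and it only looks at the root directory, so a server with files in subdirectories only would need a recursive walk.

```python
import ftplib


def anonymous_listing(host, port=21, timeout=5.0):
    """Try an anonymous login and list the root directory.

    Returns a list of entry names on success (possibly empty), or None
    when the connection or the anonymous login fails.
    """
    try:
        with ftplib.FTP(timeout=timeout) as ftp:
            ftp.connect(host, port)
            ftp.login()  # no arguments: logs in as user "anonymous"
            try:
                return ftp.nlst()  # names in the current (root) directory
            except ftplib.error_perm:
                # Some servers answer 550 for an empty directory;
                # this sketch treats that as "no files".
                return []
    except ftplib.all_errors:
        return None
```

A result of `None` means "not public", an empty list means "public but empty", and anything else is a candidate for crawling.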

  • 257.807 FTP servers
  • 7578 public FTP servers - 2.9%
  • 2359 public FTP servers containing files - 0.9%
  • 18.088.392 files
  • 438.994 terabytes (roughly 439 TB)
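The percentages in the list are relative to all discovered FTP servers. A few lines of Python reproduce them and also give the share of public servers that actually contain files; the helper name `share` is just for this example.

```python
total_servers = 257_807      # all discovered FTP servers
public_servers = 7_578       # no authentication required
public_with_files = 2_359    # public and not empty


def share(part, whole):
    """Return part as a percentage of whole, rounded to one decimal."""
    return round(100 * part / whole, 1)


print(share(public_servers, total_servers))      # 2.9
print(share(public_with_files, total_servers))   # 0.9
print(share(public_with_files, public_servers))  # 31.1
```

In other words, a little under a third of the servers that allow anonymous access had something on them.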

I forgot to look at write permissions, so unfortunately I do not have these statistics for you.



Domains - Top 13

# Domain Public FTP Servers
1. ziggo.nl 397
2. chello.nl 153
3. direct-adsl.nl 95
4. alphamegahosting.com 98
5. xs4all 90
6. kpn.net 76
7. planet.nl 73
8. zeelandnet.nl 50
9. caiway.nl 46
10. hetnet.nl 46
11. telfortglasvezel.nl 42
12. upc.nl 41
13. ziggozakelijk.nl 32


All entries in the list are Dutch ISPs except for number 4, which seems to be a hosting company. Servers there probably come with a default public FTP account.



Location

I did a lookup on all the IP addresses to determine their approximate physical locations.


The province of 'Drenthe' has the fewest public FTP servers, probably due to its low population density. 'Noord-Holland' has the most, which likewise reflects that province's population density.


File categories

The following table shows the distribution of file categories.

SELECT file_format,sum(file_size),count(*) FROM files WHERE file_isdir != TRUE GROUP BY file_format;

Category Files Percentage Size
Documents 1997107 11% 1.7 TB
Movies 282306 1.5% 75.3 TB
Music 1046560 5% 8.6 TB
Pictures 5175332 28% 8.8 TB
Unidentified Files 9587087 53% 344 TB
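The query above returns counts and sizes per category; the percentage column has to be derived from the total. The sketch below shows one way to do that, using an in-memory SQLite database as a stand-in for the real one. The three-column schema and the inserted rows are made up purely to exercise the query, and `category_stats` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (file_format INTEGER, file_size INTEGER, file_isdir INTEGER)"
)
# Made-up rows, only to exercise the query below.
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [(1, 100, 0), (1, 300, 0), (2, 600, 0), (2, 0, 1)],
)


def category_stats(conn):
    """Return (file_format, file count, percentage of files, total size) rows."""
    total = conn.execute(
        "SELECT count(*) FROM files WHERE file_isdir = 0"
    ).fetchone()[0]
    if total == 0:
        return []
    rows = conn.execute(
        "SELECT file_format, count(*), sum(file_size) FROM files "
        "WHERE file_isdir = 0 GROUP BY file_format ORDER BY file_format"
    ).fetchall()
    return [(fmt, n, round(100 * n / total, 1), size) for fmt, n, size in rows]


print(category_stats(conn))  # [(1, 2, 66.7, 400), (2, 1, 33.3, 600)]
```

The directory row is excluded from both the total and the per-category counts, mirroring the `file_isdir != TRUE` condition in the original query.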

Surprisingly, 28% of all the files collected were pictures (5 million!).

I searched a bit through the data and concluded that most pictures were photographs.



Most of the photos I found were part of a collection. This suggests that people use their public FTP server as a backup device for their personal photographs.



File Extensions - Top 10

The following table shows the 10 most popular file extensions.

SELECT file_ext, count(*) FROM files WHERE file_isdir != TRUE GROUP BY file_ext ORDER BY count(*) DESC LIMIT 10;

# Extension Files
1. .jpg 4.114.712
2. .deb 2.039.029
3. .mp3 869.530
4. .pdf 720.040
5. .png 577.334
6. .rpm 550.756
7. .gz 466.525
8. .html 336.627
9. .txt 250.380
10. .dsc 195.674
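The query relies on a `file_ext` column being filled in during the crawl. How Findex derives it is not shown here; the sketch below is one plausible way to do it in Python, with `top_extensions` being a made-up name. It lowercases the extension and only looks at the last suffix, so `archive.tar.gz` is counted under `.gz`.

```python
from collections import Counter
from pathlib import PurePosixPath


def top_extensions(paths, n=10):
    """Count file extensions (lowercased, last suffix only) and return the top n."""
    counts = Counter()
    for path in paths:
        ext = PurePosixPath(path).suffix.lower()
        if ext:  # skip names without an extension, such as "README"
            counts[ext] += 1
    return counts.most_common(n)


# Hypothetical paths, only to show the behaviour.
sample = ["/a/x.JPG", "/a/y.jpg", "/b/z.tar.gz", "/b/README"]
print(top_extensions(sample))  # [('.jpg', 2), ('.gz', 1)]
```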



Sensitive Files

And now for the juicier stuff... Sensitive files can be found simply by searching for telling keywords.

SELECT count(*) FROM files WHERE file_isdir != TRUE AND file_format = 1 AND searchable LIKE 'keyword%';

Keyword Files Description
'wachtwoord' and 'password' 396 'wachtwoord' means 'password' in Dutch. Text documents containing lists of passwords came up.
passport 192 Images and documents of passports
belastingaangifte 517 'belastingaangifte' means 'tax return' in Dutch. Tax documents came up.
'factuur' and 'invoice' 4544 'factuur' means 'invoice' in Dutch. A lot of invoices came up.
creditcard 139 Photos and documents of credit cards
gemeente 614 'gemeente' means 'local authority' in Dutch. Government-related documents came up.
wp-config.php 32 Configuration file for WordPress
configuration.php 61 Configuration file for Joomla
config.php 428 Configuration files for various other web applications
passwd 82 Information file about users on Unix systems
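Running the query once per keyword is easy to script. The sketch below uses an in-memory SQLite database with made-up rows, and assumes that `searchable` holds a lowercased file name, which I have not verified against Findex itself. Note that a pattern like `'keyword%'` is a prefix match: a file called `mypassword.txt` is not counted, so the numbers in the table are most likely on the low side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (searchable TEXT, file_format INTEGER, file_isdir INTEGER)"
)
# Made-up rows, only to exercise the query below.
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [
        ("passwords.txt", 1, 0),
        ("password_list.doc", 1, 0),
        ("mypassword.txt", 1, 0),   # does not start with the keyword
        ("factuur-001.pdf", 1, 0),
        ("password", 1, 1),         # a directory, excluded
    ],
)


def count_by_prefix(conn, keywords, file_format=1):
    """Count non-directory files whose searchable name starts with each keyword."""
    counts = {}
    for keyword in keywords:
        row = conn.execute(
            "SELECT count(*) FROM files "
            "WHERE file_isdir = 0 AND file_format = ? AND searchable LIKE ?",
            (file_format, keyword + "%"),
        ).fetchone()
        counts[keyword] = row[0]
    return counts


print(count_by_prefix(conn, ["password", "factuur", "passport"]))
# {'password': 2, 'factuur': 1, 'passport': 0}
```

Passing the pattern as a bound parameter instead of pasting it into the SQL string keeps keywords containing quotes from breaking the query.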

I viewed a few of the files and they were indeed what the filenames suggested.



The most sensitive files I found were documents belonging to a certain court. They described court hearings and cases in detail, including personal information about the people involved (judges, defendants, lawyers, etc).

There were also a lot of documents belonging to a company that does 'property valuation': floor plans, prices and other details of universities, police stations and big companies.


Responsible Disclosure

I will not publish any of these documents or pictures. However, I will also not notify the affected parties, for the following two reasons:

  • Retaliation
  • Too many hosts

I've already been kicked off my ISP once for responsibly alerting someone to a vulnerability, and I can tell you it is not fun. Also, I'd say more than half of all the public FTP servers I was able to gather were public by accident and were exposing sensitive files. This would mean I'd have to warn 2500+ people or companies. ;D


Conclusion

In the year 2015, many public FTP servers on the internet are still hosting sensitive files. I had the ability to download a wide variety of sensitive documents, and other people are almost certainly doing this too.


{% endblock %}