Scylla: An Intelligent Proxy Pool for Humanities™¶
An intelligent proxy pool for humanities, only supports Python 3.6. Key features:
- Automatic proxy ip crawling and validation
- Easy-to-use JSON API
- Simple but beautiful web-based user interface (eg. geographical distribution of proxies)
- Get started with only 1 command minimally
- Straightforward programmable API
- Headless browser crawling
对于偏好中文的用户,请阅读 中文文档。For those who prefer to use Chinese, please read the Chinese Documentation
Get started¶
Installation¶
Install with Docker (highly recommended)¶
docker run -d -p 8899:8899 -v /var/www/scylla:/var/www/scylla --name scylla wildcat/scylla:latest
Install directly via pip¶
pip install scylla
scylla --help
scylla # Run the cralwer and web server for JSON API
Install from source¶
git clone https://github.com/imWildCat/scylla.git
cd scylla
pip install -r requirements.txt
npm install # or yarn install
make build-assets
python -m scylla
Usage¶
This is an example of running a service locally (localhost
), using port 8899
.
Note: You might have to wait for 1 to 2 minutes in order to wait for the crawler get some proxy ips for the first time you use Scylla.
JSON API¶
Proxy IP List¶
http://localhost:8899/api/v1/proxies
Optional URL parameters:
Parameters | Default value | Description |
---|---|---|
page | 1 |
The page number |
limit | 20 |
The number of proxies shown on each page |
anonymous | any |
Show anonymous proxies or not. Possible values:true , only anonymous proxies; false , only transparent proxies |
Sample result:
{
"proxies": [{
"id": 3661,
"ip": "118.114.77.47",
"port": 8080,
"is_valid": true,
"created_at": 1527312259,
"updated_at": 1527351023,
"latency": 250.9789636882,
"stability": 1.0,
"is_anonymous": true,
"location": "29.3416,104.7770",
"organization": "AS4134 CHINANET-BACKBONE",
"region": "Sichuan",
"country": "CN",
"city": "Zigong"
}, {
"id": 3657,
"ip": "39.104.57.121",
"port": 8080,
"is_valid": true,
"created_at": 1527312253,
"updated_at": 1527351021,
"latency": 189.1011954867,
"stability": 0.2,
"is_anonymous": true,
"location": null,
"organization": null,
"region": null,
"country": null,
"city": null
},
...
],
"count": 1025,
"per_page": 20,
"page": 1,
"total_page": 52
}
System Statistics¶
http://localhost:8899/api/v1/stats
Sample result:
{
"median": 181.2566407083,
"valid_count": 1780,
"total_count": 9528,
"mean": 174.3290085201
}
API Documentation¶
Please read Module Index.
Development and Contribution¶
git clone https://github.com/imWildCat/scylla.git
cd scylla
pip install -r requirements.txt
npm install # or `yarn install`
make build-assets
Testing¶
If you wish to run tests locally, the commands are shown below:
pip install -r tests/requirements-test.txt
pytest -n 15 tests
You are welcomed to add more test cases to this project, increasing the robustness of this project.
Naming of This Project¶
Scylla is derived from the name of a group of memory chips in the American TV series, Prison Break. This project was named after this American TV series to pay tribute to it.