scylla package

Submodules

scylla.cli module

scylla.cli.app_main()[source]
scylla.cli.handle_special_flags(args: dict)[source]
scylla.cli.main(args) → int[source]

scylla.config module

scylla.config.batch_set_config(**kwargs)[source]
scylla.config.get_config(key: str, default: str = None) → Union[str, NoneType][source]
scylla.config.set_config(key: str, value: str)[source]
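
The three functions above form a small key-value interface. The following is a minimal in-memory sketch of such an interface, not scylla's own implementation: the module-level `_config_data` dict and the fallback behaviour for unset keys are assumptions made for illustration. Only the function names and signatures come from the listing above.

```python
from typing import Dict, Optional

# Hypothetical backing store; how scylla.config really stores values is not shown here.
_config_data: Dict[str, str] = {}


def set_config(key: str, value: str) -> None:
    """Store a single configuration value."""
    _config_data[key] = value


def get_config(key: str, default: Optional[str] = None) -> Optional[str]:
    """Return the stored value, or `default` when the key is unset (assumed behaviour)."""
    return _config_data.get(key, default)


def batch_set_config(**kwargs: str) -> None:
    """Store several configuration values in one call."""
    for key, value in kwargs.items():
        set_config(key, value)
```

Under these assumptions, `batch_set_config(a="1", b="2")` is equivalent to calling `set_config` once per keyword argument.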

scylla.database module

class scylla.database.BaseModel(*args, **kwargs)[source]

Bases: playhouse.signals.Model

DoesNotExist

alias of BaseModelDoesNotExist

id = <peewee.AutoField object>
class scylla.database.ProxyIP(*args, **kwargs)[source]

Bases: scylla.database.BaseModel

DoesNotExist

alias of ProxyIPDoesNotExist

assign_from(p)[source]
attempts = <peewee.IntegerField object>
city = <peewee.CharField object>
country = <peewee.CharField object>
created_at = <peewee.DateTimeField object>
https_attempts = <peewee.IntegerField object>
id = <peewee.AutoField object>
ip = <peewee.CharField object>
is_anonymous = <peewee.BooleanField object>
is_https = <peewee.BooleanField object>
is_valid = <peewee.BooleanField object>
latency = <peewee.FloatField object>
location = <peewee.CharField object>
organization = <peewee.CharField object>
port = <peewee.IntegerField object>
region = <peewee.CharField object>
stability = <peewee.FloatField object>
updated_at = <peewee.DateTimeField object>
scylla.database.create_connection() → peewee.SqliteDatabase[source]

Create a database connection.

Return type:

SqliteDatabase

scylla.database.create_db_tables()[source]
scylla.database.proxy_ip_on_pre_save_handler(model_class, instance: scylla.database.ProxyIP, created)[source]

scylla.jobs module

scylla.jobs.save_ip(p: scylla.database.ProxyIP)[source]
scylla.jobs.validate_proxy_ip(p: scylla.database.ProxyIP)[source]

scylla.loggings module

scylla.scheduler module

class scylla.scheduler.Scheduler[source]

Bases: object

feed_providers()[source]
join()[source]

Wait for worker processes and validator threads

start()[source]

Start the scheduler with worker processes (which fetch candidate proxies from different providers) and validator threads (which check whether the fetched proxies are usable).

stop()[source]
scylla.scheduler.cron_schedule(scheduler, only_once=False)[source]
Parameters:
  • scheduler – the Scheduler instance
  • only_once – flag for testing
scylla.scheduler.fetch_ips(q: multiprocessing.Queue, validator_queue: multiprocessing.Queue)[source]
scylla.scheduler.validate_ips(validator_queue: multiprocessing.Queue, validator_pool: concurrent.futures.thread.ThreadPoolExecutor)[source]
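
The signatures above pair a queue of candidate proxies with a thread pool of validators. The sketch below illustrates that hand-off pattern only; it is not scylla's code. The `Candidate` tuple, the `drain_and_validate` helper, and the `check` callback are hypothetical names, and a thread-safe `queue.Queue` stands in for the multiprocessing queue so the example runs in a single process.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

# Hypothetical, simplified stand-in for a ProxyIP record: (ip, port).
Candidate = Tuple[str, int]


def drain_and_validate(validator_queue: "queue.Queue[Candidate]",
                       validator_pool: ThreadPoolExecutor,
                       check: Callable[[Candidate], bool]) -> List[Candidate]:
    """Take every queued candidate, check it in the pool, return the valid ones."""
    futures = []
    while True:
        try:
            candidate = validator_queue.get_nowait()
        except queue.Empty:
            break  # nothing left to validate in this pass
        futures.append((candidate, validator_pool.submit(check, candidate)))
    # result() blocks until the corresponding check has finished.
    return [candidate for candidate, future in futures if future.result()]
```

A long-running validator would presumably loop with a blocking `get` instead of draining once; the single pass here keeps the example deterministic.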

scylla.tcpping module

class scylla.tcpping.Ping(host: str, port: int, timeout=1)[source]

Bases: object

get_average() → float[source]
get_maximum() → float[source]
get_minimum() → float[source]
get_success_rate()[source]
ping(count=10, sleep=0.3)[source]
class scylla.tcpping.Socket(family, type_, timeout)[source]

Bases: object

close()[source]
connect(host, port=80)[source]
shutdown()[source]
class scylla.tcpping.Timer[source]

Bases: object

cost(funcs, args)[source]
start()[source]
stop()[source]
scylla.tcpping.avg(x)[source]
scylla.tcpping.ping(host: str, port: int, count: int = 10, sleep: float = 0.2) → (int, float)[source]

Ping a server and port with a TCP socket.

Parameters:
  • host – the hostname
  • port – the port number
  • count – number of connection attempts, 10 by default
  • sleep – sleep time between consecutive pings, 0.2 by default
Returns:

a tuple of (average_latency, success_rate)
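
A TCP ping of this kind can be approximated with the standard library alone. The following is a sketch under stated assumptions, not the module's implementation: the name `tcp_ping`, the `timeout` parameter, and the choice of milliseconds for latency are illustrative, and a ping counts as successful once the TCP connection is established.

```python
import socket
import time
from typing import List, Tuple


def tcp_ping(host: str, port: int, count: int = 10, sleep: float = 0.2,
             timeout: float = 1.0) -> Tuple[float, float]:
    """Return (average_latency_ms, success_rate) over `count` TCP connects."""
    latencies: List[float] = []
    for attempt in range(count):
        started = time.perf_counter()
        try:
            # A completed TCP handshake counts as one successful ping.
            with socket.create_connection((host, port), timeout=timeout):
                latencies.append((time.perf_counter() - started) * 1000.0)
        except OSError:
            pass  # refused, unreachable, or timed out: counted as a failure
        if attempt < count - 1:
            time.sleep(sleep)  # pause between consecutive pings
    success_rate = len(latencies) / count if count else 0.0
    average = sum(latencies) / len(latencies) if latencies else 0.0
    return average, success_rate
```

Failed attempts are excluded from the average here, so a host that answers one ping in ten quickly still reports a low latency alongside a success rate of 0.1; callers should read the two values together.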

scylla.validation_policy module

class scylla.validation_policy.ValidationPolicy(proxy_ip: scylla.database.ProxyIP)[source]

Bases: object

ValidationPolicy decides how a proxy IP is validated, based on two aspects:
  1. Whether or not to validate the proxy
  2. Whether to use HTTP or HTTPS to validate the proxy

After 3 attempts, the validator should make no further attempts within 24 hours of the proxy's creation.

proxy_ip = None
should_try_https() → bool[source]
should_validate() → bool[source]
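
The attempt rule described for this class can be read as a simple predicate. The sketch below encodes one such reading and is not the class's actual logic: `ProxyRecord`, the two constants, and the explicit `now` argument are hypothetical, with `attempts` and `created_at` named after the corresponding ProxyIP fields. The HTTPS decision is left out because its rule is not documented here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_EARLY_ATTEMPTS = 3          # attempts allowed before the pause applies
COOL_OFF = timedelta(hours=24)  # measured from the record's creation time


@dataclass
class ProxyRecord:
    """Hypothetical stand-in carrying only the fields this rule needs."""
    attempts: int
    created_at: datetime


def should_validate(record: ProxyRecord, now: datetime) -> bool:
    """Allow validation unless 3 attempts were already made within 24 hours of creation."""
    if record.attempts < MAX_EARLY_ATTEMPTS:
        return True
    return now - record.created_at >= COOL_OFF
```

Passing `now` explicitly rather than calling `datetime.now()` inside the function keeps the rule easy to test.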

scylla.validator module

class scylla.validator.Validator(host: str, port: int, using_https: bool = False)[source]

Bases: object

anonymous
latency
meta
success_rate
using_https
valid
validate()[source]
validate_latency()[source]
validate_proxy()[source]
scylla.validator.get_current_ip()[source]

scylla.worker module

class scylla.worker.Worker[source]

Bases: object

get_html(url: str, render_js: bool = True) → Union[requests_html.HTML, NoneType][source]

Get html from a specific URL

Parameters:
  • url – the URL
  • render_js (bool, optional) – whether to render JavaScript, defaults to True
Returns:

the HTML, or None

Return type:

requests_html.HTML or None

stop()[source]

Clean the session

Module contents