scylla.providers package

Submodules

scylla.providers.a2u_provider module

class scylla.providers.a2u_provider.A2uProvider

Bases: scylla.providers.base_provider.BaseProvider

parse(html: requests_html.HTML) → [scylla.database.ProxyIP]

Parse the document to get a list of proxies.

Parameters: html – the HTML object from requests-html
Returns: a list of proxy IPs

static should_render_js() → bool

Whether JavaScript rendering is needed. By default, it is False.

Returns: a boolean value indicating whether JavaScript rendering is needed
Return type: bool

urls() → [str]

Return a list of URL strings for crawling.

Returns: a list of URL strings
Return type: [str]

scylla.providers.base_provider module

class scylla.providers.base_provider.BaseProvider

Bases: object

BaseProvider is the abstract base class for the proxy providers.

Raises: NotImplementedError – if urls() or parse() is not implemented

parse(html: requests_html.HTML) → [scylla.database.ProxyIP]

Parse the document to get a list of proxies.

Parameters: html – the HTML object from requests-html
Returns: a list of proxy IPs

static should_render_js() → bool

Whether JavaScript rendering is needed. By default, it is False.

Returns: a boolean value indicating whether JavaScript rendering is needed
Return type: bool

sleep_seconds() → int

Return the sleep time for each request. By default, it is 0.

Returns: sleep time in seconds

urls() → [str]

Return a list of URL strings for crawling.

Returns: a list of URL strings
Return type: [str]
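
The interface above can be illustrated with a minimal sketch of a custom provider. Everything here is an assumption made so the example runs on its own: ProxyIP and BaseProvider are small stand-ins that mirror only what this page documents (they are not scylla's source), ExampleProvider and its URL are hypothetical, and the fake page object only imitates the text attribute of a requests_html.HTML object.

```python
import re
from dataclasses import dataclass
from types import SimpleNamespace
from typing import List


@dataclass
class ProxyIP:
    """Stand-in for scylla.database.ProxyIP (assumed to carry an ip and a port)."""
    ip: str
    port: int


class BaseProvider:
    """Stand-in that mirrors the interface documented above."""

    def urls(self) -> List[str]:
        raise NotImplementedError

    def parse(self, html) -> List[ProxyIP]:
        raise NotImplementedError

    def sleep_seconds(self) -> int:
        return 0

    @staticmethod
    def should_render_js() -> bool:
        return False


class ExampleProvider(BaseProvider):
    """Hypothetical provider for a page that lists proxies as ip:port text."""

    _PATTERN = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{2,5})\b")

    def urls(self) -> List[str]:
        return ["https://proxies.example.com/list"]  # placeholder URL

    def parse(self, html) -> List[ProxyIP]:
        # Scan the page's text content for ip:port pairs.
        return [ProxyIP(ip=ip, port=int(port))
                for ip, port in self._PATTERN.findall(html.text)]

    def sleep_seconds(self) -> int:
        return 2  # wait two seconds for each request instead of the default 0


# Exercise parse() with a fake page; scylla would pass a requests_html.HTML.
page = SimpleNamespace(text="1.2.3.4:8080 and 10.0.0.1:3128")
proxies = ExampleProvider().parse(page)
```

Only urls() and parse() have to be overridden. should_render_js() and sleep_seconds() fall back to the documented defaults unless a subclass changes them, as sleep_seconds() does here.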

scylla.providers.cool_proxy_provider module

class scylla.providers.cool_proxy_provider.CoolProxyProvider

Bases: scylla.providers.base_provider.BaseProvider

parse(html: requests_html.HTML) → [scylla.database.ProxyIP]

Parse the document to get a list of proxies.

Parameters: html – the HTML object from requests-html
Returns: a list of proxy IPs

static should_render_js() → bool

Whether JavaScript rendering is needed. By default, it is False.

Returns: a boolean value indicating whether JavaScript rendering is needed
Return type: bool

urls() → [str]

Return a list of URL strings for crawling.

Returns: a list of URL strings
Return type: [str]

scylla.providers.data5u_provider module

class scylla.providers.data5u_provider.Data5uProvider

Bases: scylla.providers.base_provider.BaseProvider

parse(html: requests_html.HTML) → [scylla.database.ProxyIP]

Parse the document to get a list of proxies.

Parameters: html – the HTML object from requests-html
Returns: a list of proxy IPs

static should_render_js() → bool

Whether JavaScript rendering is needed. By default, it is False.

Returns: a boolean value indicating whether JavaScript rendering is needed
Return type: bool

urls() → [str]

Return a list of URL strings for crawling.

Returns: a list of URL strings
Return type: [str]

scylla.providers.free_proxy_list_provider module

class scylla.providers.free_proxy_list_provider.FreeProxyListProvider

Bases: scylla.providers.base_provider.BaseProvider

parse(html: requests_html.HTML) → [scylla.database.ProxyIP]

Parse the document to get a list of proxies.

Parameters: html – the HTML object from requests-html
Returns: a list of proxy IPs

urls() → [str]

Return a list of URL strings for crawling.

Returns: a list of URL strings
Return type: [str]

scylla.providers.http_proxy_provider module

class scylla.providers.http_proxy_provider.HttpProxyProvider

Bases: scylla.providers.base_provider.BaseProvider

parse(html: requests_html.HTML) → [scylla.database.ProxyIP]

Parse the document to get a list of proxies.

Parameters: html – the HTML object from requests-html
Returns: a list of proxy IPs

static should_render_js() → bool

Whether JavaScript rendering is needed. By default, it is False.

Returns: a boolean value indicating whether JavaScript rendering is needed
Return type: bool

urls() → [str]

Return a list of URL strings for crawling.

Returns: a list of URL strings
Return type: [str]

scylla.providers.kuaidaili_provider module

class scylla.providers.kuaidaili_provider.KuaidailiProvider

Bases: scylla.providers.base_provider.BaseProvider

parse(html: requests_html.HTML) → [scylla.database.ProxyIP]

Parse the document to get a list of proxies.

Parameters: html – the HTML object from requests-html
Returns: a list of proxy IPs

static should_render_js() → bool

Whether JavaScript rendering is needed. By default, it is False.

Returns: a boolean value indicating whether JavaScript rendering is needed
Return type: bool

urls() → [str]

Return a list of URL strings for crawling.

Returns: a list of URL strings
Return type: [str]

scylla.providers.spys_me_provider module

class scylla.providers.spys_me_provider.SpyMeProvider

Bases: scylla.providers.base_provider.BaseProvider

parse(html: requests_html.HTML) → [scylla.database.ProxyIP]

Parse the document to get a list of proxies.

Parameters: html – the HTML object from requests-html
Returns: a list of proxy IPs

static should_render_js() → bool

Whether JavaScript rendering is needed. By default, it is False.

Returns: a boolean value indicating whether JavaScript rendering is needed
Return type: bool

urls() → [str]

Return a list of URL strings for crawling.

Returns: a list of URL strings
Return type: [str]

scylla.providers.spys_one_provider module

class scylla.providers.spys_one_provider.SpysOneProvider

Bases: scylla.providers.base_provider.BaseProvider

parse(html: requests_html.HTML) → [scylla.database.ProxyIP]

Parse the document to get a list of proxies.

Parameters: html – the HTML object from requests-html
Returns: a list of proxy IPs

static should_render_js() → bool

Whether JavaScript rendering is needed. By default, it is False.

Returns: a boolean value indicating whether JavaScript rendering is needed
Return type: bool

urls() → [str]

Return a list of URL strings for crawling.

Returns: a list of URL strings
Return type: [str]

scylla.providers.xici_provider module

class scylla.providers.xici_provider.XiciProvider

Bases: scylla.providers.base_provider.BaseProvider

parse(html: requests_html.HTML) → [scylla.database.ProxyIP]

Parse the document to get a list of proxies.

Parameters: html – the HTML object from requests-html
Returns: a list of proxy IPs

static should_render_js() → bool

Whether JavaScript rendering is needed. By default, it is False.

Returns: a boolean value indicating whether JavaScript rendering is needed
Return type: bool

urls() → [str]

Return a list of URL strings for crawling.

Returns: a list of URL strings
Return type: [str]
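
Because every provider listed above exposes the same methods, a caller could in principle drive them all through one loop. The following is only a sketch of that idea and not scylla's actual scheduler: crawl(), the fetch(url, render_js) callable, FakeProvider, and fake_fetch are all hypothetical names introduced for illustration, and the fake page values are plain strings rather than requests_html.HTML objects.

```python
import time
from typing import Callable, Iterable, List


def crawl(providers: Iterable,
          fetch: Callable[[str, bool], object],
          sleep: Callable[[float], None] = time.sleep) -> List:
    """Collect proxies from each provider through the shared interface.

    fetch(url, render_js) is a hypothetical callable that downloads a URL and
    returns whatever object the provider's parse() expects.
    """
    found = []
    for provider in providers:
        render_js = provider.should_render_js()
        for url in provider.urls():
            html = fetch(url, render_js)
            found.extend(provider.parse(html))
            # Honor the provider's requested pause after each request.
            sleep(provider.sleep_seconds())
    return found


class FakeProvider:
    """Hypothetical in-memory provider used only to exercise crawl()."""

    def urls(self):
        return ["https://a.example/1", "https://a.example/2"]

    def parse(self, html):
        return [html]  # pretend each page yields exactly one item

    def sleep_seconds(self):
        return 0

    @staticmethod
    def should_render_js():
        return False


calls = []


def fake_fetch(url, render_js):
    # Record the call and return a fake page body instead of using the network.
    calls.append((url, render_js))
    return f"<page {url}>"


result = crawl([FakeProvider()], fake_fetch, sleep=lambda seconds: None)
```

Passing fetch and sleep in as arguments keeps the loop free of network and timing side effects, which makes it straightforward to test with fakes like the ones shown.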

Module contents