Methods
crawler() → {module:crawler/index~Crawler}
Utility that helps create a crawler.
Returns: crawler
async createOrUpdate(Model, objects, fields)
Parameters:
Name | Type | Description |
---|---|---|
Model | module:sequelize~Model | |
objects | * | |
fields | * | |
async getModels(sequelize)
Parameters:
Name | Type | Description |
---|---|---|
sequelize | module:sequelize~Sequelize | |
async launch(profile, options) → {Promise.<TPuppeteer>}
Launch the browser
Parameters:
Name | Type | Description |
---|---|---|
profile | Profile | |
options | object | |
Returns: the browser
async launchWithChromiumPuppeteer() → {module:puppeteer~Browser}
async launchWithFirefoxPuppeteer() → {module:puppeteer~Browser}
async logger(browser, resultPath, modelsopt) → {module:logger/index~Logger}
Creates a logger that listens to and records requests and responses, and gives you access to the browser's cookies. In order
Parameters:
Name | Type | Attributes | Description |
---|---|---|---|
browser | object | | |
resultPath | string | | |
models | Record.<String, module:logger/index~ModelDescription> | <optional> | |
Example
let profile = await atrica.profile({browser: "firefox"});
let browser = await atrica.launch(profile);
let logger = await atrica.logger(browser, {resultPath: "./results"});
let session = await logger.makeSession();
let page = await browser.newPage();
requestHandler(tab)
Parameters:
Name | Type | Description |
---|---|---|
tab | module:puppeteer~Page | |
async setupAtrica(browser)
Parameters:
Name | Type | Description |
---|---|---|
browser | module:puppeteer~Browser | |
Type Definitions
ClearBrowserOptions
Properties:
Name | Type | Attributes | Description |
---|---|---|---|
appcache | boolean | <optional> | |
cache | boolean | <optional> | |
cacheStorage | boolean | <optional> | |
cookies | boolean | <optional> | |
downloads | boolean | <optional> | |
fileSystems | boolean | <optional> | |
formData | boolean | <optional> | |
history | boolean | <optional> | |
indexedDB | boolean | <optional> | |
localStorage | boolean | <optional> | |
pluginData | boolean | <optional> | |
passwords | boolean | <optional> | |
serviceWorkers | boolean | <optional> | |
webSQL | boolean | <optional> | |
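As an illustration, a `ClearBrowserOptions` value is a plain object of optional boolean flags. The sketch below builds one and lists the data types it selects. `selectedDataTypes` is a hypothetical helper written for this example, not part of the documented API, and treating omitted flags as "off" is an assumption.

```javascript
// A ClearBrowserOptions-shaped object: every property is optional.
const clearOptions = {
  cache: true,
  cookies: true,
  localStorage: true,
  history: false,
};

// Hypothetical helper (not part of the documented API): returns the names
// of the flags explicitly set to true. Omitted flags are assumed to be off.
function selectedDataTypes(options) {
  return Object.keys(options).filter((name) => options[name] === true);
}

console.log(selectedDataTypes(clearOptions));
```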
Instruction
Properties:
Name | Type | Attributes | Description |
---|---|---|---|
id | number | | The ID of the instruction |
type | string | | The type/name of the instruction |
params | object | <optional> | The parameters of the instruction |
InstructionResult
Properties:
Name | Type | Attributes | Description |
---|---|---|---|
id | number | | The ID of the instruction |
result | object | <optional> | The result of the instruction |
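Since `Instruction` and `InstructionResult` both carry an `id`, a result can presumably be matched back to its instruction through that field. The following is a minimal sketch under that assumption. The instruction types (`"goto"`, `"screenshot"`), their `params`, the result payload, and the `pairResults` helper are all hypothetical and only illustrate the two shapes.

```javascript
// Instruction-shaped objects; the types and params are made up for the example.
const instructions = [
  { id: 1, type: "goto", params: { url: "https://example.com" } },
  { id: 2, type: "screenshot" }, // params is optional
];

// InstructionResult-shaped objects; the payload is made up for the example.
const results = [
  { id: 1, result: { status: 200 } },
  { id: 2 }, // result is optional
];

// Hypothetical helper: pairs each instruction with the result sharing its id.
function pairResults(instructionList, resultList) {
  const byId = new Map(resultList.map((r) => [r.id, r]));
  return instructionList.map((instruction) => ({
    instruction,
    result: byId.has(instruction.id) ? byId.get(instruction.id).result : undefined,
  }));
}

const pairs = pairResults(instructions, results);
console.log(pairs);
```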
LoggerContext
Properties:
Name | Type | Description |
---|---|---|
page | module:puppeteer~Page | |
models | Record.<string, Class.<module:sequelize~Model>> | |
logger | module:logger/index~Logger | |
ProfileOptions
Properties:
Name | Type | Description |
---|---|---|
browser | 'firefox' \| 'chromium' | |
extensions | Array.<string> | paths of the extensions to load. Please note that the extensions will only be loaded temporarily and will NOT be installed in the profile. This is only possible for Firefox with |
path | string | path where the profile will be created or loaded from |
name | string | name of the profile |
binary | string | browser's binary path |
options | object | Puppeteer options |
env | object | Environment variables |
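A `ProfileOptions` value might look like the object below. All concrete values (paths, profile name, the `headless` entry, the environment variable) are placeholders chosen for illustration, and whether any property may be omitted is not stated in this reference. `checkProfileOptions` is a hypothetical helper, not part of the documented API, that only checks `browser` against the two documented values.

```javascript
// A ProfileOptions-shaped object; every value here is a placeholder.
const profileOptions = {
  browser: "firefox",
  extensions: ["./extensions/my-extension"],
  path: "./profiles",
  name: "crawl-profile",
  options: { headless: true }, // assumed to be forwarded to Puppeteer
  env: { TZ: "UTC" },
};

// Hypothetical helper: rejects browsers other than the two documented ones.
function checkProfileOptions(options) {
  const allowed = ["firefox", "chromium"];
  if (!allowed.includes(options.browser)) {
    throw new Error(`Unsupported browser: ${options.browser}`);
  }
  return options;
}

console.log(checkProfileOptions(profileOptions).browser);
```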
Request
Properties:
Name | Type | Attributes | Description |
---|---|---|---|
url | string | | The URL of the request |
type | string | | The type of the request (script, image, ...) |
requestId | number | | The unique ID of the request |
method | string | | The HTTP method (GET, POST, PUT, DELETE, ...) |
body | object | | Body of the request (POST parameters for instance) |
headers | object | | The headers of the request |
response | Response | <optional> | The response to the request |
Response
Properties:
Name | Type | Attributes | Description |
---|---|---|---|
headers | object | | The headers of the response |
bodySize | number | | The size of the body |
base64Encoded | boolean | <optional> | Whether the body is encoded as a base64 string |
body | string | <optional> | The content of the response body |
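The `base64Encoded` flag suggests that a consumer has to decode `body` before reading it as text. Below is a minimal sketch, assuming Node.js (for the global `Buffer`) and assuming the decoded bytes are UTF-8 text. The sample `Request` object and the `responseBodyText` helper are hypothetical and not part of the documented API, and taking `bodySize` to be the decoded byte length is also an assumption.

```javascript
const scriptSource = "console.log(1);";

// A Request-shaped object with an attached Response; all values are made up.
const request = {
  url: "https://example.com/app.js",
  type: "script",
  requestId: 1,
  method: "GET",
  body: {},
  headers: { accept: "*/*" },
  response: {
    headers: { "content-type": "text/javascript" },
    bodySize: Buffer.byteLength(scriptSource, "utf8"), // assumed: decoded size
    base64Encoded: true,
    body: Buffer.from(scriptSource, "utf8").toString("base64"),
  },
};

// Hypothetical helper: returns the response body as text, decoding it first
// when base64Encoded is set. Returns undefined when there is no body.
function responseBodyText(response) {
  if (!response || response.body === undefined) return undefined;
  return response.base64Encoded
    ? Buffer.from(response.body, "base64").toString("utf8")
    : response.body;
}

console.log(responseBodyText(request.response));
```

Because `response` is optional on a `Request`, the helper accepts `undefined` and returns `undefined` rather than throwing.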