Script Environment

Understanding the execution environment and available tools in your Easy Scrape API scripts.

warning

The examples below are shown in JavaScript format for readability. When sending a script to the Easy Scrape API, the format requirements depend on your HTTP client:

  • Most HTTP clients (Postman, cURL, fetch, etc.): provide the script as an escaped string in the script field of your JSON request
  • Some platforms (certain no-code tools): may require the raw, unescaped JavaScript code directly in the field

If you need help converting your JavaScript code to a properly escaped string, check out our Online Code Parser Tool to easily format your code for the API.
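As a rough illustration of the first case, the sketch below lets JSON.stringify do the escaping instead of escaping by hand. Only the script field name comes from the text above; the script body and the buildRequestBody helper are hypothetical, and any other fields your request needs are left out.

```javascript
// Hypothetical script body, written as ordinary multi-line JavaScript.
const scriptSource = [
  "await page.waitForSelector('h1');",
  "const title = await page.$eval('h1', el => el.textContent.trim());",
  "return { title };"
].join('\n');

// Illustrative helper (not part of the API): JSON.stringify escapes quotes,
// backslashes, and newlines, so the script travels as one valid JSON string.
function buildRequestBody(source) {
  return JSON.stringify({ script: source });
}

const requestBody = buildRequestBody(scriptSource);
console.log(requestBody);
```

The resulting string can be used as the request body in fetch or any other HTTP client, merged with whatever additional fields your request requires.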

Available Objects

Your JavaScript code runs in a Node.js environment with access to these pre-loaded objects:

page - Puppeteer v24 Page Instance

Full Puppeteer v24 Page object for browser automation:

// Interact with elements
await page.click('button');
await page.type('#input', 'text');
await page.select('select#colors', 'blue');



// Wait for elements or states
await page.waitForSelector('.element');
await page.waitForNetworkIdle();
// page.waitForTimeout() is not available in Puppeteer v24; use a plain timer instead
await new Promise(resolve => setTimeout(resolve, 1000));

cheerio - Server-side jQuery

HTML parsing and manipulation library:

// Get page HTML first
const html = await page.content();

// Load with Cheerio
const $ = cheerio.load(html);

// jQuery-like selection and manipulation
const title = $('h1').text();
const links = $('a').map((i, el) => $(el).attr('href')).get();
const data = $('.item').map((i, el) => ({
  name: $(el).find('.name').text(),
  price: $(el).find('.price').text()
})).get();

console.log() - Debug Logging

Output debug information that appears in the API response:

console.log('Starting scrape process...');
console.log('Found', items.length, 'items');
console.log('Extracted data:', JSON.stringify(data, null, 2));

Script Execution Flow

  1. Environment Setup: Puppeteer v24 browser launches automatically
  2. Script Execution: Your JavaScript code runs with full access to all objects
  3. Return Processing: Your return value becomes the API response data
  4. Cleanup: Browser closes automatically
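To make the flow above more concrete, here is a minimal local sketch of how a runner could wrap a script so that top-level await and return work. This is not the service's actual implementation: runScript, fakePage, and the shape of the result are assumptions made purely for illustration.

```javascript
// Constructor for async functions; lets a source string use `await` and `return`.
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

// Hypothetical runner: injects the pre-loaded objects and collects log output.
async function runScript(source, page, cheerio) {
  const logs = [];
  const scriptConsole = { log: (...args) => logs.push(args.join(' ')) };
  const fn = new AsyncFunction('page', 'cheerio', 'console', source);
  const data = await fn(page, cheerio, scriptConsole);
  return { data, logs };
}

// Stand-in for the real Puppeteer page, with just enough for this demo.
const fakePage = { title: async () => 'Example Domain' };

const demoRun = runScript(
  "console.log('Reading title'); const title = await page.title(); return { title };",
  fakePage,
  null
);
demoRun.then(result => console.log(result));
```

Thinking of the script as the body of an async function explains why a bare return statement is valid at the top level of your code.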

Return Value Requirements

tip

A script without an explicit return statement returns undefined, which cannot be serialized to JSON. Always end your script with a return statement that yields meaningful data.

Your script must return a value that can be serialized to JSON:

Valid Return Types

// Object
return { title: 'Page Title', count: 5 };

// Array
return ['item1', 'item2', 'item3'];

// String
return 'Simple text result';

// Number
return 42;

// Boolean
return true;

// HTML (for HTML output format)
return '<div><h1>Results</h1><p>Content</p></div>';

Invalid Return Types

// Don't return functions
return () => console.log('test'); // ❌

// Don't return Puppeteer v24 objects
return page; // ❌
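A quick way to check a value locally before returning it is to see whether JSON.stringify can represent it. The helper name isJsonSerializable is hypothetical, and the check is a sketch rather than the validation the API itself performs.

```javascript
// Illustrative helper (hypothetical name): reports whether a value can be
// turned into JSON text at all.
function isJsonSerializable(value) {
  try {
    // JSON.stringify returns undefined for functions and for undefined itself,
    // and throws on circular structures and BigInt values.
    return JSON.stringify(value) !== undefined;
  } catch (err) {
    return false;
  }
}

const circular = {};
circular.self = circular; // a circular structure, which JSON cannot represent

console.log(isJsonSerializable({ title: 'Page Title', count: 5 })); // true
console.log(isJsonSerializable(() => {}));                          // false
console.log(isJsonSerializable(undefined));                         // false
console.log(isJsonSerializable(circular));                          // false
```

Note that functions nested inside an object are silently dropped by JSON.stringify rather than rejected, so this check only catches values that are unserializable at the top level or that throw.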

Line Comments Caution

Be careful when using line comments (//) in your scripts. If newlines (\n) are lost or mishandled while the code is being escaped or unescaped, the script collapses onto a single line and the first line comment swallows everything after it, which breaks the script. Block comments (/* ... */) do not have this problem. Test your scripts thoroughly after converting them.
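The effect can be reproduced locally. The sketch below flattens newlines to spaces to imitate a tool that mishandles them; the flatten helper and the use of new Function are only for demonstration and say nothing about how the API runs scripts.

```javascript
// The same two-line script, once with a line comment and once with a block comment.
const withLineComment = "// read the answer\nreturn 42;";
const withBlockComment = "/* read the answer */\nreturn 42;";

// Simulates a tool that mishandles newlines by flattening them to spaces.
const flatten = source => source.replace(/\n/g, ' ');

// new Function is used here only to run the snippets locally.
const lineResult = new Function(flatten(withLineComment))();
const blockResult = new Function(flatten(withBlockComment))();

console.log(lineResult);  // undefined: `return 42;` became part of the comment
console.log(blockResult); // 42
```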

Environment Limitations

Memory

  • Reasonable limits for typical scraping tasks
  • Large data sets should be processed incrementally
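One way to read "processed incrementally" is to pull results in fixed-size batches instead of in a single call. The sketch below assumes a hypothetical collectInBatches helper and a stand-in extractor; the batch size of 100 is illustrative, not a limit published by the API.

```javascript
// Collects `total` results by requesting [start, end) ranges one batch at a time.
async function collectInBatches(total, batchSize, fetchBatch) {
  const results = [];
  for (let start = 0; start < total; start += batchSize) {
    const end = Math.min(start + batchSize, total);
    const batch = await fetchBatch(start, end);
    results.push(...batch);
    console.log(`Collected ${results.length} of ${total}`);
  }
  return results;
}

// Stand-in for a real extractor. In a script this could be something like:
//   (start, end) => page.$$eval('.item',
//     (els, s, e) => els.slice(s, e).map(el => el.textContent.trim()), start, end)
const fakeFetch = async (start, end) =>
  Array.from({ length: end - start }, (_, i) => `item-${start + i}`);

const collected = collectInBatches(250, 100, fakeFetch);
collected.then(items => console.log(items.length)); // 250
```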

Network

  • Outbound HTTP/HTTPS requests allowed
  • No access to localhost or internal networks
  • Standard web ports (80, 443) supported
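If your script builds URLs dynamically, a pre-flight check can catch targets that would obviously fall under these restrictions. The isLikelyReachable helper below is hypothetical and only approximates the rules listed above; the checks the service actually enforces may differ.

```javascript
// Hypothetical pre-flight check; the service's actual rules may differ.
function isLikelyReachable(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a valid absolute URL
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;

  const privatePatterns = [
    /^localhost$/,                 // local machine by name
    /^127\./,                      // IPv4 loopback
    /^10\./,                       // private 10.0.0.0/8
    /^192\.168\./,                 // private 192.168.0.0/16
    /^172\.(1[6-9]|2\d|3[01])\./,  // private 172.16.0.0/12
    /^\[::1\]$/                    // IPv6 loopback
  ];
  return !privatePatterns.some(pattern => pattern.test(url.hostname));
}

console.log(isLikelyReachable('https://example.com/products')); // true
console.log(isLikelyReachable('http://localhost:3000/'));       // false
console.log(isLikelyReachable('http://192.168.1.10/admin'));    // false
console.log(isLikelyReachable('ftp://example.com/file'));       // false
```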

Best Practices

Performance Optimization

// Wait for specific content instead of arbitrary timeouts
await page.waitForSelector('.content', { timeout: 10000 });

// Use networkidle for dynamic content
await page.waitForNetworkIdle();


// Extract only what you need
const data = await page.$$eval('.item', items =>
  items.slice(0, 100).map(item => ({ // Limit to first 100 items
    title: item.querySelector('.title')?.textContent?.trim(),
    price: item.querySelector('.price')?.textContent?.trim()
  }))
);

Debugging

Use console.log strategically:

console.log('Step 1: Waiting for content...');
await page.waitForSelector('.content');

console.log('Step 2: Extracting data...');
const items = await page.$$('.item');
console.log(`Found ${items.length} items`);

console.log('Step 3: Processing items...');
// Processing logic here

Common Patterns

Dynamic Content Loading

// Wait for AJAX content
await page.waitForFunction(() => {
  return document.querySelectorAll('.dynamic-item').length > 0;
});

// Wait for specific text to appear
await page.waitForFunction(() => {
  return document.body.textContent.includes('Loading complete');
});
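Waits like these reject when their timeout expires, which aborts the whole script unless the error is caught. The sketch below wraps a wait so the script can report the failure in its return value instead; waitOrReport and the two stand-in waits are hypothetical names used only for illustration.

```javascript
// Runs a wait and converts a rejection (for example a timeout) into data.
async function waitOrReport(label, startWait) {
  try {
    await startWait();
    return { label, ok: true };
  } catch (err) {
    return { label, ok: false, error: err.message };
  }
}

// Stand-ins for real waits. In a script, startWait could be something like:
//   () => page.waitForFunction(
//     () => document.querySelectorAll('.dynamic-item').length > 0,
//     { timeout: 5000 })
const fakeSuccess = () => Promise.resolve();
const fakeTimeout = () => Promise.reject(new Error('Waiting failed: 5000ms exceeded'));

const outcomes = Promise.all([
  waitOrReport('dynamic items', fakeSuccess),
  waitOrReport('loading text', fakeTimeout)
]);
outcomes.then(results => console.log(results));
```

Returning the outcome objects alongside your scraped data makes it easier to tell an empty page apart from a wait that never completed.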

Form Interactions

// Fill out forms
await page.type('#username', 'testuser');
await page.type('#password', 'password');
await page.click('#login-button');

// Handle dropdowns
await page.select('#country', 'US');

Important Notes

  • Scripts run in an isolated environment for security
  • No access to file system or external databases
  • All network requests are logged and monitored
  • Malicious activities will result in API access termination

Example: Complete Scraping Script

// Wait for the main heading to appear
await page.waitForSelector('h1');

// Extract data using both Puppeteer v24 and Cheerio
const pageData = await page.evaluate(() => ({
  title: document.title,
  url: window.location.href,
  timestamp: new Date().toISOString()
}));

// Get HTML for Cheerio processing
const html = await page.content();
const $ = cheerio.load(html);

// Process with Cheerio
const links = $('a').map((i, el) => ({
  text: $(el).text().trim(),
  href: $(el).attr('href')
})).get().slice(0, 10);

console.log(`Extracted ${links.length} links`);

return {
  ...pageData,
  links
};
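The href values collected this way are returned exactly as written in the HTML, so they may be relative or missing. A possible post-processing step, sketched below, resolves them against the page URL before returning; absolutizeLinks and the sample data are hypothetical.

```javascript
// Resolves each href against baseUrl and drops links without a usable target.
function absolutizeLinks(links, baseUrl) {
  return links
    .filter(link => typeof link.href === 'string' && link.href.trim() !== '')
    .map(link => {
      try {
        return { ...link, href: new URL(link.href, baseUrl).href };
      } catch {
        return null; // href could not be parsed as a URL
      }
    })
    .filter(Boolean);
}

const sampleLinks = [
  { text: 'Home', href: '/' },
  { text: 'Docs', href: 'guide/intro' },
  { text: 'External', href: 'https://example.org/page' },
  { text: 'No target', href: undefined }
];

const resolvedLinks = absolutizeLinks(sampleLinks, 'https://example.com/blog/post');
console.log(resolvedLinks);
```

In the script above, this would be called as absolutizeLinks(links, pageData.url) just before the return statement.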