Script Environment
Understanding the execution environment and available tools in your Easy Scrape API scripts.
The examples below are shown in JavaScript format for readability. When sending a script to the Easy Scrape API, the format requirements depend on your HTTP client:
- Most HTTP clients (Postman, cURL, fetch, etc.): Provide the script as a string inside the script field of your JSON request
- Some platforms (certain no-code tools): May require the JavaScript code unescaped directly in the field
If you need help converting your JavaScript code to a properly escaped string, check out our Online Code Parser Tool to easily format your code for the API.
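As a rough illustration of the escaping step, the sketch below builds a request body with JSON.stringify, which escapes quotes and newlines for you. The script contents and the h1 selector are made up for the example, and only the script field is shown; any other fields your request needs are omitted here.

```javascript
// Hypothetical script body; the selector is illustrative only
const script = [
  "await page.waitForSelector('h1');",
  "const title = await page.$eval('h1', el => el.textContent.trim());",
  "return { title };"
].join('\n');

// JSON.stringify escapes quotes and newlines, so the script becomes a valid JSON string
const requestBody = JSON.stringify({ script });

console.log(requestBody);
```

Parsing the body back with JSON.parse yields the original script, newlines included, which is a quick way to check that nothing was lost in the conversion.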
Available Objects
Your JavaScript code runs in a Node.js environment with access to these pre-loaded objects:
page - Puppeteer v24 Page Instance
Full Puppeteer v24 Page object for browser automation:
// Interact with elements
await page.click('button');
await page.type('#input', 'text');
await page.select('select#colors', 'blue');
// Wait for elements or states
await page.waitForSelector('.element');
await page.waitForNetworkIdle();
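await new Promise(resolve => setTimeout(resolve, 1000)); // fixed delay; page.waitForTimeout() was removed from recent Puppeteer versions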
cheerio - Server-side jQuery
HTML parsing and manipulation library:
// Get page HTML first
const html = await page.content();
// Load with Cheerio
const $ = cheerio.load(html);
// jQuery-like selection and manipulation
const title = $('h1').text();
const links = $('a').map((i, el) => $(el).attr('href')).get();
const data = $('.item').map((i, el) => ({
name: $(el).find('.name').text(),
price: $(el).find('.price').text()
})).get();
console.log() - Debug Logging
Output debug information that appears in the API response:
console.log('Starting scrape process...');
console.log('Found', items.length, 'items');
console.log('Extracted data:', JSON.stringify(data, null, 2));
Script Execution Flow
- Environment Setup: Puppeteer v24 browser launches automatically
- Script Execution: Your JavaScript code runs with full access to all objects
- Return Processing: Your return value becomes the API response data
- Cleanup: Browser closes automatically
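The flow above also explains why scripts can use await and return at the top level. The sketch below is a hypothetical runner written only to illustrate that idea; it is not the actual Easy Scrape implementation, and the runScript name, the stub page, and the { data, logs } result shape are all assumptions.

```javascript
// Hypothetical runner, for illustration only; not the actual Easy Scrape implementation
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

async function runScript(source, page, cheerio) {
  const logs = [];
  // Capture console.log output so it can be returned alongside the data
  const scriptConsole = { log: (...args) => logs.push(args.join(' ')) };
  // The script source becomes the body of an async function,
  // so `await` and `return` are valid at its top level
  const fn = new AsyncFunction('page', 'cheerio', 'console', source);
  const data = await fn(page, cheerio, scriptConsole);
  return { data, logs };
}

// Usage with a stub page object (a real run would receive a Puppeteer Page)
const stubPage = { title: async () => 'Example Domain' };
const resultPromise = runScript(
  "const t = await page.title(); console.log('Title:', t); return { title: t };",
  stubPage,
  null
);
resultPromise.then(result => console.log(result));
```

Under these assumptions, whatever the script returns ends up in data and every console.log call ends up in logs.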
Return Value Requirements
Remember that your script always returns something. If you don't explicitly return a value, your script will return undefined. Always include a return statement with meaningful data.
Your script must return a value that can be serialized to JSON:
Valid Return Types
// Object
return { title: 'Page Title', count: 5 };
// Array
return ['item1', 'item2', 'item3'];
// String
return 'Simple text result';
// Number
return 42;
// Boolean
return true;
// HTML (for HTML output format)
return '<div><h1>Results</h1><p>Content</p></div>';
Invalid Return Types
// Don't return functions
return () => console.log('test'); // ❌
// Don't return Puppeteer v24 objects
return page; // ❌
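If you are unsure whether a value will survive serialization, a quick check with JSON.stringify can help while developing a script. The helper below is a minimal sketch; the isJsonSerializable name is hypothetical and the API may apply its own rules on top of this.

```javascript
// Hypothetical helper: returns true when JSON.stringify can encode the value
function isJsonSerializable(value) {
  try {
    // JSON.stringify returns undefined for functions, symbols, and undefined,
    // and throws on circular references and BigInt values
    return JSON.stringify(value) !== undefined;
  } catch (err) {
    return false;
  }
}

const circular = {};
circular.self = circular;

console.log(isJsonSerializable({ title: 'Page Title', count: 5 })); // true
console.log(isJsonSerializable(() => console.log('test')));         // false
console.log(isJsonSerializable(circular));                          // false
```

Note that this check only looks at the top-level result: a function nested inside an object is silently dropped by JSON.stringify rather than rejected.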
Be careful when using line comments (//) in your scripts. If newlines (\n) are lost or mishandled when the code is escaped or unescaped, everything after the first // ends up on the same line and is treated as part of the comment, which breaks your script. Block comments (/* ... */) do not have this problem. Test your scripts thoroughly after any conversion.
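The effect is easy to reproduce locally. The sketch below runs small script bodies with new Function and simulates a tool that replaces newlines with spaces; the run and flatten helpers are hypothetical and exist only for this demonstration.

```javascript
// Run a script body as a plain function so the effect is easy to observe
const run = source => new Function(source)();

const withLineComment = "// compute the result\nreturn 1 + 1;";
const withBlockComment = "/* compute the result */\nreturn 1 + 1;";

// Simulate a tool that replaces newlines with spaces
const flatten = source => source.replace(/\n/g, ' ');

console.log(run(withLineComment));           // 2
console.log(run(flatten(withLineComment)));  // undefined: the return is now inside the comment
console.log(run(flatten(withBlockComment))); // 2
```

Once the newline is gone, the line comment swallows the return statement, while the block comment version still runs as intended.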
Environment Limitations
Memory
- Reasonable limits for typical scraping tasks
- Large data sets should be processed incrementally
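One way to process data incrementally is to work through it in fixed-size batches and keep only a compact result per batch. The sketch below is an illustration under that assumption; the inBatches helper, the batch size of 100, and the generated rows are all hypothetical stand-ins for real scraped data.

```javascript
// Hypothetical helper: yields consecutive slices of at most `size` items
function* inBatches(items, size) {
  for (let i = 0; i < items.length; i += size) {
    yield items.slice(i, i + size);
  }
}

// Stand-in for scraped rows; a real script would get these from the page
const rows = Array.from({ length: 250 }, (_, i) => ({ id: i, price: i * 2 }));

// Keep only a small summary per batch instead of building large derived copies
const summaries = [];
for (const batch of inBatches(rows, 100)) {
  const total = batch.reduce((sum, row) => sum + row.price, 0);
  summaries.push({ count: batch.length, total });
}

console.log(summaries);
```

The same loop shape works when each batch comes from a separate page or scroll step, so only one batch needs to be held in memory at a time.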
Network
- Outbound HTTP/HTTPS requests allowed
- No access to localhost or internal networks
- Standard web ports (80, 443) supported
Best Practices
Performance Optimization
// Wait for specific content instead of arbitrary timeouts
await page.waitForSelector('.content', { timeout: 10000 });
// Use networkidle for dynamic content
await page.waitForNetworkIdle();
// Extract only what you need
const data = await page.$$eval('.item', items =>
items.slice(0, 100).map(item => ({ // Limit to first 100 items
title: item.querySelector('.title')?.textContent?.trim(),
price: item.querySelector('.price')?.textContent?.trim()
}))
);
Debugging
Use console.log strategically:
console.log('Step 1: Waiting for content...');
await page.waitForSelector('.content');
console.log('Step 2: Extracting data...');
const items = await page.$$('.item');
console.log(`Found ${items.length} items`);
console.log('Step 3: Processing items...');
// Processing logic here
Common Patterns
Dynamic Content Loading
// Wait for AJAX content
await page.waitForFunction(() => {
return document.querySelectorAll('.dynamic-item').length > 0;
});
// Wait for specific text to appear
await page.waitForFunction(() => {
return document.body.textContent.includes('Loading complete');
});
Form Interactions
// Fill out forms
await page.type('#username', 'testuser');
await page.type('#password', 'password');
await page.click('#login-button');
// Handle dropdowns
await page.select('#country', 'US');
Security
- Scripts run in an isolated environment for security
- No access to file system or external databases
- All network requests are logged and monitored
- Malicious activities will result in API access termination
Example: Complete Scraping Script
// Wait for the page content to load
await page.waitForSelector('h1');
// Extract page metadata with Puppeteer v24
const pageData = await page.evaluate(() => ({
title: document.title,
url: window.location.href,
timestamp: new Date().toISOString()
}));
// Get HTML for Cheerio processing
const html = await page.content();
const $ = cheerio.load(html);
// Process with Cheerio
const links = $('a').map((i, el) => ({
text: $(el).text().trim(),
href: $(el).attr('href')
})).get().slice(0, 10);
console.log(`Extracted ${links.length} links`);
return {
...pageData,
links
};